com.vaadin/docs-mcp
Provides Vaadin Documentation and help with development tasks
# Vaadin Documentation RAG Service
A hierarchy-aware Retrieval-Augmented Generation (RAG) system for Vaadin documentation. It understands document structure, filters results by framework, and supports parent-child navigation through documentation sections.
## Project Overview
This project provides an advanced RAG system with enhanced hybrid search that:
- **Understands Hierarchical Structure**: Navigates parent-child relationships within and across documentation files
- **Enhanced Hybrid Search**: Combines semantic and intelligent keyword search with native Pinecone reranking for superior relevance
- **Framework Filtering**: Intelligently filters content for Vaadin Flow (Java) vs Hilla (React) frameworks
- **Agent-Friendly**: Provides MCP (Model Context Protocol) server for seamless IDE assistant integration
- **Production Ready**: Clean architecture with dependency injection, comprehensive testing, and error handling
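The framework filtering described above can be pictured with a minimal sketch. The `Framework` tag values, the `DocChunk` shape, and `filterByFramework` are hypothetical names chosen for illustration; the actual metadata schema is not shown in this README.

```typescript
// Hypothetical tag values: "common" marks content that applies to both frameworks.
type Framework = "flow" | "hilla" | "common";

// Hypothetical chunk shape; the real metadata schema may differ.
interface DocChunk {
  id: string;
  framework: Framework;
  text: string;
}

// Keep chunks tagged with the requested framework plus framework-neutral chunks.
// With no framework requested, nothing is filtered out.
function filterByFramework(
  chunks: DocChunk[],
  requested?: "flow" | "hilla",
): DocChunk[] {
  if (requested === undefined) return chunks;
  return chunks.filter(
    (chunk) => chunk.framework === requested || chunk.framework === "common",
  );
}
```

Under these assumptions, asking for `"flow"` returns Flow-specific and common chunks and drops Hilla-only ones.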
## Architecture
```
vaadin-documentation-services/
├── packages/
│   ├── core-types/              # Shared TypeScript interfaces
│   ├── 1-asciidoc-converter/    # AsciiDoc → Markdown + metadata extraction
│   ├── 2-embedding-generator/   # Markdown → Vector database with hierarchical chunking
│   └── mcp-server/              # MCP server with hierarchical navigation
├── package.json                 # Bun workspace configuration
└── PROJECT_PLAN.md              # Complete project documentation
```
### Data Flow
```mermaid
flowchart TD
subgraph "Step 1: Documentation Processing"
VaadinDocs["Vaadin Docs<br/>(AsciiDoc)"]
Converter["AsciiDoc Converter<br/>• Framework detection<br/>• URL generation<br/>• Markdown output"]
Processor["Embedding Generator<br/>• Hierarchical chunking<br/>• Parent-child relationships<br/>• OpenAI embeddings"]
end
subgraph "Step 2: Agent Integration"
Pinecone["Pinecone Vector DB<br/>• Rich metadata<br/>• Hierarchical relationships<br/>• Framework tags"]
MCP["MCP Server<br/>• search_vaadin_docs<br/>• get_full_document<br/>• Full document retrieval"]
IDEs["IDE Assistants<br/>• Context-aware search<br/>• Hierarchical exploration<br/>• Framework-specific help"]
end
VaadinDocs --> Converter
Converter --> Processor
Processor --> Pinecone
Pinecone <--> MCP
MCP <--> IDEs
classDef processing fill:#e1f5fe,stroke:#01579b,stroke-width:2px
classDef storage fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
classDef api fill:#e8f5e8,stroke:#2e7d32,stroke-width:2px
classDef agent fill:#fff3e0,stroke:#e65100,stroke-width:2px
class VaadinDocs,Converter,Processor processing
class Pinecone storage
class MCP api
class IDEs agent
```
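To make the "hierarchical chunking" step in the diagram more concrete, here is a rough sketch of splitting Markdown by headings while recording parent-child links. It is not the embedding generator's actual code: the `Chunk` shape, the `chunkMarkdown` name, and the `docId#index` id scheme are assumptions for illustration.

```typescript
// Hypothetical chunk shape; the real package may store different fields.
interface Chunk {
  id: string;
  parentId: string | null;
  level: number; // heading depth: 1 for "#", 2 for "##", ...
  heading: string;
  body: string;
}

// Split Markdown into one chunk per heading and link each chunk to the
// nearest preceding heading of a shallower level.
// Text that appears before the first heading is ignored in this sketch.
function chunkMarkdown(markdown: string, docId: string): Chunk[] {
  const chunks: Chunk[] = [];
  const stack: Chunk[] = []; // currently open ancestors, outermost first
  let inFence = false;

  for (const line of markdown.split("\n")) {
    // Toggle on fence lines so "#" comments inside code are not treated as headings.
    if (line.trimStart().startsWith("```")) inFence = !inFence;

    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      const [, hashes = "", title = ""] = match;
      const level = hashes.length;
      // Close every open section that is at the same depth or deeper.
      while ((stack[stack.length - 1]?.level ?? 0) >= level) stack.pop();
      const parent = stack[stack.length - 1];
      const chunk: Chunk = {
        id: `${docId}#${chunks.length}`,
        parentId: parent?.id ?? null,
        level,
        heading: title.trim(),
        body: "",
      };
      chunks.push(chunk);
      stack.push(chunk);
    } else {
      const current = stack[stack.length - 1];
      if (current) current.body += line + "\n";
    }
  }
  return chunks;
}
```

Keeping a stack of open headings means each new section finds its parent in one pass over the file, which is one simple way to obtain the parent-child metadata the vector database stores.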
## Key Features
### Intelligent Search
- **Enhanced Hybrid Search**: Combines semantic similarity with intelligent keyword extraction and scoring
- **Native Pinecone Reranking**: Uses Pinecone's bge-reranker-v2-m3 for optimal result ranking
- **Framework Awareness**: Filters Flow vs Hilla content with common content inclusion
- **Query Preprocessing**: Smart keyword extraction with stopword filtering for better search quality
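As a rough illustration of the query preprocessing and hybrid scoring listed above, the sketch below extracts keywords with a small stopword list and blends a semantic score with a keyword score. The stopword list, the function names, and the weighted-sum formula with `alpha = 0.7` are all assumptions; the service itself relies on Pinecone reranking, which this sketch does not reproduce.

```typescript
// Illustrative stopword list; the real list is not shown in this README.
const STOPWORDS = new Set([
  "a", "an", "the", "how", "do", "i", "to", "in", "of", "for", "with", "is", "and", "or", "on",
]);

// Lowercase the query, split it into word tokens, drop stopwords and
// single-character tokens, and remove duplicates while preserving order.
function extractKeywords(query: string): string[] {
  const tokens: string[] = query.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
  const kept = tokens.filter((t) => t.length > 1 && !STOPWORDS.has(t));
  return Array.from(new Set(kept));
}

// Fraction of keywords that occur somewhere in the text (0 when there are none).
function keywordScore(keywords: string[], text: string): number {
  if (keywords.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = keywords.filter((k) => haystack.includes(k)).length;
  return hits / keywords.length;
}

// Assumed blending rule: a weighted sum of the two scores, both in [0, 1].
function hybridScore(semantic: number, keyword: number, alpha = 0.7): number {
  return alpha * semantic + (1 - alpha) * keyword;
}
```

In a pipeline like the one described here, a blended score of this kind would only pick candidates; the final ordering would still come from the reranker.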
### Hierarchical Navigation
- **Parent-Child Relationships**: Navigate from specific details to broader context
- **Cross-File Links**: Understand relationships between different documentation files
- **Context Breadcrumbs**: Maintain navigation context for better user experience
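One way to picture context breadcrumbs is to walk `parentId` links from a section up to its root. The `SectionNode` shape and `buildBreadcrumb` helper below are hypothetical and only sketch the idea; they are not the MCP server's API.

```typescript
// Hypothetical node shape: each section knows only its direct parent.
interface SectionNode {
  id: string;
  parentId: string | null;
  title: string;
}

// Follow parent links from the starting section to the root and return
// the titles ordered from root to leaf. The `seen` set guards against cycles.
function buildBreadcrumb(
  nodes: Map<string, SectionNode>,
  startId: string,
): string[] {
  const trail: string[] = [];
  const seen = new Set<string>();
  let currentId: string | null = startId;

  while (currentId !== null && !seen.has(currentId)) {
    seen.add(currentId);
    const node: SectionNode | undefined = nodes.get(currentId);
    if (!node) break; // dangling parent reference: stop with what we have
    trail.unshift(node.title);
    currentId = node.parentId;
  }
  return trail;
}
```

Because links can cross files, a lookup table keyed by section id (rather than a per-file tree) keeps the walk the same whether the parent lives in the same document or another one.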
### Developer Experience
- **MCP Integration**: Standardized protocol for IDE assistant integration
- **TypeScript**: Full type safety across all packages
- **Comprehensive Testing**: Unit tests, integration tests, and hierarchical workflow validation
- **Clean Architecture**: Dependency injection and interface-based design
## Quick Start
### Prerequisites
- [Bun](https://bun.sh/) runtime v1.3.6
- OpenAI API key (for embeddings)
- Pinecone API key and index
### Installation
```bash
# Clone and install dependencies
git clone https://github.com/vaadin/vaadin-documentation-services
cd vaadin-documentation-services
bun install
```
### Environment Setup
```bash
# Create .env file with your API keys
echo "OPENAI_API_KEY=your_openai_api_key" > .env
echo "PINECONE_API_KEY=your_pinecone_api_key" >> .env
echo "PINECONE_INDEX=your_pinecone_index" >> .env
```
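Before running the pipeline, it can help to fail fast when one of these variables is missing. The sketch below is an optional helper, not part of the repository; `findMissingEnv` is a hypothetical name, and only the three variable names come from the snippet above.

```typescript
// The three variables written to .env in the step above.
const REQUIRED_ENV_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_INDEX",
] as const;

// Return the names of required variables that are unset or blank.
function findMissingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Under Bun, which loads `.env` files automatically, a script could call `findMissingEnv(process.env)` at startup and exit with a clear message if the returned list is not empty.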
### Running the System
#### 1. Process Documentation (One-time setup)
```bash
# Convert AsciiDoc to Markdown with metadata
cd packages/1-asciidoc-converter
bun run convert
# Generate embeddings and populate vector database
cd ../2-embedding-generator
bun run generate
```
#### 2. Use MCP Server with IDE Assistant
The MCP server is deployed and available remotely via HTTP transport at:
**`https://mcp.vaadin.com/`**
Configure your IDE assistant to use the Streamable HTTP transport:
```javascript
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";
const transport =