🚀 Flexible GraphRAG
Flexible GraphRAG is a platform that supports document processing, automatic knowledge graph building, RAG and GraphRAG setup, hybrid search (full-text, vector, graph), and AI Q&A query capabilities. It provides a comprehensive solution for handling diverse data sources and conducting complex information retrieval and analysis.
🚀 Quick Start
Flexible GraphRAG is a configurable hybrid search system. It can optionally combine vector similarity search, full-text search, and knowledge graph GraphRAG over documents processed from multiple data sources such as file upload, cloud storage, enterprise repositories, and web sources. Built on LlamaIndex, it uses abstractions that support multiple vector databases, search engines, graph databases, and LLMs. Documents can be parsed using either Docling (default) or LlamaParse (cloud API). The platform has a FastAPI backend with REST endpoints and a Model Context Protocol (MCP) server for MCP clients such as Claude Desktop, and it ships with simple Angular, React, and Vue UI clients for interacting with the system.
✨ Features
- Hybrid Search: Combines vector embeddings, BM25 full-text search, and graph traversal for comprehensive document retrieval.
- Knowledge Graph GraphRAG: Extracts entities and relationships from documents to create graphs in graph databases for graph-based reasoning.
- Configurable Architecture: LlamaIndex provides abstractions for vector databases, graph databases, search engines, and LLM providers.
- Multi-Source Ingestion: Processes documents from 13 data sources (file upload, cloud storage, enterprise repositories, web sources) with Docling or LlamaParse document parsing.
- FastAPI Server with REST API: A FastAPI server with REST API for document ingesting, hybrid search, and AI Q&A query.
- MCP Server: An MCP server that provides tools for MCP Clients like Claude Desktop for document and text ingesting, hybrid search, and AI Q&A query.
- UI Clients: Angular, React, and Vue UI clients support choosing the data source (filesystem, Alfresco, CMIS, etc.), ingesting documents, performing hybrid searches, and AI Q&A Queries.
- Docker Deployment Flexibility: Supports both standalone and Docker deployment modes. The Docker infrastructure allows modular database selection via docker-compose, where vector, graph, and search databases can be included or excluded with a single comment. You can choose between hybrid deployment (databases in Docker, backend and UIs standalone) or full containerization.
📦 Installation
Prerequisites
Required
- Python 3.10+ (supports 3.10, 3.11, 3.12, 3.13)
- UV package manager
- Node.js 16+
- npm or yarn
- Neo4j graph database
- Ollama or OpenAI with API key (for LLM processing)
Optional (depending on data source)
- CMIS-compliant repository (e.g., Alfresco) - only if using CMIS data source
- Alfresco repository - only if using Alfresco data source
- File system data source requires no additional setup
Setup
🐳 Docker Deployment
Docker deployment offers two main approaches:
Option A: Databases in Docker, App Standalone (Hybrid)
Best for: Development, external content management systems, flexible deployment
Start the databases in Docker (comment out the app-stack include in docker/docker-compose.yaml so only the databases start; see Configuration below), then run the backend on the host:
docker-compose -f docker/docker-compose.yaml -p flexible-graphrag up -d
cd flexible-graphrag
uv run start.py
Use cases:
- ✅ File Upload: Direct file upload through web interface
- ✅ External CMIS/Alfresco: Connect to existing content management systems
- ✅ Development: Easy debugging and hot-reloading
- ✅ Mixed environments: Databases in containers, apps on host
Option B: Full Stack in Docker (Complete)
Best for: Production deployment, isolated environments, containerized content sources
docker-compose -f docker/docker-compose.yaml -p flexible-graphrag up -d
Features:
- ✅ All databases pre-configured (Neo4j, Kuzu, Qdrant, Elasticsearch, OpenSearch, Alfresco)
- ✅ Backend + 3 UI clients (Angular, React, Vue) in containers
- ✅ NGINX reverse proxy with unified URLs
- ✅ Persistent data volumes
- ✅ Internal container networking
Service URLs after startup:
- Angular UI: http://localhost:8070/ui/angular/
- React UI: http://localhost:8070/ui/react/
- Vue UI: http://localhost:8070/ui/vue/
- Backend API: http://localhost:8070/api/
- Neo4j Browser: http://localhost:7474/
- Kuzu Explorer: http://localhost:8002/
Data Source Workflow:
- ✅ File Upload: Upload files directly through the web interface (drag & drop or file selection dialog on click)
- ✅ Alfresco/CMIS: Connect to existing Alfresco systems or CMIS repositories
Stopping Services
To stop and remove all Docker services:
docker-compose -f docker/docker-compose.yaml -p flexible-graphrag down
Common workflow for configuration changes:
docker-compose -f docker/docker-compose.yaml -p flexible-graphrag down
docker-compose -f docker/docker-compose.yaml -p flexible-graphrag up -d
Configuration
- Modular deployment: Comment out services you don't need in docker/docker-compose.yaml
- Environment configuration (for app-stack deployment):
  - Environment variables are configured directly in docker/includes/app-stack.yaml
  - Database connections use host.docker.internal, so containers reach databases through ports published on the host
  - Default configuration includes OpenAI/Ollama LLM settings and database connections
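To illustrate the modular selection, a compose file assembled from includes might look like the sketch below; aside from app-stack.yaml (referenced above), the include file names are hypothetical, so treat docker/README.md as authoritative:

```yaml
# docker/docker-compose.yaml (sketch) -- comment a line out to exclude that service
include:
  - includes/neo4j.yaml          # graph/vector database
  # - includes/qdrant.yaml       # excluded: not needed in this deployment
  - includes/elasticsearch.yaml  # full-text search
  - includes/app-stack.yaml      # backend + UIs (omit for hybrid mode)
```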
See docker/README.md for detailed Docker configuration.
🔧 Local Development Setup
Environment Configuration
Create the environment file:
macOS/Linux: cp flexible-graphrag/env-sample.txt flexible-graphrag/.env
Windows: copy flexible-graphrag\env-sample.txt flexible-graphrag\.env
Edit .env with your database credentials and API keys.
Python Backend Setup
1. Navigate to the backend directory:
   cd project-directory/flexible-graphrag
2. Create a virtual environment using UV and activate it:
   uv venv
   Windows: .\.venv\Scripts\Activate
   macOS/Linux: source .venv/bin/activate
3. Install Python dependencies:
   uv pip install -r requirements.txt
4. Create a .env file by copying the sample and customizing:
   macOS/Linux: cp env-sample.txt .env
   Windows: copy env-sample.txt .env
Edit .env with your specific configuration. See docs/ENVIRONMENT-CONFIGURATION.md for a detailed setup guide.
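Before moving on, it can help to see what a working configuration looks like. A minimal .env sketch follows; the exact set of required keys (and the OPENAI_API_KEY variable name) are assumptions here, so treat env-sample.txt and docs/ENVIRONMENT-CONFIGURATION.md as authoritative:

```bash
# Minimal .env sketch (illustrative only -- see env-sample.txt for the full list)
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...        # assumed standard OpenAI key variable
DOCUMENT_PARSER=docling      # or: llamaparse (requires LLAMAPARSE_API_KEY)
VECTOR_DB=neo4j
GRAPH_DB=neo4j
SEARCH_DB=elasticsearch
ENABLE_KNOWLEDGE_GRAPH=true
```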
Frontend Setup
Production Mode (backend does not serve frontend):
- Backend API: http://localhost:8000 (FastAPI server only)
- Frontend deployment: Separate deployment (nginx, Apache, static hosting, etc.)
- Both standalone and Docker frontends point to backend at localhost:8000
Development Mode (frontend and backend run separately):
- Backend API: http://localhost:8000 (FastAPI server only)
- Angular Dev: http://localhost:4200 (ng serve)
- React Dev: http://localhost:5173 (npm run dev)
- Vue Dev: http://localhost:5174 (npm run dev)
Choose one of the following frontend options to work with:
React Frontend
- Navigate to the React frontend directory:
cd flexible-graphrag-ui/frontend-react
- Install Node.js dependencies:
npm install
- Start the development server (uses Vite):
npm run dev
The React frontend will be available at http://localhost:5173.
Angular Frontend
- Navigate to the Angular frontend directory:
cd flexible-graphrag-ui/frontend-angular
- Install Node.js dependencies:
npm install
- Start the development server (uses Angular CLI):
npm start
The Angular frontend will be available at http://localhost:4200.
Note: If ng build gives budget errors, use npm start for development instead.
Vue Frontend
- Navigate to the Vue frontend directory:
cd flexible-graphrag-ui/frontend-vue
- Install Node.js dependencies:
npm install
- Start the development server (uses Vite):
npm run dev
The Vue frontend will be available at http://localhost:5174.
💻 Usage Examples
Running the Application
Start the Python Backend
From the project root directory:
cd flexible-graphrag
uv run start.py
The backend will be available at http://localhost:8000.
Start Your Preferred Frontend
Follow the instructions in the Frontend Setup section for your chosen frontend framework.
Frontend Deployment
Build Frontend
cd flexible-graphrag-ui/frontend-angular
ng build
cd flexible-graphrag-ui/frontend-react
npm run build
cd flexible-graphrag-ui/frontend-vue
npm run build
Angular Build Notes:
- Budget warnings are common in Angular and usually safe to ignore for development
- For production, consider optimizing bundle sizes or adjusting budget limits in angular.json
- Development mode: use npm start to avoid build issues
Start Production Server
cd flexible-graphrag
uv run start.py
The backend provides:
- API endpoints under /api/*
- Independent operation focused on data processing and search
- Clean separation from frontend serving concerns
Backend API Endpoints:
- API Base: http://localhost:8000/api/
- API Endpoints: /api/ingest, /api/search, /api/query, /api/status, etc.
- Health Check: http://localhost:8000/api/health
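For a quick smoke test of these endpoints, requests along the following lines should work once the backend is running; the request-body field names are assumptions, so consult the FastAPI-generated docs (served at /docs by default) for the authoritative schema:

```bash
# Health check
curl http://localhost:8000/api/health

# Hybrid search (field names are illustrative assumptions)
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning algorithms", "top_k": 5}'

# AI Q&A (field names are illustrative assumptions)
curl -X POST http://localhost:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the main findings?"}'
```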
Frontend Deployment:
- Manual Deployment: Deploy frontends independently using your preferred method (nginx, Apache, static hosting, etc.)
- Frontend Configuration: Both standalone and Docker frontends point to the backend at http://localhost:8000/api/
- Each frontend can be built and deployed separately based on your needs
Usage Steps
The system provides a tabbed interface for document processing and querying. Follow these steps in order:
1. Sources Tab
Configure your data source and select files for processing:
File Upload Data Source
- Select: "File Upload" from the data source dropdown
- Add Files:
- Drag & Drop: Drag files directly onto the upload area
- Click to Select: Click the upload area to open the file selection dialog (supports multi-select)
- Note: If you drag & drop new files after selecting via dialog, only the dragged files will be used
- Supported Formats: PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, PNG, JPG, and more
- Next Step: Click "CONFIGURE PROCESSING →" to proceed to Processing tab
Alfresco Repository
- Select: "Alfresco Repository" from the data source dropdown
- Configure:
- Alfresco Base URL (e.g., http://localhost:8080/alfresco)
- Username and password
- Path (e.g., /Sites/example/documentLibrary)
- Next Step: Click "CONFIGURE PROCESSING →" to proceed to Processing tab
CMIS Repository
- Select: "CMIS Repository" from the data source dropdown
- Configure:
- CMIS Repository URL (e.g., http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom)
- Username and password
- Folder path (e.g., /Sites/example/documentLibrary)
- Next Step: Click "CONFIGURE PROCESSING →" to proceed to Processing tab
2. Processing Tab
Process your selected documents and monitor progress:
- Start Processing: Click "START PROCESSING" to begin document ingestion
- Monitor Progress: View real-time progress bars for each file
- File Management:
- Use checkboxes to select files
- Click "REMOVE SELECTED (N)" to remove selected files from the list
- Note: This removes files from the processing queue, not from your system
- Processing Pipeline: Documents are processed through Docling conversion, vector indexing, and knowledge graph creation
3. Search Tab
Perform searches on your processed documents:
Hybrid Search
- Purpose: Find and rank the most relevant document excerpts
- Usage: Enter search terms or phrases (e.g., "machine learning algorithms", "financial projections")
- Action: Click "SEARCH" button
- Results: Ranked list of document excerpts with relevance scores and source information
- Best for: Research, fact-checking, finding specific information across documents
Q&A Query
- Purpose: Get AI-generated answers to natural language questions
- Usage: Enter natural language questions (e.g., "What are the main findings in the research papers?")
- Action: Click "ASK" button
- Results: AI-generated narrative answers that synthesize information from multiple documents
- Best for: Summarization, analysis, getting overviews of complex topics
4. Chat Tab
Interactive conversational interface for document Q&A:
- Chat Interface:
- Your Questions: Displayed on the right side vertically
- AI Answers: Displayed on the left side vertically
- Usage: Type questions and press Enter or click send
- Conversation History: All questions and answers are preserved in the chat history
- Clear History: Click "CLEAR HISTORY" button to start a new conversation
- Best for: Iterative questioning, follow-up queries, conversational document exploration
📚 Documentation
Frontend Screenshots
Angular Frontend - Tabbed Interface
Click to view Angular UI screenshots (Light Theme)
Screenshots: Sources Tab, Processing Tab, Search Tab, Chat Tab (images omitted)
React Frontend - Tabbed Interface
Click to view React UI screenshots (Dark Theme)
Screenshots: Sources Tab, Processing Tab, Search Tab, Chat Tab (images omitted)
Click to view React UI screenshots (Light Theme)
Screenshots: Sources Tab, Processing Tab, Search Tab, Chat Tab (images omitted)
Vue Frontend - Tabbed Interface
Click to view Vue UI screenshots (Light Theme)
Screenshots: Sources Tab, Processing Tab, Search Tab, Chat Tab (images omitted)
System Components
FastAPI Backend (/flexible-graphrag)
- REST API Server: Provides endpoints for document ingestion, search, and Q&A
- Hybrid Search Engine: Combines vector similarity, BM25, and graph traversal
- Document Processing: Advanced document conversion with Docling integration
- Configurable Architecture: Environment-based configuration for all components
- Async Processing: Background task processing with real-time progress updates
MCP Server (/flexible-graphrag-mcp)
- Claude Desktop Integration: Model Context Protocol server for AI assistant workflows
- Dual Transport: HTTP mode for debugging, stdio mode for Claude Desktop
- Tool Suite: 9 specialized tools for document processing, search, and system management
- Multiple Installation: pipx system installation or uvx no-install execution
UI Clients (/flexible-graphrag-ui)
- Angular Frontend: Material Design with TypeScript
- React Frontend: Modern React with Vite and TypeScript
- Vue Frontend: Vue 3 Composition API with Vuetify and TypeScript
- Unified Features: All clients support async processing, progress tracking, and cancellation
Docker Infrastructure (/docker)
- Modular Database Selection: Include/exclude vector, graph, and search databases with single-line comments
- Flexible Deployment: Hybrid mode (databases in Docker, apps standalone) or full containerization
- NGINX Reverse Proxy: Unified access to all services with proper routing
- Database Dashboards: Integrated web interfaces for Kibana (Elasticsearch), OpenSearch Dashboards, Neo4j Browser, and Kuzu Explorer
Data Sources
Flexible GraphRAG supports 13 different data sources for ingesting documents into your knowledge base:
File & Upload Sources
- File Upload - Direct file upload through web interface with drag & drop support
Cloud Storage Sources
- Amazon S3 - AWS S3 bucket integration
- Google Cloud Storage (GCS) - Google Cloud storage buckets
- Azure Blob Storage - Microsoft Azure blob containers
- OneDrive - Microsoft OneDrive personal/business storage
- SharePoint - Microsoft SharePoint document libraries
- Box - Box.com cloud storage
- Google Drive - Google Drive file storage
Enterprise Repository Sources
- CMIS (Content Management Interoperability Services) - Industry-standard content repository interface
- Alfresco - Alfresco ECM/content repository
Web Sources
- Web Pages - Extract content from web URLs
- Wikipedia - Ingest Wikipedia articles by title or URL
- YouTube - Process YouTube video transcripts
Each data source includes:
- Configuration Forms: Easy-to-use interfaces for credentials and settings
- Progress Tracking: Real-time per-file progress indicators
- Flexible Authentication: Support for various auth methods (API keys, OAuth, service accounts)
Document Processing Options
All data sources support two document parser options:
Docling (Default):
- Open-source, local processing
- Free with no API costs
- Built-in OCR for images and scanned documents
- Configured via: DOCUMENT_PARSER=docling
LlamaParse:
- Cloud-based API service with advanced AI
- Multimodal parsing with Claude Sonnet 3.5
- Three modes available:
  - parse_page_without_llm - 1 credit/page
  - parse_page_with_llm - 3 credits/page (default)
  - parse_page_with_agent - 10-90 credits/page
- Configured via: DOCUMENT_PARSER=llamaparse plus LLAMAPARSE_API_KEY
- Get your API key from LlamaCloud
Both parsers support PDF, Office documents (DOCX, XLSX, PPTX), images, HTML, and more with intelligent format detection.
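In .env terms, the two parser setups look like this (DOCUMENT_PARSER and LLAMAPARSE_API_KEY are the variables named above; the key value is a placeholder):

```bash
# Docling (default, local, free)
DOCUMENT_PARSER=docling

# LlamaParse (cloud API)
DOCUMENT_PARSER=llamaparse
LLAMAPARSE_API_KEY=llx-...   # obtain from LlamaCloud
```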
Supported File Formats
The system processes 15+ document formats through intelligent routing between Docling (advanced processing) and direct text handling:
Document Formats (Docling Processing)
- PDF (.pdf): Advanced layout analysis, table extraction, formula recognition
- Microsoft Office (.docx, .xlsx, .pptx): Full structure preservation and content extraction
- Web Formats (.html, .htm, .xhtml): Markup structure analysis
- Data Formats (.csv, .xml, .json): Structured data processing
- Documentation (.asciidoc, .adoc): Technical documentation with markup preservation
Image Formats (Docling OCR)
- Standard Images (.png, .jpg, .jpeg): OCR text extraction
- Professional Images (.tiff, .tif, .bmp, .webp): Layout-aware OCR processing
Text Formats (Direct Processing)
- Plain Text (.txt): Direct ingestion for optimal chunking
- Markdown (.md, .markdown): Preserved formatting for technical documents
Processing Intelligence
- Adaptive Output: Tables convert to markdown, text content to plain text for optimal entity extraction
- Format Detection: Automatic routing based on file extension and content analysis
- Fallback Handling: Graceful degradation for unsupported formats
Database Configuration
Flexible GraphRAG uses three types of databases for its hybrid search capabilities. Each can be configured independently via environment variables.
Search Databases (Full-Text Search)
Configuration: Set via SEARCH_DB and SEARCH_DB_CONFIG environment variables
- BM25 (Built-in): Local file-based BM25 full-text search with TF-IDF ranking
- Elasticsearch: Enterprise search engine with advanced analyzers, faceted search, and real-time analytics
- OpenSearch: AWS-led open-source fork with native hybrid scoring (vector + BM25) and k-NN algorithms
- None: Disable full-text search (vector search only); set SEARCH_DB=none
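As an illustration, an Elasticsearch setup might look like the sketch below; the SEARCH_DB_CONFIG keys are assumptions modeled on the VECTOR_DB_CONFIG examples in the next section, so verify them against env-sample.txt:

```bash
SEARCH_DB=elasticsearch
# hypothetical config keys, following the JSON style used elsewhere in this README
SEARCH_DB_CONFIG={"url": "http://localhost:9200", "index_name": "hybrid_search_fulltext"}
```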
Vector Databases (Semantic Search)
Configuration: Set via VECTOR_DB and VECTOR_DB_CONFIG environment variables
⚠️ Vector Dimension Compatibility
CRITICAL: When switching between different embedding models (e.g., OpenAI ↔ Ollama), you MUST delete existing vector indexes due to dimension incompatibility:
- OpenAI: 1536 dimensions (text-embedding-3-small) or 3072 dimensions (text-embedding-3-large)
- Ollama: 384 dimensions (all-minilm, default), 768 dimensions (nomic-embed-text), or 1024 dimensions (mxbai-embed-large)
- Azure OpenAI: same as OpenAI (1536 or 3072 dimensions)
See [docs/VECTOR-DIMENSIONS.md](docs/VECTOR-DIMENSIONS.md) for detailed cleanup instructions for each database.
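As one concrete example of such a cleanup, a Qdrant collection can be dropped through Qdrant's REST API before re-ingesting with the new embedding model (the collection name is whatever you configured; 6333 is Qdrant's default HTTP port):

```bash
# Delete the old collection so it can be recreated with the new vector dimensions
curl -X DELETE http://localhost:6333/collections/hybrid_search
```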
Supported Vector Databases
- Neo4j: Can be used as vector database with separate vector configuration
- Dashboard: Neo4j Browser (http://localhost:7474) for Cypher queries and graph visualization
- Configuration:
VECTOR_DB=neo4j
VECTOR_DB_CONFIG={"uri": "bolt://localhost:7687", "username": "neo4j", "password": "your_password", "index_name": "hybrid_search_vector"}
- Qdrant: Dedicated vector database with advanced filtering
- Elasticsearch: Can be used as vector database with separate vector configuration
- OpenSearch: Can be used as vector database with separate vector configuration
- Chroma: Open-source vector database with dual deployment modes
- Dashboard: Swagger UI (http://localhost:8001/docs/) for API testing and management (HTTP mode)
- Configuration (Local Mode):
VECTOR_DB=chroma
VECTOR_DB_CONFIG={"persist_directory": "./chroma_db", "collection_name": "hybrid_search"}
- Configuration (HTTP Mode):
VECTOR_DB=chroma
VECTOR_DB_CONFIG={"host": "localhost", "port": 8001, "collection_name": "hybrid_search"}
- Milvus: Cloud-native, scalable vector database for similarity search
- Weaviate: Vector search engine with semantic capabilities and data enrichment
- Pinecone: Managed vector database service optimized for real-time applications
- Dashboard: Pinecone Console (web-based) for index and namespace management
- Local Info Dashboard: http://localhost:3004 (when using Docker)
- Configuration:
VECTOR_DB=pinecone
VECTOR_DB_CONFIG={"api_key": "your_api_key", "region": "us-east-1", "cloud": "aws", "index_name": "hybrid-search"}
- PostgreSQL: Traditional database with pgvector extension for vector similarity search
- Dashboard: pgAdmin (http://localhost:5050) for database management, vector queries, and similarity searches
- Configuration:
VECTOR_DB=postgres
VECTOR_DB_CONFIG={"host": "localhost", "port": 5433, "database": "postgres", "username": "postgres", "password": "your_password"}
- LanceDB: Modern, lightweight vector database designed for high-performance ML applications
RAG without GraphRAG
For simpler deployments without knowledge graph extraction, configure:
VECTOR_DB=qdrant
SEARCH_DB=elasticsearch
GRAPH_DB=none
ENABLE_KNOWLEDGE_GRAPH=false
Results:
- Vector similarity search (semantic)
- Full-text search (keyword-based)
- No graph traversal
- Faster processing (no graph extraction)
Graph Databases (Knowledge Graph / GraphRAG)
Configuration: Set via GRAPH_DB and GRAPH_DB_CONFIG environment variables
- Neo4j Property Graph: Primary knowledge graph storage with Cypher querying
- Kuzu: Embedded graph database built for query speed and scalability, optimized for handling complex analytical workloads on very large graph databases. Supports the property graph data model and the Cypher query language
- FalkorDB: A fast graph database that uses GraphBLAS under the hood for its sparse adjacency-matrix graph representation, positioned as a knowledge graph backend for LLMs (GraphRAG)
- ArcadeDB: Multi-model database supporting graph, document, key-value, and search capabilities with SQL and Cypher query support
- Dashboard: ArcadeDB Studio (http://localhost:2480) for graph visualization, SQL/Cypher queries, and database management
- Configuration:
GRAPH_DB=arcadedb
GRAPH_DB_CONFIG={"host": "localhost", "port": 2480, "username": "root", "password": "password", "database": "flexible_graphrag", "query_language": "sql"}
- MemGraph: Real-time graph database with native support for streaming data and advanced graph algorithms
- NebulaGraph: Distributed graph database designed for large-scale data with horizontal scalability
- Dashboard: NebulaGraph Studio (http://localhost:7001) for graph exploration and nGQL queries
- Configuration:
GRAPH_DB=nebula
GRAPH_DB_CONFIG={"space": "flexible_graphrag", "host": "localhost", "port": 9669, "username": "root", "password": "nebula"}
- Amazon Neptune: Fully managed graph database service supporting both property graph and RDF models
- Amazon Neptune Analytics: Serverless graph analytics engine for large-scale graph analysis with openCypher support
- None: Disable knowledge graph extraction for RAG-only mode
LLM Configuration
Configuration: Set via LLM_PROVIDER and provider-specific environment variables
LLM Providers
- OpenAI: GPT models with configurable endpoints
- Ollama: Local LLM deployment for privacy and control
- Azure OpenAI: Enterprise OpenAI integration
- Anthropic Claude: Claude models for complex reasoning
- Google Gemini: Google's latest language models
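Switching providers is a matter of changing LLM_PROVIDER plus that provider's own variables. A hedged sketch, since the exact key and value names should be verified against env-sample.txt:

```bash
# Anthropic Claude (key name assumed from the standard Anthropic convention)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...

# Ollama, local (model variable name assumed)
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b
```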
LLM Performance Recommendations
General Performance with LlamaIndex: OpenAI vs Ollama
Based on testing with OpenAI GPT-4o-mini and Ollama models (llama3.1:8b, llama3.2:latest, gpt-oss:20b), OpenAI consistently outperforms Ollama models in LlamaIndex operations.
Ollama Configuration
When using Ollama as your LLM provider, you must configure system-wide environment variables before starting the Ollama service. These settings optimize performance and enable parallel processing.
Key requirements:
- Configure environment variables system-wide (not in the Flexible GraphRAG .env file)
- OLLAMA_NUM_PARALLEL=4 is critical for parallel document processing
- Always restart the Ollama service after changing environment variables
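How you set these system-wide depends on the platform; for example (the full, authoritative steps are in the doc linked below):

```bash
# Windows: persist for the current user, then restart the Ollama service
setx OLLAMA_NUM_PARALLEL 4

# Linux (systemd): add Environment="OLLAMA_NUM_PARALLEL=4" under [Service]
# in an override for the ollama unit, then:
sudo systemctl restart ollama
```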
See [docs/OLLAMA-CONFIGURATION.md](docs/OLLAMA-CONFIGURATION.md) for complete setup instructions, including:
- All environment variable configurations
- Platform-specific installation steps (Windows, Linux, macOS)
- Performance optimization guidelines
- Troubleshooting common issues
MCP Tools for MCP Clients (Claude Desktop, etc.)
The MCP server provides 9 specialized tools for document intelligence workflows:
| Tool | Purpose | Usage |
|------|---------|-------|
| `get_system_status()` | System health and configuration | Verify setup and database connections |
| `ingest_documents(data_source, paths)` | Bulk document processing | Process files/folders from filesystem, CMIS, Alfresco |
| `ingest_text(content, source_name)` | Custom text analysis | Analyze specific text content |
| `search_documents(query, top_k)` | Hybrid document retrieval | Find relevant document excerpts |
| `query_documents(query, top_k)` | AI-powered Q&A | Generate answers from document corpus |
| `test_with_sample()` | System verification | Quick test with sample content |
| `check_processing_status(id)` | Async operation monitoring | Track long-running ingestion tasks |
| `get_python_info()` | Environment diagnostics | Debug Python environment issues |
| `health_check()` | Backend connectivity | Verify API server connection |
Client Support
- Claude Desktop and other MCP clients: Native MCP integration with stdio transport
- MCP Inspector: HTTP transport for debugging and development
- Multiple Installation: pipx (system-wide) or uvx (no-install) options
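For Claude Desktop specifically, registration goes in its claude_desktop_config.json. A sketch using the uvx option follows; the package/command name here is an assumption to verify against the MCP server's own README:

```json
{
  "mcpServers": {
    "flexible-graphrag": {
      "command": "uvx",
      "args": ["flexible-graphrag-mcp"]
    }
  }
}
```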
🔧 Technical Details
Technical Implementation
The system combines three retrieval methods for comprehensive hybrid search:
- Vector Similarity Search: Uses embeddings to find semantically similar content based on meaning rather than exact word matches
- Full-Text Search: Keyword-based search using:
- Search Engines: Elasticsearch or OpenSearch (which implement BM25 algorithms)
- Built-in Option: LlamaIndex local BM25 implementation for simpler deployments
- Graph Traversal: Leverages knowledge graphs to find related entities and relationships, enabling GraphRAG (Graph-enhanced Retrieval Augmented Generation) that can surface contextually relevant information through entity connections and semantic relationships
How GraphRAG Works: The system extracts entities (people, organizations, concepts) and relationships from documents, stores them in a graph database, then uses graph traversal during retrieval to find not just direct matches but also related information through entity connections. This enables more comprehensive answers that incorporate contextual relationships between concepts.
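To make the traversal step concrete, here is the kind of Cypher query you could run in Neo4j Browser against an extracted graph; the Entity label and property names are hypothetical, since the actual schema depends on the extraction configuration:

```cypher
// Entities within two hops of a seed entity (schema names are illustrative)
MATCH (e:Entity {name: "Acme Corp"})-[*1..2]-(related:Entity)
RETURN DISTINCT e.name, related.name
LIMIT 25
```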
Testing Cleanup
Between tests you can clean up data:
- Vector Indexes: See [docs/VECTOR-DIMENSIONS.md](docs/VECTOR-DIMENSIONS.md) for vector database cleanup instructions
- Graph Data: See [flexible-graphrag/README-neo4j.md](flexible-graphrag/README-neo4j.md) for graph-related cleanup commands
- Neo4j: run cleanup only against a test Neo4j database that no one else is using
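For Neo4j, the usual reset between tests is to wipe the graph entirely, which is why the test-database warning above matters:

```cypher
// Removes ALL nodes and relationships -- run only on a disposable test database
MATCH (n) DETACH DELETE n
```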
📄 License
This project is licensed under the terms of the Apache License 2.0. See the LICENSE file for details.