MCP Tool for Knowledge Graph Construction: Text-to-Graph & Multi-database Integration

Hkg Ontologizer Kgb MCP

A knowledge graph builder that uses a local AI model to convert text or web page content into structured knowledge graphs, supporting large-scale content processing, real-time visualization, and integration with Neo4j and Qdrant databases.

Knowledge management and memory Research and data #Knowledge Graph #AI Processing #Database Integration #Visualization .Python

rating : 2 points

downloads : 5.1K

update time : 2025-07-24


What is the Knowledge Graph Builder MCP Server?

This is a knowledge graph construction tool that uses a local AI model to convert text or web page content into a structured knowledge graph. Through MCP (Model Context Protocol) it stores the resulting graph in Neo4j and Qdrant databases, and it renders the graph as an SVG that updates in real time.

How to use the Knowledge Graph Builder MCP Server?

Enter text or a URL, and the system analyzes the content, extracts entities and relationships, and generates a structured knowledge graph. Users can choose which local AI model does the processing and can watch the graph display update in real time.

Applicable Scenarios

Suitable for scenarios that require extracting structured information from a large amount of text, such as academic research, enterprise data analysis, and the construction of intelligent customer service knowledge bases.

Main Features

Local AI Processing

Use local AI models (served by Ollama or LM Studio) for entity extraction to protect data privacy

Large File Support

Can process large content over 300MB, automatically chunk it, and merge the results

Web Page Content Extraction

Can extract and analyze content from any web page without size limitations

Knowledge Graph Generation

Automatically generate a structured knowledge graph containing entities and relationships

Intelligent Chunking

Automatically divide large text into small chunks at sentence boundaries for processing

Entity Merging

Automatically merge duplicate entities in different chunks

Real-time Visualization

Update the knowledge graph in SVG format in real-time as each chunk is processed

Interactive SVG Output

Color-coded entity types and progress tracking functionality

MCP Integration

Store data in Neo4j (graph database) and Qdrant (vector database)

UUID Tracking

Generate a unique identifier for each entity for cross-system tracking

Gradio Interface

Provide a friendly web interface supporting JSON and SVG output

Advantages

Process sensitive data without an internet connection

Support the processing of extremely large files

Provide real-time graphical display

Support the selection of multiple AI models

Automatically merge duplicate entities

Support web page content extraction

Limitations

Requires a local model runtime (such as Ollama or LM Studio) with a model installed

May require more computing resources for very large data sets

Customizing environment variables requires some technical background

How to Use

Install Dependencies

First, install all necessary Python packages, including libraries for visualization and AI processing

Configure Environment Variables

Set environment variables as needed, such as selecting the AI model to use and processing parameters

Start the Application

Run the main program, which will start a web service with a Gradio interface

Input Content

Enter text or a web page link on the interface, and the system will start analyzing and generating a knowledge graph

View Results

The system will return a structured knowledge graph and a real-time updated SVG graphical display

Usage Examples

Enterprise Knowledge Management

Input company internal documents, extract key people, projects, and relationships, and build an enterprise knowledge graph

Academic Research

Input a paper abstract, extract the research topic, method, and related literature

News Analysis

Input a news article, extract the people, places, and events involved



🚀 Knowledge Graph Builder MCP Server

A Knowledge Graph Builder that transforms text or web content into structured knowledge graphs using local AI models with MCP (Model Context Protocol) integration for persistent storage in Neo4j and Qdrant.

🚀 Quick Start

To get started, follow the steps below:

Setup

Install the required packages:

pip install -r requirements.txt

# For full visualization capabilities:
pip install networkx matplotlib

Set up the environment variables. You can use the following basic setup:

# Basic setup (uses sensible defaults)
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest

# Optional: Custom endpoints and processing limits
export OLLAMA_BASE_URL=http://localhost:11434
export CHUNK_SIZE=2000
export MAX_CHUNKS=0

Set up the local model. For Ollama, you can use the following commands:

# Install and start Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve

# Pull a model
ollama pull llama3.2:latest

For LM Studio, you need to download and install it, load a model in the local server, and start the local server on port 1234.

Running the Application

python app.py

The application will launch a Gradio interface with MCP server capabilities enabled.

✨ Features

Local AI Processing: Uses local models via Ollama or LM Studio for entity extraction.
Large Content Support: Handles arbitrarily large content (300MB+) via intelligent chunking.
Web Content Extraction: Scrapes and analyzes full web pages without size limits.
Knowledge Graph Generation: Creates structured graphs with entities and relationships.
Smart Chunking: Automatically chunks large content with sentence boundary detection.
Entity Merging: Intelligently merges duplicate entities across chunks.
Real-Time Visualization: Live SVG graph updates as chunks are processed.
Interactive SVG Output: Color-coded entity types with progress tracking.
MCP Integration: Stores data in Neo4j (graph database) and Qdrant (vector database).
UUID Tracking: Generates UUIDv8 for unified entity tracking across systems.
Gradio Interface: User-friendly web interface with dual JSON/SVG output.

📊 Entity Types Extracted

Entity Type	Description
👥 PERSON	Names, individuals, key figures
🏢 ORGANIZATION	Companies, institutions, groups
📍 LOCATION	Places, countries, regions, addresses
💡 CONCEPT	Ideas, technologies, abstract concepts
📅 EVENT	Specific events, occurrences, incidents
🔧 OTHER	Miscellaneous entities not fitting other categories

📦 Installation

Requirements

pip install -r requirements.txt

# For full visualization capabilities:
pip install networkx matplotlib

Environment Variables

For detailed configuration instructions and complete environment variables reference, see the Configuration section below.

Quick Start Configuration:

# Basic setup (uses sensible defaults)
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest

# Optional: Custom endpoints and processing limits
export OLLAMA_BASE_URL=http://localhost:11434
export CHUNK_SIZE=2000
export MAX_CHUNKS=0

Note: All environment variables are optional and have sensible defaults. The application will run without any configuration.

Local Model Setup

For Ollama:

# Install and start Ollama
curl -fsSL https://ollama.ai/install.sh | sh
ollama serve

# Pull a model
ollama pull llama3.2:latest

For LM Studio:

Download and install LM Studio.
Load a model in the local server.
Start the local server on port 1234.

💻 Usage Examples

Text Input

Paste any text content to analyze:

Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in 1976. The company is headquartered in Cupertino, California.

URL Input

Provide a web URL to extract and analyze:

https://en.wikipedia.org/wiki/Artificial_intelligence

Large Content Processing (300MB+ Files)

# Example: Processing a 300MB conversation log
# The system will automatically:
# 1. Detect large content (>2000 chars by default)
# 2. Split into intelligent chunks at sentence boundaries
# 3. Process each chunk with the local AI model
# 4. Merge and deduplicate entities/relationships
# 5. Store with full lineage tracking in hKG

# Processing will show progress:
# "Processing large content (314,572,800 chars) in chunks..."
# "Processing 157,286 chunks..."
# "Processing chunk 1/157,286 (2000 chars)..."
# "Merged results: 45,231 entities, 128,904 relationships"
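The chunking step described above can be pictured with a minimal sketch. It is not the project's `chunk_text()` implementation, which is not shown here; the function name `chunk_by_sentences`, the regular expression, and the fallback to a hard cut are all assumptions made for illustration.

```python
import re


def chunk_by_sentences(text, chunk_size=2000, overlap=200):
    """Split text into chunks of at most chunk_size characters.

    Each cut is moved back to the last sentence end inside the window when one
    exists; consecutive chunks share `overlap` characters of context.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Prefer the last ".", "!" or "?" followed by whitespace in the window.
            matches = list(re.finditer(r"[.!?]\s", text[start:end]))
            # Only accept a boundary that still moves us forward past the overlap.
            if matches and matches[-1].end() > overlap:
                end = start + matches[-1].end()
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back to keep shared context
    return chunks


if __name__ == "__main__":
    sample = "Sentence one. " * 50
    for i, chunk in enumerate(chunk_by_sentences(sample, chunk_size=100, overlap=20), 1):
        print(i, len(chunk))
```

The overlap here is measured in characters, so a chunk may begin mid-sentence even though it ends on a sentence boundary.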

Output Format

The system returns a structured JSON knowledge graph:

{
  "source": {
    "type": "text|url",
    "value": "input_value",
    "content_preview": "first 200 characters..."
  },
  "knowledge_graph": {
    "entities": [
      {
        "name": "Apple Inc.",
        "type": "ORGANIZATION",
        "description": "Technology company founded in 1976"
      }
    ],
    "relationships": [
      {
        "source": "Steve Jobs",
        "target": "Apple Inc.",
        "relationship": "FOUNDED",
        "description": "Steve Jobs founded Apple Inc."
      }
    ],
    "entity_count": 5,
    "relationship_count": 4
  },
  "visualization": {
    "svg_content": "<svg>...</svg>",
    "svg_file_path": "/path/to/knowledge_graph_12345678.svg",
    "visualization_available": true,
    "real_time_updates": false,
    "incremental_files_saved": 0,
    "entity_color_mapping": {
      "ORGANIZATION": "#4ECDC4",
      "PERSON": "#FF6B6B"
    },
    "svg_generation_timestamp": "2024-01-15T10:30:05Z",
    "visualization_engine": "networkx+matplotlib"
  },
  "metadata": {
    "model": "ollama:llama3.2:latest",
    "content_length": 150,
    "uuid": "xxxxxxxx-xxxx-8xxx-xxxx-xxxxxxxxxxxx",
    "neo4j_stored": true,
    "qdrant_stored": true,
    "timestamp": "2024-01-15T10:30:00Z",
    "hkg_metadata": {
      "processing_method": "single",
      "chunk_count": 1,
      "chunk_size": 2000,
      "chunk_overlap": 200,
      "source_type": "text",
      "supports_large_content": true,
      "max_content_size": "unlimited",
      "visualization_integration": {
        "real_time_visualization": false,
        "svg_files_generated": 1,
        "entity_color_tracking": true,
        "visualization_lineage": true,
        "incremental_updates": false,
        "neo4j_viz_metadata": true,
        "qdrant_viz_metadata": true
      }
    }
  }
}
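As a rough illustration of how a client might consume this JSON, the sketch below groups entities by type and renders each relationship as a readable triple. The field names come from the sample above; the helper name `summarize_graph` and the trimmed sample payload are assumptions for the example.

```python
import json


def summarize_graph(result_json):
    """Return (entities grouped by type, relationship triples as strings)."""
    data = json.loads(result_json)
    graph = data["knowledge_graph"]

    by_type = {}
    for entity in graph["entities"]:
        by_type.setdefault(entity["type"], []).append(entity["name"])

    triples = [
        f'{rel["source"]} -[{rel["relationship"]}]-> {rel["target"]}'
        for rel in graph["relationships"]
    ]
    return by_type, triples


if __name__ == "__main__":
    # Trimmed-down payload using the same keys as the documented output.
    sample = json.dumps({
        "knowledge_graph": {
            "entities": [
                {"name": "Apple Inc.", "type": "ORGANIZATION", "description": ""},
                {"name": "Steve Jobs", "type": "PERSON", "description": ""},
            ],
            "relationships": [
                {"source": "Steve Jobs", "target": "Apple Inc.",
                 "relationship": "FOUNDED", "description": ""},
            ],
        }
    })
    print(summarize_graph(sample))
```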

🎨 Real-Time Graph Visualization

SVG Generation Features

Color-Coded Entity Types: Each entity type has a distinct color (Person=Red, Organization=Teal, Location=Blue, Concept=Green, Event=Yellow, Other=Plum).
Interactive Layout: Automatic graph layout using NetworkX spring layout algorithm.
Relationship Labels: Edge labels showing relationship types between entities.
Entity Information: Node labels with entity names and types.
Legend: Automatic legend generation based on entity types present.
Statistics: Real-time entity and relationship counts.
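The color coding can be sketched as a simple lookup with a fallback. This is not the project's `get_entity_color()`; the helper name `color_for` is hypothetical. The PERSON and ORGANIZATION hex values are taken from the sample output, while the remaining entries are CSS color keywords chosen only to match the color names listed above, so the actual shades used by the application may differ.

```python
# PERSON and ORGANIZATION values appear in the documented sample output.
# The other four are assumed CSS keywords matching the documented color names.
ENTITY_COLORS = {
    "PERSON": "#FF6B6B",        # red
    "ORGANIZATION": "#4ECDC4",  # teal
    "LOCATION": "blue",         # assumption
    "CONCEPT": "green",         # assumption
    "EVENT": "yellow",          # assumption
    "OTHER": "plum",            # assumption
}


def color_for(entity_type):
    """Look up a fill color, treating unknown types as OTHER."""
    return ENTITY_COLORS.get(str(entity_type).upper(), ENTITY_COLORS["OTHER"])


def legend_for(entities):
    """Build a legend containing only the entity types that are present."""
    present = sorted({str(e["type"]).upper() for e in entities})
    return {entity_type: color_for(entity_type) for entity_type in present}


if __name__ == "__main__":
    print(legend_for([{"type": "PERSON"}, {"type": "ORGANIZATION"}]))
```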

Real-Time Processing for Large Content

Progress Tracking: Visual progress bar showing chunk processing completion.
Incremental Updates: Graph updates after each chunk is processed.
Live Statistics: Running totals of entities and relationships discovered.
Incremental File Saves: Each chunk creates a timestamped SVG file.
Final Visualization: Complete graph saved as final SVG.

File Output

Single Content: knowledge_graph_<uuid8>.svg
Large Content (Chunked):
- Incremental: knowledge_graph_<uuid8>_chunk_0001.svg, chunk_0002.svg, etc.
- Final: knowledge_graph_<uuid8>.svg
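The naming pattern can be sketched as below. It assumes that `<uuid8>` stands for the first eight hexadecimal characters of the graph's UUID and that chunk numbers are 1-based and zero-padded to four digits, as the examples suggest; the helper name `svg_filename` is hypothetical.

```python
def svg_filename(graph_uuid, chunk_index=None):
    """Build an SVG file name following the documented pattern.

    graph_uuid  -- UUID string; only its first 8 hex characters are used (assumption)
    chunk_index -- 1-based chunk number for incremental files, None for the final file
    """
    short_id = graph_uuid.replace("-", "")[:8]
    if chunk_index is None:
        return f"knowledge_graph_{short_id}.svg"
    return f"knowledge_graph_{short_id}_chunk_{chunk_index:04d}.svg"


if __name__ == "__main__":
    uid = "12345678-aaaa-8bbb-9ccc-ddddeeeeffff"
    print(svg_filename(uid, 1))
    print(svg_filename(uid))
```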

Example Large Content Processing

# Processing a 300MB conversation log:
# "Processing large content (314,572,800 chars) in chunks..."
# "Processing 157,286 chunks..."
# 
# Real-time updates:
# "Processing chunk 1/157,286 (2000 chars)..."
# "Real-time graph updated: Updated graph: 5 entities, 3 relationships (Chunk 1/157,286)"
# "Saved incremental graph: knowledge_graph_12345678_chunk_0001.svg"
# 
# "Processing chunk 2/157,286 (2000 chars)..."
# "Real-time graph updated: Updated graph: 12 entities, 8 relationships (Chunk 2/157,286)"
# "Saved incremental graph: knowledge_graph_12345678_chunk_0002.svg"
# 
# ... continues for all chunks ...
# 
# "Final results: 45,231 entities, 128,904 relationships"
# "Final SVG visualization saved: knowledge_graph_12345678.svg"
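The chunk total in the log lines above is illustrative. For planning purposes, a back-of-the-envelope estimate can be computed from the content length, CHUNK_SIZE, and CHUNK_OVERLAP. The sketch below assumes fixed-size chunks with a fixed stride and ignores sentence-boundary adjustments, so the real count is likely to differ somewhat; `estimate_chunks` is a hypothetical helper, not part of the application.

```python
def estimate_chunks(total_chars, chunk_size=2000, overlap=0):
    """Estimate how many chunks fixed-size chunking with overlap produces."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if total_chars <= 0:
        return 0
    if total_chars <= chunk_size:
        return 1
    stride = chunk_size - overlap
    remaining = total_chars - chunk_size
    # Integer ceiling division avoids floating-point rounding on large inputs.
    return 1 + -(-remaining // stride)


if __name__ == "__main__":
    size_300mb = 300 * 1024 * 1024  # 314,572,800 characters
    print(estimate_chunks(size_300mb, 2000, 0))
    print(estimate_chunks(size_300mb, 2000, 200))
```

Under these assumptions, 314,572,800 characters yield 157,287 chunks without overlap and 174,763 chunks with the default overlap of 200, which shows how much the overlap setting adds to total processing work.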

🗄️ hKG (Hybrid Knowledge Graph) Storage with Visualization Integration

Neo4j Integration (Graph Database)

Stores entities as nodes with properties and enhanced metadata.
Creates relationships between entities with lineage tracking.
Maintains UUIDv8 for entity tracking across all databases.
Tracks chunking metadata for large content processing.
Records processing method (single vs chunked).
NEW: Visualization metadata in entity observations including:
- SVG file paths and availability status.
- Entity color mappings for graph visualization.
- Real-time update tracking for chunked processing.
- Incremental file counts for large content processing.
Accessible via MCP server tools.

Qdrant Integration (Vector Database)

Stores knowledge graphs as vector embeddings with enhanced metadata.
Enables semantic search across graphs of any size.
Maintains metadata for each knowledge graph including chunk information.
Tracks content length, processing method, and chunk count.
Supports similarity search across large document collections.
NEW: Visualization lineage tracking including:
- Entity type and color mapping information.
- SVG generation timestamps and file paths.
- Real-time visualization update history.
- Incremental SVG file tracking for large content.
Accessible via MCP server tools.

hKG Unified Tracking with Visualization Lineage

UUIDv8 Across All Systems: Common ancestry-encoded identifiers.
Content Lineage: Track how large content was processed and chunked.
Processing Metadata: Record chunk size, overlap, and processing method.
Entity Provenance: Track which chunks contributed to each entity.
Relationship Mapping: Maintain relationships across chunk boundaries.
Semantic Coherence: Ensure knowledge graph consistency across databases.
NEW - Visualization Lineage: Complete tracking of visual representation:
- SVG File Provenance: Track all generated visualization files.
- Color Mapping Consistency: Maintain entity color assignments across chunks.
- Real-Time Update History: Log all incremental visualization updates.
- Cross-Database Visual Metadata: Synchronized visualization tracking in both Neo4j and Qdrant.
- Incremental Visualization Tracking: Complete audit trail of real-time graph updates.
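For readers unfamiliar with UUIDv8, the sketch below builds one by hand. UUID version 8 reserves only the version and variant bits and leaves the remaining 122 bits to the application. How this project encodes ancestry into those bits is not described here, so the sketch simply fills them with random bytes or with a hash of a caller-supplied payload; both choices, and the name `make_uuidv8`, are assumptions rather than the project's `generate_uuidv8()`.

```python
import hashlib
import os
import uuid


def make_uuidv8(payload=None):
    """Return a version-8 UUID.

    payload -- optional bytes; when given, the UUID is derived from its SHA-256
               hash so the same payload always maps to the same identifier.
    """
    if payload is None:
        raw = bytearray(os.urandom(16))
    else:
        raw = bytearray(hashlib.sha256(payload).digest()[:16])
    raw[6] = (raw[6] & 0x0F) | 0x80  # high nibble of byte 6 = version 8
    raw[8] = (raw[8] & 0x3F) | 0x80  # top two bits of byte 8 = RFC variant (10)
    return uuid.UUID(bytes=bytes(raw))


if __name__ == "__main__":
    print(make_uuidv8())
    print(make_uuidv8(b"Apple Inc.|ORGANIZATION"))
```

The result matches the `xxxxxxxx-xxxx-8xxx-xxxx-xxxxxxxxxxxx` shape shown in the output format, with `8` as the first digit of the third group.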

🔧 Technical Details

Core Components

app.py: Main application file with Gradio interface.
extract_text_from_url(): Web scraping functionality (app.py:41).
chunk_text(): Smart content chunking with sentence boundary detection (app.py:214).
merge_extraction_results(): Intelligent merging of chunk results (app.py:250).
get_entity_color(): Entity type color mapping (app.py:299).
create_knowledge_graph_svg(): SVG graph generation (app.py:311).
RealTimeGraphVisualizer: Real-time incremental visualization (app.py:453).
extract_entities_and_relationships(): AI-powered entity extraction with real-time updates (app.py:645).
extract_entities_and_relationships_single(): Single chunk processing (app.py:722).
build_knowledge_graph(): Main orchestration function with visualization (app.py:795).
generate_uuidv8(): UUID generation for entity tracking (app.py:68).

Data Flow with hKG Integration and Real-Time Visualization

Input Processing: Text or URL input validation.
Content Extraction: Web scraping for URLs, direct text for text input.
Real-Time Visualizer Setup: Initialize incremental graph visualization system.
Content Chunking: Smart chunking for large content (>2000 chars) with sentence boundary detection.
AI Analysis with Live Updates: Local model processes each chunk for entities/relationships.
Incremental Visualization: Real-time SVG graph updates after each chunk completion.
Result Merging: Intelligent deduplication and merging of entities/relationships across chunks.
hKG Metadata Creation: Generate processing metadata for lineage tracking.
Graph Generation: Structured knowledge graph creation with enhanced metadata.
Final Visualization: Generate complete SVG graph with all entities and relationships.
hKG Storage: Persistence in Neo4j (graph) and Qdrant (vector) with unified UUIDv8 tracking.
Output: JSON response with complete knowledge graph, hKG metadata, and SVG visualization.
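The result-merging step in this flow can be approximated as follows. This is a sketch, not the project's `merge_extraction_results()`: it assumes entities are duplicates when their case-insensitive name and type match, keeps the longer description, and treats relationships as duplicates when source, relationship, and target match. The function name `merge_chunk_results` is hypothetical.

```python
def merge_chunk_results(chunk_results):
    """Merge per-chunk extraction results, dropping duplicates."""
    entities = {}
    relationships = {}

    for result in chunk_results:
        for entity in result.get("entities", []):
            key = (entity["name"].strip().lower(), entity["type"])
            kept = entities.setdefault(key, dict(entity))
            # Prefer the most informative (longest) description seen so far.
            if len(entity.get("description", "")) > len(kept.get("description", "")):
                kept["description"] = entity["description"]

        for rel in result.get("relationships", []):
            key = (
                rel["source"].strip().lower(),
                rel["relationship"],
                rel["target"].strip().lower(),
            )
            relationships.setdefault(key, dict(rel))

    return {
        "entities": list(entities.values()),
        "relationships": list(relationships.values()),
        "entity_count": len(entities),
        "relationship_count": len(relationships),
    }


if __name__ == "__main__":
    chunk_a = {"entities": [{"name": "Apple Inc.", "type": "ORGANIZATION",
                             "description": "Company"}]}
    chunk_b = {"entities": [{"name": "apple inc.", "type": "ORGANIZATION",
                             "description": "Technology company founded in 1976"}]}
    print(merge_chunk_results([chunk_a, chunk_b]))
```

Exact-match keys are deliberately conservative: variants such as "Apple" and "Apple Inc." stay separate unless a fuzzier matching rule is added.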

🎛️ Documentation

Environment Variables Reference

All configuration is handled through environment variables. The application provides sensible defaults for all settings, allowing it to run without any configuration while still offering full customization.

Variable	Description	Default	Example Values
MODEL_PROVIDER	AI model provider to use	"ollama"	"ollama", "lmstudio"
LOCAL_MODEL	Local model identifier	"llama3.2:latest"	"llama3.2:latest", "mistral:7b", "codellama:13b"
OLLAMA_BASE_URL	Ollama API endpoint	"http://localhost:11434"	"http://localhost:11434", "http://192.168.1.100:11434"
LMSTUDIO_BASE_URL	LM Studio API endpoint	"http://localhost:1234"	"http://localhost:1234", "http://127.0.0.1:1234"
CHUNK_SIZE	Characters per chunk for AI processing	2000	1000, 2000, 4000, 8000
CHUNK_OVERLAP	Characters of overlap between chunks for context	200	100, 200, 400, 500
MAX_CHUNKS	Maximum chunks to process (0 = unlimited)	0	0, 100, 1000, 5000
HF_TOKEN	HuggingFace API token (legacy, unused)	None	"hf_xxxxxxxxxxxx"
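One way to picture how these variables and their defaults fit together is the sketch below, which reads each documented variable and falls back to the documented default. The `Settings` dataclass and `load_settings` function are hypothetical names for illustration; the application's own startup code may be organized differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    model_provider: str
    local_model: str
    ollama_base_url: str
    lmstudio_base_url: str
    chunk_size: int
    chunk_overlap: int
    max_chunks: int


def load_settings(env=None):
    """Read the documented variables, using the documented defaults when unset."""
    env = os.environ if env is None else env
    return Settings(
        model_provider=env.get("MODEL_PROVIDER", "ollama"),
        local_model=env.get("LOCAL_MODEL", "llama3.2:latest"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        lmstudio_base_url=env.get("LMSTUDIO_BASE_URL", "http://localhost:1234"),
        chunk_size=int(env.get("CHUNK_SIZE", "2000")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_chunks=int(env.get("MAX_CHUNKS", "0")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.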

Configuration Methods

1. Environment Variables (Recommended)

# Core Model Configuration
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest
export OLLAMA_BASE_URL=http://localhost:11434

# Large Content Processing
export CHUNK_SIZE=2000
export CHUNK_OVERLAP=200
export MAX_CHUNKS=0

2. Shell Configuration (.bashrc/.zshrc)

# Add to ~/.bashrc or ~/.zshrc
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest
export OLLAMA_BASE_URL=http://localhost:11434
export CHUNK_SIZE=2000
export CHUNK_OVERLAP=200
export MAX_CHUNKS=0

3. Python Environment File (.env)

# Create .env file in project root
MODEL_PROVIDER=ollama
LOCAL_MODEL=llama3.2:latest
OLLAMA_BASE_URL=http://localhost:11434
LMSTUDIO_BASE_URL=http://localhost:1234
CHUNK_SIZE=2000
CHUNK_OVERLAP=200
MAX_CHUNKS=0
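Whether the application loads a .env file on its own is not stated here. If it does not, a file in the format above can be read with a few lines of standard-library Python before the application starts. The sketch below is a deliberately small parser (`parse_env` and `apply_env` are hypothetical helpers) that handles only `KEY=VALUE` lines, blank lines, and `#` comments.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values, environ=None):
    """Copy parsed values into the environment without overriding existing ones."""
    environ = os.environ if environ is None else environ
    for key, value in values.items():
        environ.setdefault(key, value)


if __name__ == "__main__":
    sample = "# Create .env file in project root\nMODEL_PROVIDER=ollama\nCHUNK_SIZE=2000\n"
    print(parse_env(sample))
```

Using `setdefault` means variables already exported in the shell take precedence over the file.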

Model Provider Configuration

Ollama Configuration (Default)

# Basic Ollama setup
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest
export OLLAMA_BASE_URL=http://localhost:11434

# Alternative models
export LOCAL_MODEL=mistral:7b          # Mistral 7B
export LOCAL_MODEL=codellama:13b       # Code Llama 13B
export LOCAL_MODEL=llama3.2:3b         # Llama 3.2 3B (faster)
export LOCAL_MODEL=phi3:mini           # Phi-3 Mini (lightweight)

# Remote Ollama instance
export OLLAMA_BASE_URL=http://192.168.1.100:11434

LM Studio Configuration

# Basic LM Studio setup
export MODEL_PROVIDER=lmstudio
export LOCAL_MODEL=any-model-name      # Model name is flexible for LM Studio
export LMSTUDIO_BASE_URL=http://localhost:1234

# Custom LM Studio port
export LMSTUDIO_BASE_URL=http://localhost:8080

# Remote LM Studio instance
export LMSTUDIO_BASE_URL=http://192.168.1.200:1234

Large Content Processing Configuration

Chunk Size Optimization

# Small chunks (faster processing, more chunks)
export CHUNK_SIZE=1000
export CHUNK_OVERLAP=100

# Medium chunks (balanced performance)
export CHUNK_SIZE=2000    # Default
export CHUNK_OVERLAP=200  # Default

# Large chunks (fewer chunks, more context)
export CHUNK_SIZE=4000
export CHUNK_OVERLAP=400

# Very large chunks (maximum context, slower)
export CHUNK_SIZE=8000
export CHUNK_OVERLAP=800

Processing Limits

# Unlimited processing (default)
export MAX_CHUNKS=0

# Process only first 100 chunks (testing)
export MAX_CHUNKS=100

# Process first 1000 chunks (moderate datasets)
export MAX_CHUNKS=1000

# Process first 10000 chunks (large datasets)
export MAX_CHUNKS=10000
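The effect of MAX_CHUNKS can be summarized in a few lines: zero means no limit, and any positive value keeps only the first N chunks. The helper name `apply_max_chunks` is an assumption for this sketch, as is treating negative values like zero.

```python
def apply_max_chunks(chunks, max_chunks=0):
    """Keep every chunk when max_chunks is 0, otherwise only the first max_chunks."""
    if max_chunks <= 0:
        return list(chunks)
    return list(chunks[:max_chunks])


if __name__ == "__main__":
    chunks = [f"chunk {i}" for i in range(1, 6)]
    print(apply_max_chunks(chunks, 0))
    print(apply_max_chunks(chunks, 2))
```

Because the limit truncates from the front, entities that appear only in later chunks are absent from the resulting graph.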

Performance Tuning Guidelines

For Speed Optimization

# Smaller chunks, less overlap, limited processing
export CHUNK_SIZE=1000
export CHUNK_OVERLAP=50
export MAX_CHUNKS=500
export LOCAL_MODEL=llama3.2:3b  # Faster model

For Quality Optimization

# Larger chunks, more overlap, unlimited processing
export CHUNK_SIZE=4000
export CHUNK_OVERLAP=400
export MAX_CHUNKS=0
export LOCAL_MODEL=llama3.2:latest  # Full model

For Memory-Constrained Systems

# Balanced settings for limited resources
export CHUNK_SIZE=1500
export CHUNK_OVERLAP=150
export MAX_CHUNKS=1000
export LOCAL_MODEL=phi3:mini  # Lightweight model

Configuration Validation

The application performs automatic validation of configuration settings:

Model Provider: Validates MODEL_PROVIDER is either "ollama" or "lmstudio".
URLs: Validates that provider URLs are accessible.
Numeric Values: Ensures CHUNK_SIZE, CHUNK_OVERLAP, and MAX_CHUNKS are valid integers.
Model Availability: Checks if the specified model is available on the provider.
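The offline part of these checks (provider name and numeric values) might look like the sketch below. The range rules, such as requiring the overlap to be smaller than the chunk size, are assumptions that go beyond what is stated above, and `validate_settings` is a hypothetical name. URL reachability and model availability need network calls and are left out.

```python
def validate_settings(values):
    """Return a list of problems found in a mapping of variable names to raw strings."""
    problems = []

    provider = values.get("MODEL_PROVIDER", "ollama")
    if provider not in ("ollama", "lmstudio"):
        problems.append(f"MODEL_PROVIDER must be 'ollama' or 'lmstudio', got {provider!r}")

    parsed = {}
    for name, default in (("CHUNK_SIZE", "2000"), ("CHUNK_OVERLAP", "200"), ("MAX_CHUNKS", "0")):
        raw = values.get(name, default)
        try:
            parsed[name] = int(raw)
        except (TypeError, ValueError):
            problems.append(f"{name} must be an integer, got {raw!r}")

    size = parsed.get("CHUNK_SIZE")
    overlap = parsed.get("CHUNK_OVERLAP")
    # Range rules below are assumptions, not documented behavior.
    if size is not None and size <= 0:
        problems.append("CHUNK_SIZE must be positive")
    if size is not None and size > 0 and overlap is not None and not 0 <= overlap < size:
        problems.append("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    if parsed.get("MAX_CHUNKS", 0) < 0:
        problems.append("MAX_CHUNKS must be 0 (unlimited) or a positive integer")

    return problems


if __name__ == "__main__":
    print(validate_settings({"MODEL_PROVIDER": "openai", "CHUNK_SIZE": "abc"}))
```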

Configuration Troubleshooting

Common Issues and Solutions

1. Model Provider Not Responding

# Check if Ollama is running
curl http://localhost:11434/api/version

# Check if LM Studio is running
curl http://localhost:1234/v1/models

# Solution: Start the appropriate service
ollama serve  # For Ollama
# Or start LM Studio GUI and enable local server
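The same two checks can be scripted in Python using only the standard library. The endpoint paths are the ones used by the curl commands above; the helper names `probe_url` and `is_reachable` and the three-second timeout are assumptions for this sketch.

```python
import urllib.error
import urllib.request

# Paths taken from the curl commands above.
PROBE_PATHS = {"ollama": "/api/version", "lmstudio": "/v1/models"}


def probe_url(provider, base_url):
    """Build the URL to request when checking that a provider is up."""
    return base_url.rstrip("/") + PROBE_PATHS[provider]


def is_reachable(url, timeout=3.0):
    """Return True when the URL answers with a 2xx status, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    for provider, base in (("ollama", "http://localhost:11434"),
                           ("lmstudio", "http://localhost:1234")):
        url = probe_url(provider, base)
        print(provider, url, "reachable" if is_reachable(url) else "not responding")
```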

2. Model Not Found

# List available Ollama models
ollama list

# Pull missing model
ollama pull llama3.2:latest

# For LM Studio: Load model in GUI

3. Memory Issues with Large Content

# Reduce chunk size and set limits
export CHUNK_SIZE=1000
export MAX_CHUNKS=100

# Use lighter model
export LOCAL_MODEL=llama3.2:3b

4. Slow Processing

# Optimize for speed
export CHUNK_SIZE=1500
export CHUNK_OVERLAP=100
export MAX_CHUNKS=500
export LOCAL_MODEL=phi3:mini

Example Configuration Scenarios

Scenario 1: Development Setup

# Fast iteration, limited processing
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:3b
export CHUNK_SIZE=1000
export CHUNK_OVERLAP=100
export MAX_CHUNKS=50

Scenario 2: Production Setup

# High quality, unlimited processing
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest
export CHUNK_SIZE=3000
export CHUNK_OVERLAP=300
export MAX_CHUNKS=0

Scenario 3: Large Dataset Processing

# Optimized for 300MB+ files
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=llama3.2:latest
export CHUNK_SIZE=2000
export CHUNK_OVERLAP=200
export MAX_CHUNKS=0

Scenario 4: Resource-Constrained Environment

# Minimal resource usage
export MODEL_PROVIDER=ollama
export LOCAL_MODEL=phi3:mini
export CHUNK_SIZE=800
export CHUNK_OVERLAP=50
export MAX_CHUNKS=200

Advanced Configuration

Custom Model Endpoints

# Docker-based Ollama
export OLLAMA_BASE_URL=http://ollama-container:11434

# Kubernetes service
export OLLAMA_BASE_URL=http://ollama-service.default.svc.cluster.local:11434

# Load balancer
export OLLAMA_BASE_URL=http://ollama-lb.example.com:11434

Dynamic Configuration

The application reads environment variables at startup. To change configuration:

Set new environment variables.
Restart the application.
Configuration changes take effect once the application has restarted.

Error Handling

Comprehensive error handling for:

Invalid URLs or network failures.
Missing local models or API endpoints.
JSON parsing errors from LLM responses.
Malformed or empty inputs.
Database connection issues.
Invalid configuration values.
Model provider connectivity issues.
Memory constraints during large content processing.

🔍 hKG MCP Integration with Visual Lineage

The application integrates with MCP servers for hybrid knowledge graph storage with complete visualization tracking:

Neo4j: Graph database storage and querying with enhanced metadata + visualization lineage.
Qdrant: Vector database for semantic search with chunk tracking + visual metadata.
Unified Tracking: UUIDv8 across all storage systems for entity lineage + visualization provenance.
Metadata Persistence: Processing method, chunk count, content lineage + SVG generation tracking.
Large Content Support: Seamless handling of 300MB+ content via chunking + real-time visualization.
Visualization Integration: Complete visual representation tracking across all storage systems.

Enhanced hKG Features via MCP

Entity Provenance: Track which content chunks contributed to each entity + their visual representation.
Relationship Lineage: Maintain relationships across chunk boundaries + visual edge tracking.
Content Ancestry: UUIDv8 encoding for hierarchical content tracking + visualization file lineage.
Processing Audit: Complete record of how large content was processed + visualization generation.
Semantic Search: Vector similarity across knowledge graphs of any size + visual metadata search.
NEW - Visual Lineage: Complete visualization tracking including:
- SVG File Provenance: Track all generated visualization files with timestamps.
- Entity Color Consistency: Maintain color mappings across all chunks and storage systems.
- Real-Time Visualization History: Log every incremental graph update during processing.
- Cross-Database Visual Sync: Synchronized visualization metadata in Neo4j and Qdrant.
- Incremental Visualization Audit: Complete trail of real-time updates for large content.

Visualization-Enhanced Storage

Neo4j Entity Observations now include:
- SVG file paths and generation status.
- Entity color assignments for visual consistency.
- Real-time update counts for chunked processing.
- Visualization availability and engine information.
Qdrant Vector Content now includes:
- Entity color mapping information for similarity search.
- SVG generation timestamps and file paths.
- Real-time visualization update metadata.
- Incremental file tracking for large content visualization.

MCP tools are automatically available when the application runs in a Claude Code environment with MCP servers configured.

🎯 hKG Visualization Architecture

Integrated Visualization Lineage System

The hKG system now maintains complete visualization lineage alongside traditional knowledge graph storage:

┌─────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│   Source Text   │───▶│  Chunking + AI       │───▶│  Entity/Relation    │
│   (300MB+)      │    │  Processing          │    │  Extraction         │
└─────────────────┘    └──────────────────────┘    └─────────────────────┘
                                 │                           │
                                 ▼                           ▼
┌─────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│ Real-Time SVG   │◀───│  Incremental Graph   │◀───│  Merged Results     │
│ Generation      │    │  Visualization       │    │  + Deduplication    │
└─────────────────┘    └──────────────────────┘    └─────────────────────┘
         │                        │                           │
         ▼                        ▼                           ▼
┌─────────────────┐    ┌──────────────────────┐    ┌─────────────────────┐
│ SVG File        │    │  Visualization       │    │  hKG Storage        │
│ Storage         │    │  Metadata Creation   │    │  (Neo4j + Qdrant)  │
│ (Incremental)   │    │                      │    │  + Viz Metadata     │
└─────────────────┘    └──────────────────────┘    └─────────────────────┘

Visualization Metadata Flow

Real-Time Updates: Each chunk generates incremental SVG with progress tracking.
Color Consistency: Entity colors maintained across all chunks and storage systems.
File Lineage: Complete audit trail of all generated SVG files.
Cross-Database Sync: Visualization metadata synchronized in both Neo4j and Qdrant.
Provenance Tracking: Link between source chunks, entities, and their visual representation.

hKG Benefits for Large Content (300MB+)

Visual Progress Monitoring: Real-time graph evolution during processing.
Chunk-Level Visualization: Individual SVG files for each processing stage.
Complete Audit Trail: Full lineage from source text to final visualization.
Cross-Reference Capability: Link entities back to their source chunks and visual appearance.
Scalable Visualization: Handles arbitrarily large graphs with consistent performance.

📊 Development

Project Structure

KGB-mcp/
├── app.py                 # Main application
├── requirements.txt       # Dependencies
├── CLAUDE.md             # Claude Code instructions
├── ARCHITECTURE.md       # System architecture
├── test_core.py          # Core functionality tests
└── test_integration.py   # Integration tests

Testing

# Run core tests
python test_core.py

# Run integration tests
python test_integration.py

Transform any content into structured knowledge graphs with the power of local AI and MCP integration!
