Knowledge Graph RAG for Aerospace Technical Documentation

Hybrid KG-RAG system answering multi-hop, relationship-dense queries about F-16 aircraft

Faithfulness: 100%

Context Recall: 94.0%

Context Precision: 92.5%

Multi-Hop Accuracy: 91.0%

The Problem

Technical manuals for complex systems like the F-16 contain thousands of pages of relationship-dense specifications, procedures, subsystem dependencies, and failure conditions.

Solution

Built a hybrid Knowledge Graph RAG system capable of answering multi-hop, relationship-dense queries with high precision and low latency. The orchestration layer is built on an extended and modified version of LightRAG, adapted for dual-cloud storage, domain-specific schema enforcement, entity deduplication, and source citation in output.

Comparative Analysis

Naive RAG: retrieves semantically similar chunks effectively, but struggles with multi-hop reasoning, exact subsystem identifiers, and relationship-aware retrieval across interconnected technical systems.
Microsoft GraphRAG: improves relational reasoning through graph community summarization, but introduces higher latency and retrieval cost for targeted technical queries.
LightRAG-based architecture: extends LightRAG with schema-constrained extraction, hybrid retrieval routing, entity deduplication, and source-grounded responses using Neo4j + Qdrant for low-latency technical intelligence workflows.
Custom KG-RAG pipeline: also built, using Ollama-based extraction on a custom knowledge graph dataset, to evaluate schema quality, relationship extraction accuracy, and ingestion tradeoffs.

Key Engineering Learnings

Schema design is foundational

Defining entity and relationship types before extraction is one of the most consequential design decisions in a KG-RAG system. A poorly specified schema produces a graph that is technically valid but useless for the queries that matter.
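As a rough illustration of what schema enforcement can look like at extraction time, the sketch below rejects any extracted triple whose entity or relationship types fall outside a predeclared schema. The type names, the `Triple` class, and the helper functions are hypothetical and are not the project's actual schema or code.

```python
from dataclasses import dataclass

# Hypothetical schema: illustrative type names, not the project's real ones.
ENTITY_TYPES = {"Subsystem", "Component", "Procedure", "FailureCondition"}

# Relation name -> (allowed source type, allowed target type).
RELATION_TYPES = {
    "PART_OF": ("Component", "Subsystem"),
    "DEPENDS_ON": ("Subsystem", "Subsystem"),
    "MITIGATED_BY": ("FailureCondition", "Procedure"),
}


@dataclass(frozen=True)
class Triple:
    source: str
    source_type: str
    relation: str
    target: str
    target_type: str


def conforms(triple: Triple) -> bool:
    """Return True only if the triple uses declared types in a declared pairing."""
    if triple.source_type not in ENTITY_TYPES:
        return False
    if triple.target_type not in ENTITY_TYPES:
        return False
    allowed = RELATION_TYPES.get(triple.relation)  # None for unknown relations
    return allowed == (triple.source_type, triple.target_type)


def filter_triples(triples):
    """Split extracted triples into schema-conforming and rejected lists."""
    accepted = [t for t in triples if conforms(t)]
    rejected = [t for t in triples if not conforms(t)]
    return accepted, rejected
```

Keeping the rejected triples, rather than silently dropping them, makes it possible to review what the extractor produces outside the schema and decide whether the schema or the extraction prompt needs adjusting.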

Hybrid retrieval consistently outperforms either approach alone

Pure vector search misses precise structural relationships. Pure graph traversal misses semantically related content. The hybrid approach (vector search, graph traversal, keyword search) achieves the strongest results across all query types.
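The text does not say how results from the three retrievers are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than as the system's actual method. The function name, the chunk IDs, and the constant `k=60` (a conventional default) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of chunk IDs into one ranking.

    Each chunk scores 1 / (k + rank) for every list it appears in,
    so chunks found by multiple retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical retriever outputs, each ordered best-first.
vector_hits = ["c1", "c2", "c3"]
graph_hits = ["c2", "c4"]
keyword_hits = ["c2", "c1"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits, keyword_hits])
```

Because RRF uses only ranks, it avoids having to normalize similarity scores, graph path weights, and keyword scores onto a common scale.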

KG construction and graph retrieval latency are real constraints

Building a knowledge graph from documents (entity extraction, relationship parsing, graph ingestion) is significantly slower than vector indexing. Graph retrieval also adds overhead compared to approximate nearest-neighbor search.

Entity deduplication matters more than it appears

Without explicit deduplication, the same real-world entity appears as multiple disconnected nodes, fragmenting the graph and degrading retrieval quality.
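A minimal sketch of one deduplication strategy follows: normalize surface forms to a canonical key, resolve known abbreviations through an alias table, and rewrite edges to use the canonical key. The alias table, entity names, and function names are hypothetical; the project's own deduplication logic is not described in enough detail to reproduce here.

```python
import re
from collections import defaultdict

# Hypothetical alias table; in practice this might come from a domain glossary.
ALIASES = {"fcs": "flight control system"}


def canonical_key(name: str) -> str:
    """Lowercase, replace punctuation runs with spaces, then resolve aliases."""
    key = re.sub(r"[^a-z0-9]+", " ", name.casefold()).strip()
    return ALIASES.get(key, key)


def merge_entities(edges):
    """Rewrite (source, relation, target) edges onto canonical entity keys.

    Returns the deduplicated edge set and a map from each canonical key
    to the original surface forms that were merged into it.
    """
    merged = set()
    surface_forms = defaultdict(set)
    for source, relation, target in edges:
        s, t = canonical_key(source), canonical_key(target)
        surface_forms[s].add(source)
        surface_forms[t].add(target)
        merged.add((s, relation, t))
    return merged, dict(surface_forms)
```

String normalization only catches spelling and formatting variants. Entities that differ in wording but refer to the same thing would need the alias table extended, or an additional similarity-based matching step.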

Use Cases

Best suited for organizations working with large, highly interconnected document sets: technical manuals, regulatory filings, legal archives, and engineering knowledge bases, where standard search or RAG fails to surface relationship-dependent answers.

Tech Stack

Graph Database

Neo4j Aura, Entity Deduplication, Schema Constraints

Vector Search

Qdrant Cloud (GCP), ANN Search

Orchestration

LightRAG (modified), Hybrid Retrieval

Inference

Groq LPU, Llama-3.3-70B, Streamlit

Want to Work Together?

Need intelligent Q&A over dense technical documentation or a knowledge base? A hybrid KG-RAG approach can dramatically outperform standard RAG. Let's discuss your use case.
