Graph RAG: Using Knowledge Graphs to Enhance Retrieval-Augmented Generation
Explore how Graph RAG combines knowledge graphs with vector retrieval to answer multi-hop questions that standard RAG cannot. Covers graph construction, entity linking, and Microsoft GraphRAG.
Why Standard RAG Fails on Multi-Hop Questions
Standard vector-based RAG excels at finding passages that are semantically similar to a query. But it struggles with questions that require connecting information across multiple documents. Consider: "Which team leads worked on projects that exceeded budget in Q3 and also had customer escalations?"
This question requires linking people to projects, projects to budgets, and projects to escalations — relationships scattered across different documents. Vector similarity search retrieves isolated chunks but cannot traverse these connections. Graph RAG solves this by building a knowledge graph that explicitly represents entities and their relationships.
How Graph RAG Works
Graph RAG operates in two phases. During indexing, an LLM extracts entities (people, organizations, concepts, events) and relationships from source documents, then organizes them into a knowledge graph. During retrieval, the system uses both graph traversal and vector search to find relevant context.
Microsoft's GraphRAG implementation adds a powerful concept called community detection. It groups related entities into hierarchical communities, generates summaries for each community, and uses these summaries to answer broad questions that span the entire corpus — something standard RAG cannot do at all.
Building a Graph RAG Pipeline
Here is a practical implementation that constructs a knowledge graph from documents and queries it:
import json

import networkx as nx
from openai import OpenAI
from dataclasses import dataclass

client = OpenAI()

@dataclass
class Entity:
    name: str
    entity_type: str
    description: str

@dataclass
class Relationship:
    source: str
    target: str
    relation: str
    description: str

def extract_graph_elements(text: str) -> tuple[
    list[Entity], list[Relationship]
]:
    """Extract entities and relationships from text using an LLM."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": """Extract entities and relationships
from the text. Return JSON with:
- entities: [{name, type, description}]
- relationships: [{source, target, relation,
  description}]"""
        }, {
            "role": "user",
            "content": text
        }],
        response_format={"type": "json_object"}
    )
    data = json.loads(response.choices[0].message.content)
    # The prompt asks for a "type" key, so map it explicitly onto
    # the dataclass's entity_type field.
    entities = [
        Entity(
            name=e["name"],
            entity_type=e.get("type", ""),
            description=e.get("description", ""),
        )
        for e in data.get("entities", [])
    ]
    relationships = [
        Relationship(**r)
        for r in data.get("relationships", [])
    ]
    return entities, relationships

def build_knowledge_graph(
    documents: list[str],
) -> nx.DiGraph:
    """Build a knowledge graph from a list of documents."""
    graph = nx.DiGraph()
    for doc in documents:
        entities, relationships = extract_graph_elements(doc)
        for entity in entities:
            graph.add_node(
                entity.name,
                type=entity.entity_type,
                description=entity.description,
            )
        for rel in relationships:
            graph.add_edge(
                rel.source,
                rel.target,
                relation=rel.relation,
                description=rel.description,
            )
    return graph
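To see what the pipeline produces without spending LLM calls, you can hand-build a tiny graph with the same node and edge attributes that build_knowledge_graph assigns (the names here are illustrative, not from any real corpus):

```python
import networkx as nx

# Toy graph mirroring the schema build_knowledge_graph produces:
# nodes carry type/description, edges carry relation/description.
graph = nx.DiGraph()
graph.add_node("Alice", type="person", description="Team lead")
graph.add_node("Project X", type="project", description="Q3 initiative")
graph.add_edge("Alice", "Project X", relation="leads",
               description="Alice leads Project X")

# Attributes are plain dicts, retrievable per node or per edge
edge = graph.edges["Alice", "Project X"]
print(edge["relation"])  # leads
```

This is exactly the structure the traversal code in the next section walks over.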
Querying the Knowledge Graph
Once the graph is built, you combine graph traversal with traditional retrieval:
def graph_rag_query(
    query: str,
    graph: nx.DiGraph,
    depth: int = 2,
) -> str:
    """Answer a query using knowledge graph traversal."""
    # Step 1: Identify entities mentioned in the query
    entity_response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Extract entity names from: {query}"
        }],
    )
    query_entities = entity_response.choices[0].message.content

    # Step 2: Find matching nodes and their neighborhoods
    context_parts = []
    for node in graph.nodes():
        if node.lower() in query_entities.lower():
            # Get the local subgraph around this entity
            subgraph = nx.ego_graph(graph, node, radius=depth)
            for u, v, data in subgraph.edges(data=True):
                context_parts.append(
                    f"{u} --[{data['relation']}]--> {v}: "
                    f"{data.get('description', '')}"
                )
    context = "\n".join(context_parts)

    # Step 3: Generate answer using graph context
    answer = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "system",
            "content": "Answer using the knowledge graph context."
        }, {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {query}"
        }],
    )
    return answer.choices[0].message.content
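The depth parameter controls how many hops of relationships get pulled into context via nx.ego_graph. A quick deterministic check on a made-up chain (names are illustrative) shows the effect; note that on a DiGraph, ego_graph follows outgoing edges by default:

```python
import networkx as nx

# A multi-hop chain: person -> project -> budget -> escalation
chain = nx.DiGraph()
chain.add_edge("Alice", "Project X", relation="leads")
chain.add_edge("Project X", "Q3 Budget", relation="exceeded")
chain.add_edge("Q3 Budget", "Escalation #7", relation="triggered")

# radius controls how many hops of context are collected
one_hop = nx.ego_graph(chain, "Alice", radius=1)
two_hop = nx.ego_graph(chain, "Alice", radius=2)
print(sorted(one_hop.nodes()))  # ['Alice', 'Project X']
print(sorted(two_hop.nodes()))  # ['Alice', 'Project X', 'Q3 Budget']
```

For multi-hop questions like the budget-escalation example in the introduction, depth=2 or 3 is usually needed to pull the full relationship chain into the prompt.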
Community Summaries for Global Questions
Microsoft GraphRAG's key innovation is community-level summarization. After building the graph, the Leiden algorithm clusters densely connected entities into communities. Each community gets an LLM-generated summary. When a broad question like "What are the main themes across all research?" arrives, the system queries community summaries rather than individual chunks — enabling corpus-wide reasoning that standard RAG cannot achieve.
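The clustering step can be sketched with networkx. The Leiden algorithm itself requires an extra dependency, so this sketch substitutes networkx's built-in Louvain method, which produces a similar modularity-based partition (the toy graph and names are invented for illustration):

```python
import networkx as nx

# Two tightly connected clusters joined by a single bridge edge.
# Louvain stands in here for Leiden, which needs an extra package.
g = nx.Graph()
g.add_edges_from([
    ("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Alice"),  # cluster 1
    ("Dave", "Eve"), ("Eve", "Frank"), ("Frank", "Dave"),    # cluster 2
    ("Carol", "Dave"),                                       # bridge
])
communities = nx.community.louvain_communities(g, seed=42)

# In GraphRAG, each community would now get an LLM-generated summary
for c in sorted(communities, key=min):
    print(sorted(c))
```

Each resulting community is then summarized once at indexing time, so global questions can be answered from a handful of summaries instead of thousands of chunks.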
When Graph RAG Outperforms Standard RAG
Graph RAG shines with multi-hop reasoning questions, corpus-wide summarization tasks, and domains with rich entity relationships like legal, medical, and financial documents. The tradeoff is higher indexing cost because every document must be processed by an LLM to extract entities and relationships, and the graph must be maintained as documents change.
FAQ
How much does it cost to build a knowledge graph with LLM extraction?
Expect roughly 2-5x the cost of standard embedding-based indexing because every document chunk requires an LLM call for entity and relationship extraction. For a corpus of 10,000 documents, this might cost $50-200 depending on document length and model choice. The investment pays off when your use case involves complex relational questions.
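A back-of-the-envelope calculation shows how such an estimate comes together. The token counts and per-token rates below are illustrative assumptions, not quoted prices; substitute your model's actual pricing:

```python
# Rough indexing-cost estimate. All four inputs are assumptions --
# plug in your real chunk sizes and your provider's current rates.
docs = 10_000
input_tokens_per_doc = 800    # prompt + chunk text (assumption)
output_tokens_per_doc = 300   # extracted JSON (assumption)
price_in_per_1m = 2.50        # $/1M input tokens (assumption)
price_out_per_1m = 10.00      # $/1M output tokens (assumption)

cost = (docs * input_tokens_per_doc / 1e6 * price_in_per_1m
        + docs * output_tokens_per_doc / 1e6 * price_out_per_1m)
print(f"${cost:.2f}")  # $50.00
```

With these assumptions the corpus lands at the low end of the range; longer documents, multiple chunks per document, or a pricier model push it toward the high end.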
Can I use Graph RAG with an existing vector store?
Yes, and this is the recommended approach. Use vector search for semantic similarity retrieval and graph traversal for relational queries, then merge the results. This hybrid approach gives you the best of both worlds — semantic matching plus structured relationship reasoning.
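A minimal sketch of the merge step, assuming you already have ranked results from each side (merge_context and the sample strings are hypothetical, not part of any library):

```python
# Hybrid retrieval: interleave vector-store hits with graph-derived
# facts, deduplicating while preserving each source's rank order.
def merge_context(vector_hits: list[str], graph_facts: list[str],
                  limit: int = 5) -> list[str]:
    """Interleave the two sources, skipping duplicates."""
    merged: list[str] = []
    seen: set[str] = set()
    for pair in zip(vector_hits, graph_facts):  # zip stops at shorter list
        for item in pair:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged[:limit]

vector_hits = ["chunk about Project X budget", "chunk about Alice"]
graph_facts = ["Alice --[leads]--> Project X", "chunk about Alice"]
print(merge_context(vector_hits, graph_facts))
```

Production systems typically use a more principled fusion such as reciprocal rank fusion, but the dedup-and-interleave pattern captures the core idea.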
What is the difference between Microsoft GraphRAG and building my own?
Microsoft GraphRAG provides community detection, hierarchical summarization, and global search capabilities out of the box. Building your own gives you more control over entity extraction and graph structure but requires implementing community detection and summarization yourself. For most teams, starting with Microsoft GraphRAG and customizing from there is the faster path.
#GraphRAG #KnowledgeGraphs #RAG #MicrosoftGraphRAG #EntityLinking #AgenticAI #LearnAI #AIEngineering
CallSphere Team
Expert insights on AI voice agents and customer communication automation.