HippoRAG Runs Personalized PageRank on Neptune for Single-Pass Multi-Hop QA

HippoRAG hit 0.8234 on a PageRank score when asked “What is the connection between Leonardo da Vinci and France?” — pulling together a death location and a royal patron from separate documents in a single graph traversal.

That’s the difference between this neurobiologically inspired RAG variant and the rest: it doesn’t iterate over documents. It builds a knowledge graph, seeds it with query entities, and runs Personalized PageRank to propagate relevance across edges. One pass, multi-hop answer.

Why HippoRAG Ditches Iterative Retrieval

Standard RAG treats each document independently. For “Who painted the Mona Lisa and where is it housed?” you’d need two separate retrievals, then merge. HippoRAG stores entities and relations as a graph. Amazon Bedrock extracts triples from each passage, writes them into Amazon Neptune as nodes and edges, and then uses Neptune Analytics to execute Personalized PageRank.

The original HippoRAG paper (Stanford, 2024) showed this works. The AWS stack makes it enterprise-ready: Neptune for the graph, Bedrock for LLM calls (Claude 3.5 Haiku in the demo), and Titan Embeddings for vector similarity on phrase nodes. No custom graph engine needed.

Building the Knowledge Graph with Bedrock and Neptune

The pipeline reads HotpotQA JSON, runs each paragraph through us.anthropic.claude-3-5-haiku-20241022-v1:0 to extract subject-relation-object triples, and writes CSV files for Neptune’s bulk loader. Four CSV files: phrase nodes, passage nodes, relation edges, context edges. Every entity gets a UUID, every relation a label.

Phrase nodes are also embedded with amazon.titan-embed-text-v2:0. When a query comes in, entity matching uses both string similarity and vector search via Titan embeddings. The matched entities become seed nodes for PageRank.

Personalized PageRank as a Single-Step Multi-Hop Engine

Here’s the key call: CALL neptune.algo.pagerank({sourceNodes: [seed_list], dampingFactor: 0.85, personalized: true}). That runs inside Neptune Analytics, returning top-scored nodes. The demo uses 20 iterations, tolerance 1e-4, returns top 100 results. Those scores then rank passages.

Compare the result: for “Which Stanford professor works on the neuroscience of Alzheimer’s?” — three seed entities (Stanford, neuroscience, Alzheimer’s). PageRank propagates through co-occurring passages, professors, and research topics in one pass. The top-ranked documents are the ones that connect all three, not just match one keyword.

What 0.8234 Actually Means for Enterprise QA

In the demo, the simple query “Who painted the Mona Lisa?” gets 0.8234 on the correct document. That’s high precision. The multi-hop query about Leonardo and France gets 0.9156. Because PageRank assigns scores based on graph centrality relative to the seeds, not just term frequency.

This matters for legal case analysis, medical literature reviews, or any domain where answers require connecting facts from multiple sources. No more chaining retrieval calls and hoping the LLM reconciles contradictions. The graph does the linking.

Next time you need multi-hop QA, skip the iterative loop. Build a graph, seed it, and let PageRank do the heavy lifting. Neptune Analytics can handle graphs of hundreds of millions of nodes; the bottleneck is just the LLM triple extraction, which Bedrock scales trivially.

Source: HippoRAG: Neurobiologically inspired RAG using Amazon Bedrock, Amazon Neptune, and personalized PageRank
Domain: aws.amazon.com