SproutRAG Boosts Long-Document Retrieval by 6.1% With Attention-Guided Tree Search

A hierarchical RAG framework that organizes sentence chunks into a binary tree using learned attention, then retrieves at multiple granularities without extra LLM calls or lossy summarization.

SproutRAG outperforms the strongest baseline by 6.1% in average information efficiency across four long-document benchmarks, without a single extra LLM call during retrieval.

Most retrieval-augmented generation (RAG) pipelines choke on long documents because they either split text into fixed-size chunks (losing context) or rely on expensive LLM-generated summaries (losing fidelity). Amir Abaskohi and collaborators at the University of Toronto and Vector Institute sidestep both traps with a hierarchical framework that learns the document's structure from its own attention patterns.

How SproutRAG Builds a Context Tree From Attention Heads

The core trick: SproutRAG treats sentence-level embeddings as leaves, then iteratively merges the most closely related pairs, scoring relatedness with the attention heads and layers that best capture inter-sentence relevance. The result is a binary tree in which each internal node represents a progressively larger but still coherent text block. No LLM calls, no hand-crafted chunking rules.
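The bottom-up merge described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: the `Node` class and `build_tree` function are hypothetical names, plain cosine similarity stands in for the learned attention-derived relevance score, and only neighboring spans are merged so that every node covers a contiguous block of sentences.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence


@dataclass
class Node:
    start: int                      # index of the first sentence covered (inclusive)
    end: int                        # index of the last sentence covered (inclusive)
    vec: List[float]                # embedding representing the covered span
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_tree(embeddings: Sequence[Sequence[float]]) -> Node:
    """Greedily merge the most similar pair of neighboring spans until one root remains."""
    if not embeddings:
        raise ValueError("need at least one sentence embedding")
    nodes = [Node(i, i, list(v)) for i, v in enumerate(embeddings)]
    while len(nodes) > 1:
        # ASSUMPTION: cosine similarity is a stand-in for the learned attention score.
        best = max(
            range(len(nodes) - 1),
            key=lambda i: cosine(nodes[i].vec, nodes[i + 1].vec),
        )
        left, right = nodes[best], nodes[best + 1]
        wl = left.end - left.start + 1
        wr = right.end - right.start + 1
        # Parent embedding: length-weighted mean of the two children (an assumption).
        merged = [(wl * x + wr * y) / (wl + wr) for x, y in zip(left.vec, right.vec)]
        nodes[best:best + 2] = [Node(left.start, right.end, merged, left, right)]
    return nodes[0]


root = build_tree([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]])
```

With these four toy embeddings, the first two sentences merge with each other, as do the last two, before the root joins both halves.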

At retrieval time, SproutRAG runs a hierarchical beam search over this tree. It gathers candidates at multiple granularities simultaneously, pulling out a multi-sentence passage when a single sentence lacks context, but falling back to finer chunks when the signal is sharp. This multi-granularity retrieval is what flat RAG or single-level context expansion cannot do without blowing up cost.
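A minimal sketch of a top-down beam search over such a tree follows. It rests on assumptions: the `Node` layout, the `beam_search` signature, and the cosine scoring against a query vector are all hypothetical, and the paper's actual scoring and pruning rules may differ. The point it illustrates is that every node visited along the beam, whether a large block or a single sentence, competes in one candidate pool.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    text: str                       # text of the span this node covers
    vec: List[float]                # embedding of that span
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def beam_search(
    root: Node, query: Sequence[float], beam_width: int = 2, top_k: int = 3
) -> List[Tuple[float, str]]:
    """Descend level by level, keeping the beam_width best children at each step."""
    candidates = [(cosine(query, root.vec), root)]
    beam = [root]
    while beam:
        children = [c for n in beam for c in (n.left, n.right) if c is not None]
        scored = sorted(
            ((cosine(query, c.vec), c) for c in children),
            key=lambda pair: pair[0],
            reverse=True,
        )
        kept = scored[:beam_width]
        candidates.extend(kept)     # coarse and fine nodes share one candidate pool
        beam = [node for _, node in kept]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, node.text) for score, node in candidates[:top_k]]


# Toy tree: two sentence pairs under a single root.
a, b = Node("A", [1.0, 0.0]), Node("B", [0.8, 0.2])
c, d = Node("C", [0.0, 1.0]), Node("D", [0.1, 0.9])
left = Node("A B", [0.9, 0.1], a, b)
right = Node("C D", [0.05, 0.95], c, d)
root = Node("A B C D", [0.475, 0.525], left, right)

results = beam_search(root, [1.0, 0.0], beam_width=1, top_k=2)
```

Here the search returns the single sentence "A" first and its parent block "A B" second, so a fine-grained hit and its surrounding context are ranked in the same pass.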

Joint Training Lifts Embeddings and Tree Structure Together

Standard RAG separates embedding training from retrieval logic. SproutRAG trains end-to-end with a joint objective that optimizes both the embedding space and the tree construction heads. The model learns to favor the attention heads that produce semantically coherent merges, making the tree itself a better retrieval index.
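The article does not give the loss itself, so the following is only a generic illustration of what a joint objective of this kind can look like. Everything in it is an assumption: softmax weights over heads, an InfoNCE-style retrieval term, a binary cross-entropy merge term, and the `lam` weighting are common choices swapped in for illustration, and all function names are hypothetical.

```python
from math import exp, log
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    m = max(logits)
    exps = [exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_affinity(per_head_scores: Sequence[float], head_logits: Sequence[float]) -> float:
    """Mix per-head merge scores (each in [0, 1]) using learnable head weights."""
    weights = softmax(head_logits)
    return sum(w * s for w, s in zip(weights, per_head_scores))


def retrieval_loss(pos_score: float, neg_scores: Sequence[float], temperature: float = 0.05) -> float:
    """InfoNCE-style term: the relevant chunk should outscore the negatives."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    return -log(softmax(logits)[0])


def merge_loss(affinity: float, should_merge: bool) -> float:
    """Binary cross-entropy on whether two spans belong in the same block."""
    p = min(max(affinity, 1e-7), 1.0 - 1e-7)
    return -log(p) if should_merge else -log(1.0 - p)


def joint_loss(pos_score, neg_scores, per_head_scores, head_logits, should_merge, lam=0.5):
    # ASSUMPTION: a simple weighted sum; the paper's actual objective may differ.
    affinity = combined_affinity(per_head_scores, head_logits)
    return retrieval_loss(pos_score, neg_scores) + lam * merge_loss(affinity, should_merge)


# Two heads: the second one scores this (correct) merge much higher.
good = joint_loss(0.9, [0.2, 0.1], [0.2, 0.9], head_logits=[0.0, 5.0], should_merge=True)
bad = joint_loss(0.9, [0.2, 0.1], [0.2, 0.9], head_logits=[5.0, 0.0], should_merge=True)
```

In this toy setup the loss is lower when the head logits favor the head that scores the correct merge highly, which is the kind of pressure that would push training toward informative heads.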

Evaluation spans four benchmarks: HotpotQA (multi-hop questions over Wikipedia), 2WikiMultihopQA (multi-hop), QASPER (questions over NLP research papers), and QASA (scientific articles). SproutRAG posts consistent information efficiency (IE) gains across all four, peaking at 8.1% on QASA. The paper includes ablation studies indicating that the attention-guided tree construction and the hierarchical beam search each contribute roughly half of the total lift.

Code is on GitHub at github.com/AmirAbaskohi/SproutRAG, enabling direct reproduction. The framework opens a clear path toward retrieval systems that understand document structure as well as a reader does, without burning inference budget on LLM orchestrators.


Source: SproutRAG: Attention-Guided Tree Search with Progressive Embeddings for Long-Document RAG
Domain: arxiv.org
