On BrowseComp-Plus, RISE matches a brute-force, shell-based search agent at 78% accuracy for roughly one quarter of the per-query cost. The more important point is why retrieval for agents needs rethinking.
Current retrieval inherits a non-agentic mindset: rank the corpus, then feed the top documents to an LLM. That works when the model only needs to read facts, but agents need to interact with the corpus. Recent Direct Corpus Interaction (DCI) work lets agents use shell tools such as grep and file reads. That is fast, but unbounded: every broad command scans the whole corpus, so latency balloons as the corpus grows.
Retrieval Should Build a Bounded Playground, Not Just a Reading List
The paper argues that retrieval for agents should construct an interaction space: a bounded subset of the corpus the agent can explore with associated tools. Two design consequences: the space needs a boundary (supplied by retrieval), and the objects inside must be pre-processed for interaction. Enter RISE — Retrieving Interaction SpacE. It uses BM25 to draw that boundary, and during indexing it transforms documents for shell-style navigation.
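To make the idea concrete, here is a minimal sketch of a bounded interaction space, not the paper's implementation. It assumes a tiny in-memory corpus, a hand-rolled BM25 scorer, and a regex line search standing in for grep; the names `CORPUS`, `bm25_top_k`, and `grep`, along with the parameter values, are hypothetical and chosen for illustration only. The index-time document transformation that RISE performs is not modeled here.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: document id -> text.
CORPUS = {
    "a.txt": "BM25 ranks documents by term frequency.\nIt is a sparse retriever.",
    "b.txt": "Shell tools such as grep scan files line by line.",
    "c.txt": "A recipe for sourdough bread.\nLet the dough rest overnight.",
    "d.txt": "Agents call grep and read files to explore a corpus.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_top_k(corpus, query, k=2, k1=1.5, b=0.75):
    """Return the ids of the k documents with the highest BM25 score."""
    tokens = {doc_id: tokenize(text) for doc_id, text in corpus.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc_tokens in tokens.values():
        df.update(set(doc_tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, doc_tokens in tokens.items():
        tf = Counter(doc_tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc_tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score

    # Highest score first; ties are broken by document id for determinism.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k]


def grep(corpus, pattern, scope=None):
    """Regex line search. scope=None scans every document (unbounded);
    passing a list of ids restricts the scan to that interaction space."""
    doc_ids = corpus if scope is None else scope
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for doc_id in doc_ids:
        for line_no, line in enumerate(corpus[doc_id].splitlines(), start=1):
            if rx.search(line):
                hits.append((doc_id, line_no, line))
    return hits


# Retrieval draws the boundary once; later tool calls stay inside it.
space = bm25_top_k(CORPUS, "grep shell tools", k=2)
bounded_hits = grep(CORPUS, r"grep", scope=space)
unbounded_hits = grep(CORPUS, r"grep")
```

In this sketch the bounded call touches only the `k` documents in `space`, so its cost no longer depends on corpus size, while the unbounded call reads every document. The trade-off is that anything outside the boundary is invisible to the agent, which is why the quality of the retrieval step matters.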
Quarter the Cost, No Failures at 1M Documents
RISE-BM25 reaches 78% accuracy on BrowseComp-Plus using gpt-5.4-mini, matching the accuracy of the pure DCI baseline at roughly 25% of the per-query cost. Scaled up to 1 million documents, RISE manages 81% accuracy. Meanwhile, DCI on gpt-5.4-nano (a weaker model) collapses to 60% accuracy, with 33 out of 100 queries hitting wall-clock timeouts. That’s not just slower; it’s broken.
The numbers make the point: unbounded tool use doesn’t scale. RISE shows that a cheap BM25 boundary plus index-time preprocessing lets an agent navigate a large corpus with shell tools without scanning everything each time. It needs no exotic embeddings and no learned retrievers, just a principled shift in what retrieval is supposed to do.
Expect future agentic search systems to borrow this design: tie the retriever to the tool set, not to the prompt. The interaction space is the right abstraction.
Source: Towards Retrieving Interaction Spaces for Agentic Search
Domain: arxiv.org