
Dr-DCI Reaches 73.3% Accuracy on Browsecomp-Plus Through Dynamic Expansion of the Search Workspace

By treating retrieval as an agent action that pulls documents into a local workspace, Dr-DCI avoids the instability of full-corpus shell operations while scaling from 100K to 10M documents.


Dr-DCI reaches 73.3% accuracy on the Browsecomp-Plus benchmark, improving over raw DCI by up to 8.3 points while reducing tool usage, wall time, and estimated cost.

Why Full-Corpus Shell Commands Break at Scale

Agentic search over large corpora has typically meant a trade-off. You can use a retriever like BM25 or ColBERT to rank candidates, but then you only get bounded document views. Or you can expose the full corpus to shell-executable operations, an approach called Direct Corpus Interaction (DCI), to filter, compare, and verify across documents. The problem is that full-corpus terminal commands become slow and unstable as the corpus grows. Past a few hundred thousand documents, DCI degrades in both performance and reliability.

Dr-DCI's Retriever-Steered Workspace

The paper behind Dr-DCI treats retrieval as an agent-callable action for expanding a local workspace. Instead of operating over the entire corpus, the agent dynamically pulls relevant documents into a growing local environment and runs DCI operations there. This design keeps exploration scalable through the retriever, while preserving the local precision of shell-style operations for evidence resolution.

Ranked previews and inter-document DCI are the critical components. Ablation experiments show that accuracy drops significantly when either one is removed.
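
To make the workspace idea concrete, here is a minimal Python sketch of retrieval as an agent-callable action followed by a local, grep-like check. It is an illustration under stated assumptions, not the paper's implementation: the toy corpus, the term-overlap ranker (a stand-in for BM25 or ColBERT), and the names `retrieve`, `expand_workspace`, and `grep_workspace` are all hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical toy corpus; the real setting holds 100K to 10M documents.
CORPUS = {
    "doc1": "The Eiffel Tower was completed in 1889 in Paris.",
    "doc2": "The Brooklyn Bridge opened in 1883 in New York.",
    "doc3": "Paris hosted the World's Fair in 1889.",
    "doc4": "Bananas are rich in potassium.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, k=2):
    """Rank documents by term overlap (an assumed stand-in for BM25 or ColBERT)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(text)), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked[:k] if score > 0]


def expand_workspace(workspace, query, k=2):
    """Agent-callable action: copy top-ranked documents into the local workspace."""
    added = []
    for doc_id in retrieve(query, k):
        path = workspace / f"{doc_id}.txt"
        if not path.exists():  # the workspace only grows; skip documents already present
            path.write_text(CORPUS[doc_id])
            added.append(doc_id)
    return added


def grep_workspace(workspace, pattern):
    """Grep-like scan over workspace files only, never over the full corpus."""
    regex = re.compile(pattern)
    return sorted(p.stem for p in workspace.glob("*.txt") if regex.search(p.read_text()))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        first = expand_workspace(workspace, "Eiffel Tower Paris")
        second = expand_workspace(workspace, "bridge opened New York")
        hits = grep_workspace(workspace, r"1889")  # cross-document check on a shared year
        return first, second, hits


if __name__ == "__main__":
    print(demo())
```

In this sketch the cost of the grep-like scan depends on the size of the workspace rather than the size of the corpus, which is the property the design relies on. A real agent would decide when to call the expansion action and would run actual shell tools over the workspace directory.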

Results: 8-Point Accuracy Gains and 10M+ Document Scaling

On the Browsecomp-Plus benchmark, Dr-DCI hits 71.2% accuracy without workspace-preserving context reset, and 73.3% with it. That beats raw DCI by up to 8.3 points, and it does so with lower tool usage, faster wall time, and lower estimated cost.

Scaling experiments are where the approach really differentiates itself. From 100K to 10M documents, Dr-DCI maintains stable performance while raw DCI becomes unstable and BM25 falls off substantially. The method also handles a 20M-scale file-per-document Wiki-18 QA setting, averaging 63.0 across six benchmarks and outperforming both retrieval-based and trained search-agent baselines.

This architecture means you don't have to choose between recall and local precision. For anyone building agents that need to verify constraints across large document collections, Dr-DCI offers a concrete path that holds up at scale.


Source: Dr-DCI: Scaling Direct Corpus Interaction via Dynamic Workspace Expansion
Domain: arxiv.org
