
TASR Retains 95% F1 While Slashing Retrieval Calls by 37%

A training-free stopping rule based on logit margin beats fixed-k baselines and learned stop policies across 24 configurations.


Iterative RAG agents running on a fixed retrieval budget spend many of their calls on rounds that change neither the answer nor the evidence. TASR, a one-line stopping rule introduced in arXiv:2606.13814, cuts roughly 37% of the calls made by fixed-k=5 without any training or fine-tuning.

How TASR's One-Line Rule Works

The rule fires when two conditions hold: the model's normalized answer matches the one from the previous round, and the isotonically calibrated logit margin exceeds 0.25. No classifier, no value head, no learned policy. On a 3-model × 2-dataset distractor grid (six cells), TASR retains 94.8% of fixed-k=5's macro F1 while making only 62.6% of its calls, a 37.4% reduction. Against fixed-k=3, it gains +3.42 F1 at essentially the same call count.
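The two conditions can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' released code: the SQuAD-style normalization, the function names, and the `step` callback (assumed to return an answer together with an already calibrated margin) are all assumptions made for the example. Only the repeat-answer check and the 0.25 threshold come from the article.

```python
import re
import string
from typing import Callable, Optional, Tuple

MARGIN_THRESHOLD = 0.25  # fixed threshold reported in the article


def normalize_answer(text: str) -> str:
    """SQuAD-style normalization (an assumption; the article does not define it)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def should_stop(
    prev_answer: Optional[str],
    curr_answer: str,
    calibrated_margin: float,
    threshold: float = MARGIN_THRESHOLD,
) -> bool:
    """Stop only if the answer repeats AND the calibrated margin exceeds the threshold."""
    if prev_answer is None:  # first round: nothing to compare against
        return False
    repeated = normalize_answer(prev_answer) == normalize_answer(curr_answer)
    return repeated and calibrated_margin > threshold


def run_with_stopping(
    step: Callable[[int], Tuple[str, float]],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Run a hypothetical retrieve-then-answer `step` until the rule fires.

    `step(round_index)` is assumed to perform one retrieval call and return
    (answer, calibrated_margin). Returns (final_answer, calls_made).
    """
    prev: Optional[str] = None
    answer = ""
    for round_index in range(1, max_rounds + 1):
        answer, margin = step(round_index)
        if should_stop(prev, answer, margin):
            return answer, round_index
        prev = answer
    return answer, max_rounds
```

With `max_rounds=5` the loop degrades to fixed-k=5 whenever the rule never fires, so the budget acts as an upper bound rather than a fixed cost.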

The fixed-k=3 comparison holds across nine open-domain BM25 cells: TASR scores 55.01 F1 at 2.98 calls versus 54.33 F1 at 3.00 calls for fixed-k=3. On nine dense-retrieval cells spanning two retriever families, TASR shows no significant regressions. The 0.25 threshold was not tuned per task; it was fixed once and reused everywhere.

Why Logit Margins Beat Verbalized Confidence

The authors show why verbalized confidence fails as a stopping signal on RLHF-tuned models: 96.5% of the stated confidence values equal 5, so the signal carries an entropy of just 0.182 nats. Logit margins achieve 44x better class-conditional separation. That gap is measurable, reproducible, and grounded in a concrete model pathology.
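To see how little information such a skewed signal carries, the entropy can be bounded with a short calculation. The article gives only the 96.5% figure, so the sketch below assumes a 5-point scale and tries two hypothetical splits of the remaining 3.5%: all on one other value, or spread evenly over four. The true distribution is not stated in the article; these splits only bracket it under that assumption.

```python
import math
from typing import Sequence


def entropy_nats(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


TOP_MASS = 0.965       # share of confidence values equal to 5 (from the article)
REST = 1.0 - TOP_MASS  # remaining mass; its exact split is not given

# Hypothetical split 1: all remaining mass on a single other value.
concentrated = [TOP_MASS, REST]
# Hypothetical split 2: remaining mass spread evenly over four other values
# (assumes a 5-point scale, which the article does not confirm).
spread = [TOP_MASS] + [REST / 4] * 4

low = entropy_nats(concentrated)
high = entropy_nats(spread)
uniform = math.log(5)  # entropy of a uniform 5-point signal, for comparison

print(f"{low:.3f} <= H <= {high:.3f} nats; uniform 5-point scale: {uniform:.3f} nats")
```

Under these assumptions the reported 0.182 nats falls inside the bracket, and both bounds sit far below the entropy of a uniform 5-point scale, which is why a threshold on verbalized confidence has almost nothing to separate.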

TASR was selected from an exhaustive enumeration of 381 candidate stopping rules. No alternative Pareto-dominates it on any evaluated configuration. That is a strong claim: among hundreds of candidate predicates, none matches TASR on both F1 and call count while beating it on at least one.
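Pareto dominance over the two axes used here (higher F1, fewer calls) is easy to check mechanically. The sketch below is an illustration of the definition, not the authors' enumeration code. The TASR and fixed-k=3 points are the open-domain BM25 figures quoted above; the third point is made up to show a rule that is neither dominated nor dominating.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (F1, mean retrieval calls)


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is no worse than `b` on both axes and strictly better on one."""
    f1_a, calls_a = a
    f1_b, calls_b = b
    no_worse = f1_a >= f1_b and calls_a <= calls_b
    strictly_better = f1_a > f1_b or calls_a < calls_b
    return no_worse and strictly_better


def pareto_front(rules: Dict[str, Point]) -> List[str]:
    """Names of rules that no other rule dominates, in insertion order."""
    return [
        name
        for name, point in rules.items()
        if not any(
            dominates(other_point, point)
            for other_name, other_point in rules.items()
            if other_name != name
        )
    ]


rules = {
    "tasr": (55.01, 2.98),               # open-domain BM25 figures from the article
    "fixed_k3": (54.33, 3.00),           # open-domain BM25 figures from the article
    "hypothetical_rule": (56.00, 4.00),  # made-up point, for illustration only
}

print(pareto_front(rules))
```

Here fixed-k=3 drops off the front because TASR is better on both axes, while the made-up rule stays on it: it buys extra F1 with extra calls, which is a tradeoff rather than a win.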

A Pareto Baseline for Future Controllers

TASR does not claim to be optimal; it provides an auditable, training-free Pareto baseline. Any learned stopping controller that cannot beat this one-line rule on both F1 and call count is not worth the training cost. Code is public for reproduction.


Source: TASR: Training-Free Adaptive Stopping for Iterative Retrieval
Domain: arxiv.org
