TASR retains 95% of F1 while cutting retrieval calls by 37%

A training-free stopping rule based on the logit margin beats fixed baselines and learned stopping policies across 24 configurations.

tasr, iterative retrieval, retrieval augmented generation, arxiv, training free, stopping rule

Iterative RAG agents waste 37% of their retrieval calls on rounds that change neither the answer nor the evidence. TASR, a one-line stopping rule from the authors of arXiv:2606.13814, cuts that waste without any training or fine-tuning.

How TASR's One-Line Rule Works

The rule fires when two conditions hold: the model repeats its previous-round normalized answer, and the isotonically calibrated logit margin exceeds 0.25. No classifier, no value head, no learned policy. On a 3-model × 2-dataset distractor grid, TASR retains 94.8% of fixed-k=5's macro F1 while making only 62.6% of its calls, a 37.4% reduction. Against fixed-k=3, it gains +3.42 F1 at essentially the same call count.
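The two-condition rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the normalization (lowercasing, stripping punctuation and articles) is an assumed SQuAD-style convention the article does not specify, and the isotonically calibrated margin is taken as an input rather than computed.

```python
import re
import string
from typing import Optional

THRESHOLD = 0.25  # fixed threshold stated in the article


def normalize_answer(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation and articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def should_stop(
    prev_answer: Optional[str],
    curr_answer: str,
    calibrated_margin: float,
    threshold: float = THRESHOLD,
) -> bool:
    """Stop only if the answer repeats AND the calibrated margin exceeds the threshold."""
    if prev_answer is None:
        # First round: there is no previous answer to repeat, so keep retrieving.
        return False
    repeated = normalize_answer(prev_answer) == normalize_answer(curr_answer)
    return repeated and calibrated_margin > threshold


# Hypothetical rounds: same answer with different surface forms, margin above 0.25.
print(should_stop("The Eiffel Tower", "eiffel tower.", 0.31))
# Same answer, but the margin is too low to stop.
print(should_stop("Paris", "Paris", 0.20))
```

Both conditions act as a conjunction: a repeated answer with a low margin keeps retrieving, and so does a confident answer that differs from the previous round.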

That pattern holds across nine open-domain BM25 cells: 55.01 F1 at 2.98 calls versus 54.33 at 3.00 for fixed-k=3. On nine dense-retrieval cells spanning two retriever families, zero significant regressions appear. The threshold 0.25 was not tuned per task; it was fixed once and never touched.

Why Logit Margins Beat Verbalized Confidence

The authors expose why verbalized confidence fails on RLHF-tuned models: 96.5% of values equal 5, giving an entropy of just 0.182 nats. Logit margins achieve 44x better class-conditional separation. That gap is measurable, reproducible, and grounded in a concrete model pathology.
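To see why such a skewed distribution carries so little signal, the sketch below bounds the Shannon entropy of a confidence distribution with 96.5% of its mass on a single value. It assumes a five-point scale (1 to 5), which the article does not state, and the split of the remaining 3.5% is hypothetical: all of it on one other value gives the lower bound, an even spread over the four other values gives the upper bound.

```python
import math


def entropy_nats(probs):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


P_MODE = 0.965       # share of verbalized confidences equal to 5 (from the article)
rest = 1.0 - P_MODE  # remaining mass; how it is distributed is an assumption

# Lower bound: all remaining mass on one other value.
lower = entropy_nats([P_MODE, rest])
# Upper bound on an assumed 5-point scale: remaining mass spread evenly over 4 values.
upper = entropy_nats([P_MODE] + [rest / 4] * 4)
# Entropy of a uniform distribution over 5 values, for comparison.
max_entropy = math.log(5)

print(f"lower={lower:.3f} upper={upper:.3f} max={max_entropy:.3f}")
```

Under these assumptions the bounds come out at roughly 0.15 and 0.20 nats, which brackets the reported 0.182 and sits far below the roughly 1.61 nats a uniform five-point scale would carry.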

TASR was selected from an exhaustive enumeration of 381 candidate stopping rules. No alternative Pareto-dominates it on any evaluated configuration. That is a strong claim: among hundreds of candidate predicates, none is at least as good on both F1 and call count while being strictly better on one.
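Pareto dominance over the two axes used here (higher F1, fewer calls) is simple to check. The sketch below is illustrative only: the `Result` type and helper names are hypothetical, and the two data points are the open-domain BM25 figures quoted earlier in the article, not the full set of 381 candidates.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    f1: float     # higher is better
    calls: float  # lower is better


def dominates(a: Result, b: Result) -> bool:
    """a dominates b if it is no worse on both axes and strictly better on one."""
    no_worse = a.f1 >= b.f1 and a.calls <= b.calls
    strictly_better = a.f1 > b.f1 or a.calls < b.calls
    return no_worse and strictly_better


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep every result that no other result dominates."""
    return [
        r for r in results
        if not any(dominates(other, r) for other in results if other is not r)
    ]


tasr = Result("TASR", f1=55.01, calls=2.98)
fixed_k3 = Result("fixed-k=3", f1=54.33, calls=3.00)

print(dominates(tasr, fixed_k3))
print([r.name for r in pareto_front([tasr, fixed_k3])])
```

On these two points TASR is better on both axes, so it dominates fixed-k=3 and is the only member of the front. A rule that traded F1 for fewer calls would sit on the front alongside it rather than dominate it.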

A Pareto Baseline for Future Controllers

TASR does not claim to be optimal; it provides an auditable, training-free Pareto baseline. Any learned stopping controller that cannot beat this one-line rule on both F1 and call count is not worth the training cost. Code is public for reproduction.


Source: TASR: Training-Free Adaptive Stopping for Iterative Retrieval
Domain: arxiv.org
