The authors of this paper reveal a critical yet under-explored phenomenon in LLMs: tool overuse. They experimentally elucidate its underlying mechanisms through two key lenses. First, they analyze tool-use behavior across regions of differing internal knowledge availability and identify a knowledge epistemic illusion: models misjudge their internal knowledge boundaries and fail to accurately perceive what they actually know. To mitigate this, they propose a knowledge-aware epistemic boundary alignment strategy based on direct preference optimization (DPO), which reduces tool usage by 82.8% while also improving accuracy. Second, they establish a causal link between reward structures and tool-use behavior by visualizing the tool-augmented training process. The visualization reveals that outcome-only rewards inadvertently encourage tool overuse: they reward final correctness alone, regardless of tool efficiency. To verify this, they balance reward signals during training instead of relying on outcome-only rewards, cutting unnecessary tool calls by 66.7% (7B model) and 60.7% (32B model) without sacrificing accuracy. Finally, they provide a theoretical account of tool overuse, offering clear practical takeaways for AI researchers.
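The reward-balancing idea can be illustrated with a minimal sketch. This is a hypothetical formulation, not the paper's exact reward design: an outcome-only reward pays out on final correctness alone, while a balanced reward subtracts a small per-tool-call penalty, so a rollout that reaches the correct answer with fewer tool calls scores higher. The function name and penalty weight below are illustrative assumptions.

```python
def balanced_reward(correct: bool, num_tool_calls: int,
                    outcome_weight: float = 1.0,
                    tool_penalty: float = 0.1) -> float:
    """Hypothetical balanced reward for tool-augmented RL training.

    An outcome-only reward would return just `outcome_weight * correct`,
    making the policy indifferent to how many tools it invoked. The
    penalty term (weight is an illustrative choice) makes unnecessary
    tool calls costly while still prioritizing final correctness.
    """
    outcome = outcome_weight if correct else 0.0
    return outcome - tool_penalty * num_tool_calls
```

Under this shaping, a correct answer with zero tool calls scores 1.0, while the same correct answer reached via three tool calls scores only 0.7, nudging the policy toward internal knowledge when tools are unnecessary.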
Source: The Tool-Overuse Illusion: Why Does LLM Prefer External Tools over Internal Knowledge?