Standard LLM auditing, whether single-output inspection or static metrics, fails to capture biases hiding in lower-probability generation branches. That blind spot is the entire reason TreeTracer exists.
Stochastic Paths Forced Into a Shared Structure
TreeTracer replaces ontology-defined terms in a prompt (for example, swapping gendered pronouns or role nouns), then runs hundreds of stochastic generations for each variant. Those generations are folded into a syntax-aligned hierarchical tree through parsing and classification-aware node merging with an auxiliary language model. The result is rendered as a custom Sankey diagram that lets you compare two trees side by side, for example one prompt context against another.
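To make the pipeline concrete, here is a minimal sketch of the substitute, sample, and aggregate loop. It is not the authors' implementation: the `ONTOLOGY` table, the `toy_sample` stand-in for a real model, and every function name are assumptions made for illustration, and a plain prefix tree replaces the paper's syntax-aligned, classification-aware merging.

```python
import random

# Hypothetical ontology: each slot maps to the terms swapped into the prompt.
ONTOLOGY = {"PRONOUN": ["he", "she"]}

def make_variants(template, slot):
    """Return one prompt per ontology term, keyed by that term."""
    return {term: template.replace("{" + slot + "}", term)
            for term in ONTOLOGY[slot]}

def toy_sample(prompt, rng):
    """Stand-in for one stochastic LLM generation (NOT a real model).

    The skew is hard-coded so the two variants produce visibly different trees.
    """
    skew = 0.7 if " he " in f" {prompt} " else 0.4
    first = "worked" if rng.random() < skew else "stayed"
    second = rng.choice(["late", "home"])
    return [first, second]

def build_tree(samples):
    """Fold token sequences into a prefix tree.

    Each node is {"count": int, "children": {token: node}}; sequences that
    share a prefix share the corresponding nodes.
    """
    root = {}
    for tokens in samples:
        level = root
        for tok in tokens:
            node = level.setdefault(tok, {"count": 0, "children": {}})
            node["count"] += 1
            level = node["children"]
    return root

def trace(template, slot, n=300, seed=0):
    """Sample n generations per prompt variant and build one tree per variant."""
    rng = random.Random(seed)
    trees = {}
    for term, prompt in make_variants(template, slot).items():
        samples = [toy_sample(prompt, rng) for _ in range(n)]
        trees[term] = build_tree(samples)
    return trees

trees = trace("The doctor said {PRONOUN} would", "PRONOUN")
for term, tree in trees.items():
    # First-level branch counts, i.e. the widths a Sankey view would draw.
    print(term, {tok: node["count"] for tok, node in tree.items()})
```

The per-node counts are what a side-by-side flow diagram would render as branch widths. In a real audit, `toy_sample` would be replaced by sampling from the model under test.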
The authors validate the workspace against an unaligned baseline model, GPT-2 XL, and the constitutionally aligned Apertus models. Two case studies stand out: counterfactual pronoun suppression (the model avoids certain pronoun pairings in specific contexts) and conversational marginalization (the model systematically downweights tokens associated with certain groups).
Contrastive Inference Keeps You Honest
Any single visualization only reflects a subset of learned behavior, so TreeTracer adds contrastive inference. It computes and directly displays counterfactual token probabilities across contexts, reducing the risk of misreading a sparse branch as bias when it's just sampling noise. A preliminary user study suggests that the aggregated comparative interface cuts cognitive load and helps analysts spot systemic biases faster than poking through raw logits.
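The idea of comparing token probabilities across contexts can be sketched in a few lines. This is an illustrative assumption about how such a comparison might be computed, not the paper's method: the logits are made up, the vocabulary is three tokens, and `contrast` is a hypothetical helper name.

```python
import math

def softmax(logits):
    """Convert a {token: logit} dict into a {token: probability} dict."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    z = sum(exps.values())
    return {tok: e / z for tok, e in exps.items()}

def contrast(logits_a, logits_b):
    """Per-token (p_a, p_b, log-ratio) for two contexts sharing a vocabulary."""
    if set(logits_a) != set(logits_b):
        raise ValueError("both contexts must score the same tokens")
    pa, pb = softmax(logits_a), softmax(logits_b)
    return {tok: (pa[tok], pb[tok], math.log(pa[tok] / pb[tok])) for tok in pa}

# Made-up next-token logits for two counterfactual prompts (illustration only).
logits_he = {"worked": 2.0, "stayed": 1.0, "left": 0.0}
logits_she = {"worked": 1.0, "stayed": 2.0, "left": 0.0}

rows = contrast(logits_he, logits_she)
# Largest absolute log-ratio first: these tokens diverge most between contexts.
for tok, (p_a, p_b, lr) in sorted(rows.items(), key=lambda kv: -abs(kv[1][2])):
    print(f"{tok:8s} p(he)={p_a:.3f} p(she)={p_b:.3f} log-ratio={lr:+.3f}")
```

Because the probabilities come from the model's distribution rather than from sampled counts, a branch that appeared only a few times in one tree can be checked directly against its probability in the other context.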
This kind of tool matters because we're past the point where a few cherry-picked examples tell the story. TreeTracer won't replace red-teaming, but it gives auditors a way to see the shape of bias across the full probability distribution, not just the surface.
Source: Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation
Domain: arxiv.org