TreeTracer's Sankey Diagrams Expose Hidden LLM Bias Through Stochastic Paths

A new visual analytics tool aggregates hundreds of stochastic generations into syntax-aligned trees, revealing biases like pronoun suppression that single-output audits miss.

Tags: TreeTracer, GPT-2 XL, Apertus, LLM bias, visual analytics, model auditing

Standard LLM auditing, whether single-output inspection or static metrics, fails to capture biases hiding in lower-probability generation branches. That blind spot is the entire reason TreeTracer exists.

Stochastic Paths Forced Into a Shared Structure

TreeTracer replaces ontology-defined terms in prompts (e.g., swapping gendered pronouns or role nouns), then runs hundreds of stochastic generations for each variant. Those generations are folded into a syntax-aligned hierarchical tree through parsing and classification-aware node merging with an auxiliary language model. The result is a custom Sankey diagram that lets you compare two trees side by side, for example one prompt context against another.
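To make the aggregation idea concrete, here is a minimal Python sketch. It is not TreeTracer's implementation: it merges generations on plain whitespace-token prefixes, with no syntactic parsing and no auxiliary-model node merging, and the helper names (`make_variants`, `aggregate_paths`, `sankey_links`), the prompt template, and the sample strings are all invented for illustration.

```python
from collections import Counter

def make_variants(template, slot, terms):
    """Fill one placeholder slot with each term from a (hypothetical) ontology."""
    return {term: template.replace("{" + slot + "}", term) for term in terms}

def aggregate_paths(generations):
    """Count every (parent_prefix, child_prefix) edge across generations.

    Nodes are whitespace-token prefixes, so generations that open the same
    way share a branch. This is a simplification of syntax-aligned merging.
    """
    edges = Counter()
    for text in generations:
        prefix = ()
        for token in text.split():
            child = prefix + (token,)
            edges[(prefix, child)] += 1
            prefix = child
    return edges

def sankey_links(edges):
    """Turn edge counts into (source_label, target_label, value) rows."""
    return [
        (" ".join(src) or "<root>", " ".join(dst), n)
        for (src, dst), n in sorted(edges.items())
    ]

# Assumed template and ontology terms, for illustration only.
variants = make_variants("The {role} said that", "role", ["nurse", "engineer"])

# Stand-in samples; a real run would draw hundreds per variant from the model.
samples = ["she was tired", "she was late", "he was tired", "they left"]
edges = aggregate_paths(samples)
links = sankey_links(edges)
```

Each row in `links` maps onto one ribbon of a Sankey diagram, with the count as its width. Building one set of links per prompt variant gives the two sides of a side-by-side comparison.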

The authors validate the workspace against an unaligned baseline model, GPT-2 XL, and the constitutionally aligned Apertus models. Two case studies stand out: counterfactual pronoun suppression (the model avoids certain pronoun pairings in specific contexts) and conversational marginalization (the model systematically downweights tokens associated with certain groups).

Contrastive Inference Keeps You Honest

Any single visualization reflects only a subset of learned behavior, so TreeTracer adds contrastive inference. It computes and directly displays counterfactual token probabilities across contexts, reducing the risk of misreading a sparse branch as bias when it's just noise. A preliminary user study suggests that the aggregated comparative interface reduces cognitive load and helps analysts spot systemic biases faster than poking through raw logits.
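A rough Python sketch shows why comparing contexts helps separate signal from noise. It is an assumption-laden stand-in, not the paper's method: it estimates probabilities from sampled branch counts with add-alpha smoothing instead of reading them from the model, and the function name, the `min_support` threshold, and the counts are all hypothetical.

```python
import math

def contrastive_log_ratios(counts_a, counts_b, alpha=1.0, min_support=5):
    """Compare next-token frequencies between two prompt contexts.

    counts_a / counts_b map token -> number of sampled generations that took
    that branch. Returns token -> (log_ratio, supported), where supported is
    False when too few samples back the comparison to trust it.
    """
    vocab = set(counts_a) | set(counts_b)
    # Add-alpha smoothing keeps unseen tokens from producing log(0).
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    out = {}
    for tok in vocab:
        a = counts_a.get(tok, 0)
        b = counts_b.get(tok, 0)
        p_a = (a + alpha) / total_a
        p_b = (b + alpha) / total_b
        out[tok] = (math.log(p_a / p_b), (a + b) >= min_support)
    return out

# Invented counts for two contexts, 100 samples each.
ctx_a = {"she": 60, "he": 35, "they": 5}
ctx_b = {"she": 20, "he": 75, "they": 4, "ze": 1}
result = contrastive_log_ratios(ctx_a, ctx_b)
```

A positive log ratio means the token is more likely in the first context, a negative one the reverse. The `supported` flag marks branches such as `"ze"` here, where a single sample is too little evidence to call a difference bias.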

This kind of tool matters because we're past the point where a few cherry-picked examples tell the story. TreeTracer won't replace red-teaming, but it gives auditors a way to see the shape of bias across the full probability distribution, not just the surface.


Source: Exposing the Unsaid: Visualizing Hidden LLM Bias through Stochastic Path Aggregation
Domain: arxiv.org
