Wayfinder Routes LLM Queries Without a Single Model Call

Wayfinder Router decides in microseconds which prompts go to a local model vs. the cloud, and it never calls another model to make that decision.

No model call to decide the route. That's the headline. Most routers—RouteLLM, NotDiamond, OpenRouter's Auto mode—delegate the routing decision to a trained classifier, an LLM judge, or a hosted API. Every one of those adds latency, cost, and a bit of randomness to the very step meant to save you money. Wayfinder reads prompt structure (length, headings, lists, code blocks) plus lexical difficulty cues like proofs, math, and constraints, and assigns a deterministic score. The recommendation is free and identical every time you present the same prompt.

Structural Scoring, Not Black-Box Classification

Wayfinder's core is a pure structural heuristic: it parses a prompt's shape without any model inference. The lexical cues (proofs, math, constraints) are shipped opt-in because a double-blind test on independently-authored prompts showed the lexical lift doesn't generalize—it catches roughly 20% of unseen hard prompts and loses to a plain word-count baseline. The developers made that call explicitly: raise lexical weights only after calibrating on your own traffic vocabulary.

That honesty matters. The benchmark (make benchmark) runs against RouterBench and RouterArena, and the FAQ candidly notes that Wayfinder is no better than random on RouterBench's short-but-hard items. If a prompt's difficulty is purely semantic—a subtle code snippet or an innocent-looking "what is the 100th prime number?"—structural routing won't catch it. The team leads with the edge that survives the blind test: deterministic, sub-millisecond, offline decisions with zero model calls.

Zero-Setup Demo and Open-Source Calibration

You can try it without installing anything: uvx wayfinder-router chat --dry-run runs a terminal chat that shows every turn's routing decision (● LOCAL / ◆ CLOUD), the structural score, and running cost savings vs. always-cloud. A web UI with a live threshold slider is one pip install and a wayfinder-router webchat --dry-run away. Both surfaces show the feature breakdown and cost saved for every message.

To get real replies, wayfinder-router init scaffolds [gateway.models] for any OpenAI-compatible API endpoint. Pair a local Ollama model with GPT-4o, or run two cloud tiers. No per-provider code, no SDK—just a base_url, model name, and environment key. The whole thing is MIT-licensed and lives in a single GitHub repo at itsthelore/wayfinder-router.

With calibration on your own traffic, Wayfinder becomes a lean cost-saver that eliminates the latency and randomness of judge-model routers.

Source: Wayfinder Router: deterministic routing of queries between local and hosted LLM
Domain: github.com

Wayfinder Routes LLM Queries Without a Single Model Call

Structural Scoring, Not Black-Box Classification

Zero-Setup Demo and Open-Source Calibration

More in Developer Tools