Compiled Once, Differentiated Everywhere: Scheme Interpretation That Backpropagates Through Programs

Across 171 recursive and higher-order program-seed pairs, gradients taken through an interpreter matched direct compilation to numerical precision. That is the headline result for a new compiler that turns a self-hosting subset of Scheme into differentiable computation graphs for autograd backends.

I've seen plenty of attempts to blur the line between program execution and gradient optimization. This one actually delivers by compiling a meta-circular interpreter: a Scheme evaluator written in Scheme, translated once into a differentiable graph. Feed it any program as data, and reverse-mode autodiff flows gradients to the continuous constants embedded in that program. No recompilation, no custom gradient machinery, and you keep closures, recursion, and data structures.

How DMCI Works: Compile Once, Differentiate Every Program

Standard autograd systems require the parameters being optimized to be part of the computation graph from the start. DMCI inverts that: the interpreter itself is the frozen graph, and the program is dynamic input. Because the Scheme subset is self-hosting (it can compile its own evaluator), the result is differentiable meta-circular interpretation (DMCI). The authors prove the gradients are correct almost everywhere. This is a formal guarantee that the reverse-mode derivatives through the interpreted program match what you'd get from compiling that program directly.

They didn't just prove it on paper. Across 171 recursive and higher-order program-seed pairs, the gradients computed through DMCI matched those from direct compilation to numerical precision. That's the kind of concrete validation that makes me trust the approach.
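To make the "program as data" idea concrete, here is a minimal Python sketch. It is not the authors' compiler: it assumes a toy Scheme-like language encoded as nested tuples, a hand-written scalar reverse-mode `Value` class, and a hypothetical recursive `power` function. The Python evaluator also builds its graph dynamically on each run rather than being compiled once into a static graph. The point is only that one fixed evaluator can pass gradients to constants embedded in whatever program it is handed, and that those gradients agree with the ones from writing the same computation directly.

```python
class Value:
    """Scalar node in a tiny reverse-mode autodiff graph (illustrative only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents          # upstream Value nodes
        self.local_grads = local_grads  # d(self)/d(parent), one per parent

    @staticmethod
    def lift(x):
        return x if isinstance(x, Value) else Value(x)

    def __add__(self, other):
        other = Value.lift(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = Value.lift(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def __sub__(self, other):
        return self + Value.lift(other) * -1.0

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every upstream node."""
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v.parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):       # children before parents
            for p, g in zip(v.parents, v.local_grads):
                p.grad += g * v.grad


def evaluate(expr, env):
    """One fixed evaluator; the program arrives as nested tuples (data)."""
    if isinstance(expr, Value):         # trainable constant embedded in the program
        return expr
    if isinstance(expr, (int, float)):  # ordinary numeric literal
        return Value(expr)
    if isinstance(expr, str):           # variable reference
        return env[expr]
    op, *args = expr
    if op == "if":                      # branch on the sign of the test value
        test, then, alt = args
        return evaluate(then if evaluate(test, env).data > 0 else alt, env)
    if op == "lambda":                  # closure over the defining environment
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    fn = evaluate(op, env)
    return fn(*(evaluate(a, env) for a in args))


GLOBAL_ENV = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
# (define power (lambda (x n) (if n (* x (power x (- n 1))) 1)))
GLOBAL_ENV["power"] = evaluate(
    ("lambda", ("x", "n"), ("if", "n", ("*", "x", ("power", "x", ("-", "n", 1))), 1)),
    GLOBAL_ENV,
)


def interpreted_grads(a0, x0):
    """Gradients of a * x^3 obtained by interpreting a recursive program."""
    a, x = Value(a0), Value(x0)
    program = ("*", a, ("power", x, 3))  # a and x are constants inside the program
    out = evaluate(program, GLOBAL_ENV)
    out.backward()
    return out.data, a.grad, x.grad


def direct_grads(a0, x0):
    """The same computation written directly, bypassing the interpreter."""
    a, x = Value(a0), Value(x0)
    out = a * x * x * x
    out.backward()
    return out.data, a.grad, x.grad


print(interpreted_grads(2.0, 1.5))
print(direct_grads(2.0, 1.5))
```

Swapping in a different `program` tuple requires no change to `evaluate` or to the autodiff class, which is the property the article describes, here in miniature.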

Program-and-Parameter Co-Search with LLMs and Gradients

Here's where it gets practical. The authors combine DMCI with a large language model that proposes Scheme programs. An outer loop generates discrete program structures; DMCI supplies exact gradients to calibrate each candidate's continuous parameters through the single frozen interpreter. This is OpenEvolve-style search without the hand-rolled differentiation.
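The shape of that loop can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `CANDIDATES` dictionary is a hard-coded stand-in for LLM proposals, `DATA` is synthetic (generated from 2x² + 1), the programs are simple arithmetic expression trees, and the inner loop is plain gradient descent on mean squared error. It is not the authors' search procedure, only the outer-discrete, inner-continuous structure described above.

```python
class Value:
    """Scalar node in a tiny reverse-mode autodiff graph (illustrative only)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self.parents = parents
        self.local_grads = local_grads

    @staticmethod
    def lift(x):
        return x if isinstance(x, Value) else Value(x)

    def __add__(self, other):
        other = Value.lift(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = Value.lift(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def __sub__(self, other):
        return self + Value.lift(other) * -1.0

    def backward(self):
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v.parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for p, g in zip(v.parents, v.local_grads):
                p.grad += g * v.grad


PRIMS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def evaluate(expr, env):
    """Single fixed evaluator shared by every candidate program."""
    if isinstance(expr, Value):
        return expr
    if isinstance(expr, (int, float)):
        return Value(expr)
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    return env[op](*(evaluate(a, env) for a in args))


# Hypothetical stand-ins for LLM-proposed structures; c holds trainable constants.
CANDIDATES = {
    "linear": lambda c: ("+", ("*", c[0], "x"), c[1]),
    "quadratic": lambda c: ("+", ("*", c[0], ("*", "x", "x")), c[1]),
}

# Synthetic data from y = 2x^2 + 1 (an assumption for this sketch).
DATA = [(x, 2.0 * x * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]


def mean_squared_error(program, data):
    total = Value(0.0)
    for x, y in data:
        err = evaluate(program, {**PRIMS, "x": Value(x)}) - y
        total = total + err * err
    return total * (1.0 / len(data))


def calibrate(template, data, steps=500, lr=0.05):
    """Inner loop: fit a candidate's constants by gradient descent."""
    consts = [Value(0.0), Value(0.0)]
    program = template(consts)
    for _ in range(steps):
        for c in consts:
            c.grad = 0.0
        mean_squared_error(program, data).backward()
        for c in consts:
            c.data -= lr * c.grad
    return mean_squared_error(program, data).data, [c.data for c in consts]


def co_search(candidates, data):
    """Outer loop: calibrate every proposed structure, keep the best fit."""
    results = {name: calibrate(t, data) for name, t in candidates.items()}
    best = min(results, key=lambda name: results[name][0])
    return best, results


best, results = co_search(CANDIDATES, DATA)
print(best, results[best])
```

This toy selects on training loss for brevity; a realistic search would score candidates on held-out data, as the battery experiment described in the next paragraph does.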

On battery capacity-fade data, the search recovered a knee-like degradation structure and improved held-out extrapolation over hand-crafted baselines on the harder early-extrapolation split, matching them on the later split. On a high-dimensional El Niño inverse problem, DMCI optimized an interpreted Kalman-filter likelihood where gradient-free search completely failed.

These results extend symbolic regression and neurosymbolic search from closed-form expressions to executable, stateful programs. Model-generated code is now directly optimizable against data — compile once, differentiate everywhere.

Source: Compile Once, Differentiate Everywhere: A Differentiable Meta-Circular Interpreter
Domain: arxiv.org
