23 microjoules per image. That's the energy cost of running a generative model on an oscillator-based analog platform, two orders of magnitude below digital baselines, with a 27.6 FID on MNIST.
The Expressivity Gap Between Physics and Software
Analog hardware like coupled oscillators and Ising machines solves differential equations at a fraction of the energy digital hardware needs. But those equations are fixed by physics, not software. Modern generative models expect flexible dynamics -- backpropagation, attention, arbitrary layers. That fundamental mismatch has kept analog accelerators on the sidelines for generative AI.
The authors of the Analog Interaction Systems (AIS) paper characterize this gap empirically. Their key finding: a simple oscillator network without training tricks generates outputs that look like static, with FID scores of 100 or more on MNIST. Prior work tried to approximate standard layers with analog circuits, but that burned the energy advantage.
Closing the Gap With Time-Varying Parameters and Hidden States
The AIS framework introduces two mechanisms that don't violate hardware constraints. First, time-varying piecewise parameters: you can slowly modulate coupling strengths or natural frequencies during inference, as long as the change is piecewise-constant and slow compared to oscillator dynamics. Second, hidden physical states: some oscillators are never read out; they serve as a reservoir to increase effective capacity.
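To make the two mechanisms concrete, here is a minimal sketch, assuming Kuramoto-style phase oscillators integrated with a plain Euler step. The oscillator model, the function names (`simulate`, `generate`), the cosine readout, and the random coupling values are all illustrative assumptions and not the paper's actual circuit or trained parameters. It shows coupling matrices that are held constant within each segment and switched between segments, and hidden oscillators that evolve with the rest of the network but are never read out.

```python
import numpy as np

def simulate(theta0, omega, couplings, steps_per_segment, dt=0.01):
    """Euler-integrate phase oscillators under a piecewise-constant coupling schedule.

    couplings has shape (n_segments, n, n); each matrix is frozen for
    steps_per_segment integration steps before the next one takes over.
    """
    theta = np.array(theta0, dtype=float)
    for K in couplings:                       # one coupling matrix per segment
        for _ in range(steps_per_segment):    # K does not change inside a segment
            diff = theta[None, :] - theta[:, None]   # diff[i, j] = theta_j - theta_i
            theta = theta + dt * (omega + (K * np.sin(diff)).sum(axis=1))
    return theta

def generate(latent_seed, omega, couplings, n_visible, steps_per_segment=200):
    """Map a random initial state to an output, reading only the visible oscillators."""
    rng = np.random.default_rng(latent_seed)
    theta0 = rng.uniform(0.0, 2.0 * np.pi, size=omega.shape[0])  # latent noise
    theta = simulate(theta0, omega, couplings, steps_per_segment)
    return np.cos(theta[:n_visible])          # hidden oscillators are discarded here

# Hypothetical sizes and untrained random parameters, for illustration only.
n_visible, n_hidden, n_segments = 4, 4, 3
n = n_visible + n_hidden
param_rng = np.random.default_rng(0)
omega = np.zeros(n)
couplings = param_rng.normal(0.0, 0.5, size=(n_segments, n, n))
sample = generate(1, omega, couplings, n_visible)
```

In this sketch the only quantities a training procedure could adjust are the per-segment coupling matrices and the natural frequencies; the form of the differential equation itself never changes, which is the hardware constraint the paragraph above describes.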
Combined with a Wasserstein GAN training objective that doesn't require the model to follow a specific trajectory, the system learns to generate coherent images. The result: FID 27.6 on MNIST and 80.8 on Fashion-MNIST. That's 3-4x better than any previous analog generative model operating under realistic hardware constraints.
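The standard Wasserstein GAN losses can be sketched in a few lines. This is the textbook formulation, assuming a critic that outputs one scalar score per sample; the function names are hypothetical, and the paper's exact variant (for example how the critic's Lipschitz constraint is enforced) is not stated in this summary. The point to notice is that only scores of final samples enter the loss, so nothing prescribes the path the physical system takes to reach them.

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Critic minimizes E[D(fake)] - E[D(real)], i.e. it widens the score gap."""
    return float(np.mean(scores_fake) - np.mean(scores_real))

def generator_loss(scores_fake):
    """Generator minimizes -E[D(fake)], i.e. it raises the critic's score on its samples."""
    return float(-np.mean(scores_fake))

# Toy critic scores, for illustration only.
c_loss = critic_loss(np.array([1.0, 2.0, 3.0]), np.array([0.0, 0.0, 0.0]))
g_loss = generator_loss(np.array([0.5, 1.5]))
```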
Why 4-Bit Sparse Architectures Matter for Practical Analog AI
The paper doesn't just show numbers; it gives a realistic bill of materials. Sparse connectivity and low-bit-width quantized parameters (4-bit) are necessary to keep area and power manageable. For their chosen architecture, total energy per generated image comes to 23 µJ. Compare that to digital GPU inference -- typically around 2-3 mJ for a small generator -- and you're looking at a roughly 100x reduction.
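As a rough illustration of what sparse, 4-bit parameters mean in practice, the sketch below applies a connectivity mask and a symmetric uniform quantizer to a coupling matrix, then checks the energy ratio quoted above. The quantization scheme, the 25% density, and the names `quantize_4bit` and `energy_ratio` are assumptions for illustration; the paper's actual scheme may differ.

```python
import numpy as np

def quantize_4bit(weights, mask):
    """Zero out masked-off connections, then round to signed 4-bit codes in [-8, 7]."""
    w = np.where(mask, weights, 0.0)          # sparse connectivity
    max_abs = np.max(np.abs(w))
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return codes, codes * scale               # integer codes and dequantized values

def energy_ratio(digital_joules, analog_joules):
    """How many times more energy the digital baseline uses per image."""
    return digital_joules / analog_joules

rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))
mask = rng.random((8, 8)) < 0.25              # hypothetical 25% connection density
codes, dequantized = quantize_4bit(weights, mask)

low = energy_ratio(2e-3, 23e-6)               # 2 mJ versus 23 µJ
high = energy_ratio(3e-3, 23e-6)              # 3 mJ versus 23 µJ
```

With the article's figures the ratio falls between about 87x and 130x, which is where the "two orders of magnitude" summary comes from.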
Of course, these numbers come with caveats: MNIST and Fashion-MNIST are toy datasets. The real test will be something like CIFAR-10 or higher-resolution images. But the framework is architecture-agnostic. The same AIS principles apply to other physical substrates like optical or memristor arrays.
With the expressivity gap quantified and two concrete mechanisms to close it, analog generative AI finally has a roadmap beyond physics-fixed toy problems. Next step: scaling the architecture to larger images while keeping that 23 µJ-per-image target.
Source: Generative Models on Analog Hardware with Dynamics
Domain: arxiv.org