
Undocumented Direct Route to Apple Neural Engine Mapped from A11 to M5

A reverse-engineering effort documents a direct user-space interface to the ANE, bypassing Core ML, across every Apple chip generation from A11 through M5, with detailed roofline and dispatch analysis.


A new reverse-engineering report documents the Apple Neural Engine's internals across every chip generation from the A11 to the A18 and M1 to M5, including a direct user-space invocation path that Apple never documented.

What the ANE Actually Does and How It's Structured

The Apple Neural Engine is a fixed-function matrix accelerator shipped in every iPhone, iPad, and Mac SoC since the A11. The report reverse-engineers the datapath and roofline that bound its throughput and energy, compiling per-chip target tables and an operation-by-device matrix. Direct measurements come from M1 and M5 hardware; claims are labeled as measured, decompile-derived, or predicted. The analysis covers the private runtime, compiler, kernel driver, firmware, and the on-disk program format, including the weight-compression scheme.
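The roofline idea mentioned above can be illustrated with a minimal sketch of the standard roofline model: attainable throughput is the lesser of the compute ceiling and memory bandwidth multiplied by arithmetic intensity. The function names and the peak and bandwidth figures below are hypothetical placeholders chosen for round arithmetic; they are not values from the report and do not describe any real ANE generation.

```python
def attainable_tops(peak_tops: float, bandwidth_gbs: float, ops_per_byte: float) -> float:
    """Standard roofline bound: min(compute ceiling, bandwidth * arithmetic intensity)."""
    # GB/s * ops/byte gives Gops/s; divide by 1000 to express it in Tops/s.
    memory_bound_tops = bandwidth_gbs * ops_per_byte / 1000.0
    return min(peak_tops, memory_bound_tops)


def ridge_point(peak_tops: float, bandwidth_gbs: float) -> float:
    """Arithmetic intensity (ops/byte) at which a kernel stops being memory-bound."""
    return peak_tops * 1000.0 / bandwidth_gbs


# Hypothetical placeholder figures, NOT taken from the report.
PEAK_TOPS = 10.0       # assumed compute ceiling
BANDWIDTH_GBS = 100.0  # assumed sustained memory bandwidth

if __name__ == "__main__":
    print(f"ridge point: {ridge_point(PEAK_TOPS, BANDWIDTH_GBS):.0f} ops/byte")
    for intensity in (10.0, 100.0, 1000.0):
        bound = attainable_tops(PEAK_TOPS, BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge_point(PEAK_TOPS, BANDWIDTH_GBS) else "compute-bound"
        print(f"{intensity:7.0f} ops/byte -> at most {bound:.1f} TOPS ({regime})")
```

Under these placeholder figures, a kernel performing 10 operations per byte moved is limited by bandwidth, while one performing 1000 operations per byte is limited by the compute ceiling. Replacing the two constants with measured per-chip values is what turns the sketch into an actual roofline for a given device.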

The Undocumented Direct Path and Why It Matters

Core ML is Apple's supported framework for the ANE, but the report describes a direct dispatch route below Core ML that is callable from ordinary user space. That path is undocumented, unsupported, and version-fragile. The report explicitly cautions that it is intended for measurement, research, and on-device work, not for shipping software. For anyone doing low-level performance analysis or custom neural network deployment on Apple silicon, the path is a gold mine, provided it stays out of shipped products.

Measured Performance and Open Questions

The report documents the dispatch route, command protocol, and static analysis of the kernel driver and firmware. Per-chip performance characteristics are provided for A11 through A18 and M1 through M5, with the direct measurements on M1 and M5 serving as ground truth. Open questions and methodology are recorded, giving future researchers a clear starting point. Anyone working on on-device AI optimization now has a detailed reference for Apple's neural accelerator, down to the undocumented bytes.


Source: Apple Neural Engine: Architecture, Programming, and Performance
Domain: arxiv.org
