ANE is an open-source framework for training neural networks directly on Apple's Neural Engine hardware, bypassing Apple's public Core ML toolchain through reverse-engineered private APIs. It provides low-level control over the Neural Engine: developers can compile custom compute graphs into binary kernels, partition transformer model layers into hardware-compatible subgraphs, and share GPU-allocated memory with the Neural Engine via zero-copy IOSurface buffers.
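As an illustration of what partitioning layers into subgraphs can look like, here is a minimal sketch in plain Python. It is not the framework's partitioner: the function name `partition_layers`, the per-layer "cost" numbers, and the single `budget` limit are all hypothetical stand-ins for whatever constraints the hardware actually imposes. The sketch groups consecutive layers greedily so that each group stays within the assumed budget.

```python
def partition_layers(layer_costs, budget):
    """Greedily group consecutive layers so each group's total cost fits the budget.

    layer_costs: one abstract cost per layer (hypothetical unit).
    budget: assumed per-subgraph limit (hypothetical; not a documented ANE limit).
    Returns a list of groups, each a list of layer indices.
    """
    groups, current, used = [], [], 0
    for index, cost in enumerate(layer_costs):
        if cost > budget:
            raise ValueError(f"layer {index} alone exceeds the budget")
        # Close the current group when adding this layer would overflow it.
        if current and used + cost > budget:
            groups.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        groups.append(current)
    return groups


# Five layers with made-up costs, split under a made-up budget of 8.
subgraphs = partition_layers([4, 4, 3, 5, 2], budget=8)
```

A greedy pass keeps layers in their original order, which matters when each subgraph feeds the next; a real partitioner would likely weigh several constraints at once rather than a single scalar cost.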
The framework also exposes hardware performance counters and power telemetry for benchmarking throughput and energy consumption, and includes a quantization pass that converts weights and activations to INT8 precision to reduce memory bandwidth. A checkpoint-based compile bypass serializes compiled kernel state to disk, so training can resume without recompiling and without running into hardware compile-time limits.
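To make the INT8 conversion concrete, the sketch below shows symmetric per-tensor quantization in plain Python. This is one common scheme, chosen here for illustration; the project's own pass may use a different one (per-channel scales, zero points, or calibration data). The names `quantize_int8` and `dequantize_int8` are hypothetical and are not part of the framework's API.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Returns the quantized integers and the scale needed to recover floats.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # A tensor of all zeros (or an empty one) gets a neutral scale of 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from quantized integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
```

Each stored value shrinks from four bytes (FP32) to one, which is where the bandwidth saving comes from; the cost is a rounding error of at most half a quantization step per value.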
Together, these tools support benchmarking custom compute graphs, quantizing models to INT8, and training transformer models end-to-end on the Neural Engine. The project's documentation covers installation and usage of these capabilities through its reverse-engineered API bindings.