SHIFT Framework Relocates Compute, Beats Wafer-Scale LLM Services 5.9x

SHIFT moves running compute contexts to better chiplet locations, achieving up to 12.5x throughput and 58.3% energy-per-bit savings on synthetic workloads, plus 4.9x faster runtime than wafer-scale LLM services.

Tags: shift, chiplet architecture, dynamic compute relocation, large language models, networks on chip, machine learning

A chiplet architecture that relocates entire compute contexts—not just data—across bandwidth domains achieved LLM runtime improvements of 4.9x, throughput gains of 5.9x, and energy-efficiency boosts of 1.8x over state-of-the-art wafer-scale services.

Those numbers come from SHIFT, a dynamic compute relocation framework proposed in a new arXiv paper. SHIFT treats chiplet systems as a collection of functional, memory, and utility chiplets connected by multi-layered routing. Instead of only shuffling data when a computation is far from the data it needs, SHIFT moves the entire compute node context—registers, state, instructions—to a better-positioned utility chiplet.

Topology-Agnostic Compute Migration

SHIFT is topology-agnostic: it works on any chiplet arrangement, routing with a modified shortest-path algorithm and an ML-assisted policy that infers traffic patterns to lighten the routing work. The utility chiplets act as intelligent waypoints that can accept a relocated compute context, execute it, and send results onward. This turns the network itself into an active compute fabric.
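
To make the routing idea concrete, here is a minimal sketch. It is not the paper's algorithm: it uses plain Dijkstra shortest paths over a hypothetical chiplet graph and picks the utility chiplet that minimizes a made-up traffic cost (operand bytes times the distance from memory, plus result bytes times the distance to the destination). The graph, link costs, chiplet names, cost model, and the `pick_relocation_target` helper are all assumptions for illustration; SHIFT's actual modifications and its ML-assisted policy are not reproduced.

```python
import heapq

INF = float("inf")

def shortest_costs(graph, start):
    """Plain Dijkstra: cheapest path cost from start to every reachable node."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, INF):
            continue  # stale heap entry, a cheaper path was already found
        for nxt, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nxt, INF):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def pick_relocation_target(graph, memory, destination, utilities,
                           operand_bytes, result_bytes):
    """Hypothetical policy: choose the utility chiplet with the lowest
    operand_bytes * cost(memory -> u) + result_bytes * cost(u -> destination)."""
    from_memory = shortest_costs(graph, memory)
    best, best_cost = None, INF
    for u in utilities:
        to_dest = shortest_costs(graph, u).get(destination, INF)
        total = operand_bytes * from_memory.get(u, INF) + result_bytes * to_dest
        if total < best_cost:
            best, best_cost = u, total
    return best, best_cost

# Hypothetical 4-chiplet system; edge weights are abstract link costs.
graph = {
    "mem0":  {"util0": 1, "util1": 4},
    "util0": {"mem0": 1, "func0": 5, "util1": 2},
    "util1": {"mem0": 4, "util0": 2, "func0": 1},
    "func0": {"util0": 5, "util1": 1},
}

target, cost = pick_relocation_target(
    graph, "mem0", "func0", ["util0", "util1"],
    operand_bytes=64, result_bytes=8,
)
print(target, cost)
```

Because the operands are much larger than the result in this toy setup, the sketch favors the utility chiplet nearest the memory, which mirrors the article's point that moving the computation can be cheaper than moving the data.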

On random instruction vectors and data patterns, SHIFT achieved relocation success rates from 75.2% to 97.9% across configurations. Average latency improvements ranged from 16.4% to 62.5%, with a maximum of 76.8%. Throughput increased up to 12.5x, power dissipation per unit area dropped ~8%, and energy-per-bit fell up to 58.3%.
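
As a reading aid, the percentage figures can be converted into equivalent factors so they sit on the same scale as the "x" figures. This sketch assumes each percentage is a relative reduction against the baseline (new = old × (1 − p)), which the article does not state explicitly; the helper name `reduction_to_factor` is made up for illustration.

```python
def reduction_to_factor(percent_reduction):
    """Convert a relative reduction (e.g. 62.5 for 62.5%) into a baseline/new ratio."""
    if not 0 <= percent_reduction < 100:
        raise ValueError("reduction must be in [0, 100)")
    return 1.0 / (1.0 - percent_reduction / 100.0)

# Figures quoted in the paragraph above, read as relative reductions (an assumption).
reported = [
    ("average latency, low end", 16.4),
    ("average latency, high end", 62.5),
    ("maximum latency", 76.8),
    ("energy per bit", 58.3),
]

for label, pct in reported:
    print(f"{label}: {pct}% lower is about {reduction_to_factor(pct):.2f}x")
```

Under that assumption, the 62.5% average latency improvement corresponds to roughly a 2.67x reduction, since 1 / (1 − 0.625) = 1 / 0.375.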

Real Workloads: LLMs and Wafer-Scale Comparison

The authors didn't stop at synthetic benchmarks. They also ran standard LLM workloads, which stress the system with high logic and data density. Compared to wafer-scale LLM services, SHIFT delivered 4.9x faster runtime, 5.9x higher throughput, and 1.8x better energy efficiency. That beats existing approaches that rely solely on data movement optimization.

What this enables next: a practical path to building large-scale heterogeneous chiplet systems where compute location is fluid, not fixed. SHIFT suggests the future of chiplet interconnects isn't just faster wires—it's smarter compute placement, with the network deciding where to run, not just where to send bits.


Source: SHIFT: Dynamic Compute Relocation Framework for Communication-Aware Chiplet-Based Systems
Domain: arxiv.org
