FCGraft Cuts Policy Synthesis Latency 2.3x While Boosting Success Rate 18%

FCGraft reuses cached function-level KV states from CodeLLMs to generate robust robot policies without redundant computation, beating prompt-level caching by 18.31% in task success.


FCGraft delivers 2.3x faster policy synthesis and an 18.31% higher task success rate than RAGCache by grafting validated code skeletons instead of regenerating them from scratch.

Why CodeLLMs Struggle in Open-Domain Robotics

Code-writing LLMs generate executable policies from natural language goals and environmental constraints. Two problems plague generation in open domains: slow prefill computation over long prompts, and fully generative decoding that introduces API mismatches, missing safety guards, and unstable control logic. Repetitive decoding wastes time; fragile outputs waste test cycles.

How FCGraft Grafts Instead of Generates

FCGraft maintains a library of function-level validated code skeletons and their associated Transformer key-value (KV) caches. Given a new task, it retrieves relevant functions and performs cache grafting via two operations: stitching composes cached function segments into a composite policy, and patching locally adapts only necessary code regions for task-specific parameters. Redundant prefill disappears. Validated control structures stay intact.
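To make the retrieve, stitch, and patch steps concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the paper's implementation: the names CachedFunction, retrieve, stitch, patch, and synthesize are hypothetical, retrieval is plain keyword overlap, the KV caches are stand-in strings rather than tensors, and the robot methods inside the skeletons are invented for illustration.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class CachedFunction:
    """A validated code skeleton plus a stand-in for its cached KV states."""
    name: str
    keywords: frozenset
    skeleton: str    # code with $placeholders for task-specific parameters
    kv_cache: tuple  # placeholder strings; a real system would store tensors


# Hypothetical library; the robot API inside the skeletons is invented.
LIBRARY = [
    CachedFunction(
        name="move_to",
        keywords=frozenset({"move", "go", "navigate"}),
        skeleton=("def move_to(robot):\n"
                  "    if robot.path_clear():\n"
                  "        robot.move($target)"),
        kv_cache=("kv:move_to:0", "kv:move_to:1"),
    ),
    CachedFunction(
        name="grasp",
        keywords=frozenset({"grasp", "pick", "grab"}),
        skeleton=("def grasp(robot):\n"
                  "    if robot.gripper_ready():\n"
                  "        robot.close_gripper(force=$force)"),
        kv_cache=("kv:grasp:0",),
    ),
    CachedFunction(
        name="pour",
        keywords=frozenset({"pour", "fill"}),
        skeleton="def pour(robot):\n    robot.tilt(angle=$angle)",
        kv_cache=("kv:pour:0",),
    ),
]


def retrieve(library, task_words, top_k=2):
    """Rank entries by keyword overlap with the task; drop non-matches."""
    scored = [(len(f.keywords & task_words), f) for f in library]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
    return [f for _, f in ranked[:top_k]]


def stitch(functions):
    """Compose cached segments: join skeletons and reuse their KV entries."""
    code = "\n\n".join(f.skeleton for f in functions)
    kv = tuple(entry for f in functions for entry in f.kv_cache)
    return code, kv


def patch(code, params):
    """Fill only the task-specific placeholders; other code stays untouched."""
    return Template(code).substitute(params)


def synthesize(task, params):
    """Retrieve relevant functions, stitch them, then patch in parameters."""
    task_words = frozenset(task.lower().split())
    code, kv = stitch(retrieve(LIBRARY, task_words))
    return patch(code, params), kv


policy, reused_kv = synthesize(
    "go to the shelf and pick the cup",
    {"target": "'shelf'", "force": 5},
)
print(policy)
```

In this toy version the guard conditions in each skeleton survive unchanged, and only the $target and $force placeholders are rewritten. A real system would also have to handle how cached KV states behave once segments are concatenated, which this sketch ignores.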

What 2.3x Latency Reduction Means for Embodied Agents

Compared against RAGCache, a prompt-level caching method, FCGraft cuts policy synthesis latency by 2.3x and lifts task success rate by 18.31%. For real-time robotics, cutting generation time by more than half while improving robustness translates directly to safer, more responsive behavior. The cache-grafting paradigm may well extend beyond robotics to any domain where LLMs produce structured, safety-critical code under tight latency budgets.
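As a quick check on what the 2.3x figure implies, the short Python calculation below converts a speedup factor into the fraction of baseline latency that remains. It assumes that "2.3x" means baseline latency divided by FCGraft latency; the function name and the 10-second baseline are hypothetical and not taken from the paper.

```python
def remaining_fraction(speedup):
    """Fraction of baseline latency left after a given speedup factor."""
    return 1.0 / speedup


speedup = 2.3  # reported factor, read as baseline latency / new latency
remaining = remaining_fraction(speedup)
saved = 1.0 - remaining

# Hypothetical baseline, chosen only to make the numbers tangible.
baseline_seconds = 10.0
new_seconds = baseline_seconds * remaining

print(f"{remaining:.1%} of baseline latency remains, {saved:.1%} is saved")
print(f"{baseline_seconds:.1f} s would become {new_seconds:.2f} s")
```

Under that reading, roughly 43.5% of the original latency remains, so the saving is somewhat better than a plain halving.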

Source: Functional Cache Grafting: Robust and Rapid Code-Policy Synthesis for Embodied Agents
Domain: arxiv.org
