FCGraft cuts policy synthesis latency by 2.3x while raising success rate by 18%

FCGraft reuses cached function-level KV states from CodeLLMs to generate robust robot policies without redundant computation, beating prompt-level caching by 18.31% in task success.

fcgraft, code llms, embodied agents, kv cache, large language models, robotics

FCGraft delivers 2.3x faster policy synthesis and an 18.31% higher task success rate than RAGCache by grafting validated code skeletons instead of regenerating them from scratch.

Why CodeLLMs Struggle in Open-Domain Robotics

Code-writing LLMs generate executable policies from natural language goals and environmental constraints. Two problems plague generation in open domains: slow prefill computation over long prompts, and fully generative decoding that introduces API mismatches, missing safety guards, and unstable control logic. Repetitive decoding wastes time; fragile outputs waste test cycles.

How FCGraft Grafts Instead of Generates

FCGraft maintains a library of function-level validated code skeletons and their associated Transformer key-value (KV) caches. Given a new task, it retrieves relevant functions and performs cache grafting via two operations: stitching composes cached function segments into a composite policy, and patching locally adapts only necessary code regions for task-specific parameters. Redundant prefill disappears. Validated control structures stay intact.
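The stitch-then-patch flow described above can be illustrated with a toy sketch. This is not FCGraft's implementation: `CachedFunction`, `make_entry`, `stitch`, `patch`, the `$speed` placeholder, and the robot method names are all hypothetical, and the "KV cache" is a tuple of opaque per-token entries standing in for real Transformer key-value tensors. It also ignores how real KV states depend on position and preceding context.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class CachedFunction:
    """A validated skeleton plus a stand-in for its precomputed KV cache."""
    name: str
    skeleton: str    # code with $placeholders for task-specific values
    kv_cache: tuple  # hypothetical: one opaque entry per skeleton token


def make_entry(name: str, skeleton: str) -> CachedFunction:
    # Stand-in "prefill": a real system would run the model once and keep KV tensors.
    tokens = skeleton.split()
    return CachedFunction(name, skeleton, tuple((name, i) for i in range(len(tokens))))


LIBRARY = {
    f.name: f
    for f in (
        make_entry(
            "move_to",
            "def move_to(robot, target):\n    robot.goto(target, speed=$speed)\n",
        ),
        make_entry(
            "grasp",
            "def grasp(robot, obj):\n    if robot.is_safe():\n        robot.close_gripper(obj)\n",
        ),
    )
}


def stitch(names):
    """Compose cached segments in order; no segment is prefilled again."""
    parts = [LIBRARY[n] for n in names]
    source = "\n".join(p.skeleton for p in parts)
    cache = tuple(entry for p in parts for entry in p.kv_cache)
    return parts, source, cache


def patch(parts, params):
    """Fill task-specific placeholders and report which segments were touched."""
    patched, touched = [], []
    for p in parts:
        text = Template(p.skeleton).safe_substitute(params)
        if text != p.skeleton:
            touched.append(p.name)  # only these segments would need fresh computation
        patched.append(text)
    return "\n".join(patched), touched


parts, source, cache = stitch(["move_to", "grasp"])
policy, touched = patch(parts, {"speed": "0.2"})
print(touched)  # ['move_to']
```

In this sketch the `grasp` segment, including its safety guard, passes through unchanged, which mirrors the claim that validated control structures stay intact while only parameter-bearing regions are adapted.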

What 2.3x Latency Reduction Means for Embodied Agents

Compared against RAGCache, a prompt-level caching method, FCGraft cuts policy synthesis latency by 2.3x and lifts task success rate by 18.31%. For real-time robotics, cutting generation time by more than half while improving robustness translates directly to safer, more responsive behavior. The cache-grafting paradigm will likely migrate beyond robotics into any domain where LLMs produce structured, safety-critical code under tight latency budgets.
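As a quick arithmetic check on what a 2.3x reduction implies, the snippet below assumes "2.3x" means baseline latency divided by 2.3. The 1000 ms baseline is a made-up figure for illustration only, not a number from the source.

```python
speedup = 2.3  # reported latency reduction factor versus RAGCache

remaining_fraction = 1 / speedup         # share of baseline latency that remains
saved_fraction = 1 - remaining_fraction  # share of baseline latency removed

baseline_ms = 1000.0  # hypothetical baseline, for illustration only
print(f"{remaining_fraction:.1%} of baseline remains")       # 43.5% of baseline remains
print(f"{saved_fraction:.1%} saved")                         # 56.5% saved
print(f"{baseline_ms * remaining_fraction:.0f} ms per synthesis")  # 435 ms per synthesis
```

Under that reading, synthesis takes a little under half the baseline time.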


Source: Functional Cache Grafting: Robust and Rapid Code-Policy Synthesis for Embodied Agents
Domain: arxiv.org
