GPT-4o alone fixes only 54.3% of complex Verilog bugs on NVIDIA's CVDP benchmark. With VeriPilot, that number jumps to 85.71%.
That 31-percentage-point gain comes from a simple insight most end-to-end LLM debuggers miss: the root cause of a bug is often far from the failing test output. VeriPilot, described in a new arXiv paper, doesn't just feed compiler errors back to the LLM and hope for the best.
Golden models, not black-box outputs
VeriPilot relies on a golden reference model: essentially a correct but possibly non-synthesizable implementation of the same circuit. Instead of comparing only final outputs, the framework uses LLM-based analysis to align internal variables between the buggy Verilog and the golden model by what they mean. That lets it pinpoint where the signals first diverge.
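To make the divergence idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the LLM has already produced a name mapping between signals in the buggy design and variables in the golden model (the hypothetical `alignment` dict), and that both sides have been simulated into per-cycle value traces. All names and trace values are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

Trace = Dict[str, List[int]]  # signal name -> value at each cycle


def first_divergence(
    dut: Trace, golden: Trace, alignment: Dict[str, str]
) -> Optional[Tuple[int, str]]:
    """Return (cycle, dut_signal) for the earliest mismatch, or None."""
    earliest: Optional[Tuple[int, str]] = None
    for dut_name, golden_name in alignment.items():
        pairs = zip(dut[dut_name], golden[golden_name])
        for cycle, (got, want) in enumerate(pairs):
            if got != want:
                if earliest is None or cycle < earliest[0]:
                    earliest = (cycle, dut_name)
                break  # only the first mismatch per signal matters
    return earliest


# Hypothetical traces: the bug lives in the internal `cnt` register,
# but it only becomes visible on the output `out` one cycle later.
dut = {"cnt": [0, 1, 2, 2, 3], "out": [0, 0, 0, 0, 1]}
golden = {"count": [0, 1, 2, 3, 4], "result": [0, 0, 0, 0, 0]}
alignment = {"cnt": "count", "out": "result"}

divergence = first_divergence(dut, golden, alignment)
print(divergence)
```

Comparing only `out` against `result` would flag cycle 4; comparing the aligned internal signals flags `cnt` at cycle 3, one step closer to the root cause.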
Once the divergence point is known, VeriPilot builds Control-Data-Flow Graphs (CDFGs) from static analysis and traces the signal path step by step. The output is a minimal set of suspicious code regions plus the correct corresponding logic from the golden model. The LLM gets structured, localized context instead of a wall of source code.
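The tracing step can be sketched roughly as a backward walk over signal dependencies. This is an illustrative simplification under stated assumptions: a plain data-dependency dict stands in for a full CDFG, and `DEPS`, `ASSIGN_LINES`, and `backward_slice` are hypothetical names, not part of VeriPilot's code.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical data-dependency edges: signal -> signals it is computed from
DEPS: Dict[str, List[str]] = {
    "out": ["cnt", "en"],
    "cnt": ["cnt", "en", "rst"],
    "en": [],
    "rst": [],
}

# Hypothetical map from each signal to the source lines that assign it
ASSIGN_LINES: Dict[str, List[int]] = {
    "out": [21],
    "cnt": [12, 14],
    "en": [],
    "rst": [],
}


def backward_slice(
    start: str, deps: Dict[str, List[str]], diverged: Set[str]
) -> List[str]:
    """Walk the drivers of `start` breadth-first, following only diverged signals."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        sig = queue.popleft()
        for driver in deps.get(sig, []):
            if driver in diverged and driver not in seen:
                seen.add(driver)
                order.append(driver)
                queue.append(driver)
    return order


# Signals whose values disagree with the golden model (assumed known)
suspects = backward_slice("out", DEPS, diverged={"out", "cnt"})
suspect_lines = sorted(line for s in suspects for line in ASSIGN_LINES[s])
print(suspects, suspect_lines)
```

Pruning drivers that still match the golden model is what keeps the suspicious region small: `en` and `rst` are never visited, so only the lines assigning `out` and `cnt` would be handed to the LLM.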
Results that speak for themselves
On the CVDP benchmark suite from NVIDIA, GPT-4o with VeriPilot achieves an 85.71% repair success rate, up from 54.3% for GPT-4o alone. The paper doesn't report F1 scores for bug localization, so localization quality is not measured directly, but a repair improvement this large suggests the localization is doing real work.
The authors released the source code and benchmark on GitHub at https://github.com/YihanWn/VeriPilot.git. If you write hardware in Verilog or maintain EDA tooling, that repo is worth a clone before the weekend.
The next step is obvious: apply the same golden-model-guided signal tracing to other HDLs such as VHDL, or to SystemVerilog assertions. If CDFG-based debugging generalizes beyond Verilog, the hardware verification cycle could get a lot shorter.
Source: VeriPilot: An LLM-Powered Verilog Debugging Framework
Domain: arxiv.org