EstRTL's static functional estimation raises RTL code correctness by 3.2% to 9.0%

Existing LLM-based RTL code generators neglect functional correctness. EstRTL's three-stage framework with static scoring improves correctness by 3.2% to 9.0% on generic LLMs.

estrtl, rtl code generation, large language models, hardware design, functional verification, llm agents

LLMs writing RTL code are garbage at functional correctness. They'll generate something that compiles, but will it actually drive the right signals on a real chip? The answer has been "who knows"—until now.

EstRTL, a new framework from an anonymous academic team, adds a static functional score estimation loop to the RTL generation pipeline. Instead of just fine-tuning models or stacking retrieval tricks, they built a three-stage agent: Generation, Estimation, Correction. The key idea is an estimation agent that reads the generated RTL and scores it against a human-readable requirements spec before deciding what to do next.

The Static Score Changes the Game

Most prior work treats RTL generation as a translation problem (spec in, Verilog out) with no real check on semantics. EstRTL's estimation agent returns a quantitative score and a list of requirement comparisons. If the score is low, the code goes back for regeneration. If it's borderline, the correction agent takes over. If it passes, out it goes.
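The three-way routing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 0.0 to 1.0 score scale, the two thresholds, the `max_rounds` budget, and every name (`Estimate`, `route`, `run_pipeline`) are assumptions made for the example, and the three agents are passed in as plain callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Estimate:
    score: float            # assumed scale 0.0-1.0; the article gives no scale
    comparisons: List[str]  # human-readable requirement-by-requirement notes


# Hypothetical cutoffs; the article does not state the real thresholds.
REGENERATE_BELOW = 0.4
ACCEPT_AT_OR_ABOVE = 0.8


def route(score: float) -> str:
    """Map a static score to the next pipeline action."""
    if score < REGENERATE_BELOW:
        return "regenerate"   # low score: throw the code away and retry
    if score < ACCEPT_AT_OR_ABOVE:
        return "correct"      # borderline: hand over to the correction agent
    return "accept"           # passes: emit the code


def run_pipeline(
    spec: str,
    generate: Callable[[str], str],
    estimate: Callable[[str, str], Estimate],
    correct: Callable[[str, str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Generation -> Estimation -> (Regeneration | Correction) loop."""
    code = generate(spec)
    for _ in range(max_rounds):
        est = estimate(spec, code)
        action = route(est.score)
        if action == "accept":
            return code
        if action == "regenerate":
            code = generate(spec)
        else:
            code = correct(spec, code, est.comparisons)
    # Round budget exhausted: return the latest candidate, which was not re-scored.
    return code
```

In a real setup each callable would wrap an LLM call. The routing itself stays a few lines of ordinary control flow, which is part of why the approach can sit on top of different models.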

Because the estimation is static, it needs no simulation, no testbench orchestration, and no simulator runtime. It's a lightweight functional lint on steroids. The framework is model-agnostic and works with any LLM designed for RTL generation. Experiments show correctness gains of 3.2% to 9.0% across generic LLMs, with no special fine-tuning required.

Real Transparency for AI Hardware Design

Beyond the numbers, the human-readable requirement comparisons change how engineers trust the output. You don't get a black-box RTL blob; you get a scorecard explaining what the generated code does and doesn't satisfy. That alone makes the framework useful for design review, even if the correction agent's edits need manual verification.
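To make the scorecard idea concrete, here is a small sketch of what a requirement-by-requirement report might look like as data. The article does not describe the actual output format or how the score is computed, so the `RequirementCheck` fields, the PASS/FAIL layout, and the fraction-satisfied score are all assumptions for illustration. In EstRTL the checks would come from the estimation agent rather than being written by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequirementCheck:
    requirement: str  # one line taken from the spec
    satisfied: bool   # the estimator's verdict for this requirement
    note: str         # short explanation a reviewer can read


def fraction_satisfied(checks: List[RequirementCheck]) -> float:
    """Assumed scoring rule: share of requirements judged satisfied."""
    if not checks:
        return 0.0
    return sum(c.satisfied for c in checks) / len(checks)


def render_scorecard(checks: List[RequirementCheck]) -> str:
    """Format the checks as a plain-text scorecard for design review."""
    lines = [
        f"[{'PASS' if c.satisfied else 'FAIL'}] {c.requirement}: {c.note}"
        for c in checks
    ]
    lines.append(f"score: {fraction_satisfied(checks):.2f}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        RequirementCheck("reset clears counter", True, "async reset found"),
        RequirementCheck("counter wraps at 9", False, "wraps at 15"),
    ]
    print(render_scorecard(demo))
```

A reviewer reading the FAIL lines knows where to look first, which is the practical value of the comparison list even when the corrected code still needs manual verification.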

The authors open-sourced the code and experimental results at an anonymous repo (https://anonymous.4open.science/status/EstRTL-E200/). If you're working on LLM-assisted hardware design, this is the first tool I've seen that actually tries to verify intent—not just syntax.


Source: EstRTL: Functional Estimation Guided RTL Code Generation
Domain: arxiv.org
