MAGE: A Multi-Agent Engine for Automated RTL Code Generation
- URL: http://arxiv.org/abs/2412.07822v1
- Date: Tue, 10 Dec 2024 21:53:55 GMT
- Title: MAGE: A Multi-Agent Engine for Automated RTL Code Generation
- Authors: Yujie Zhao, Hejia Zhang, Hanxian Huang, Zhongming Yu, Jishen Zhao
- Abstract summary: MAGE is the first open-source multi-agent AI system designed for robust and accurate Verilog RTL code generation. MAGE achieves a 95.7% rate of syntactically and functionally correct code generation on the VerilogEval-Human 2 benchmark.
- Score: 5.899673582879575
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The automatic generation of RTL code (e.g., Verilog) through natural language instructions has emerged as a promising direction with the advancement of large language models (LLMs). However, producing RTL code that is both syntactically and functionally correct remains a significant challenge. Existing single-LLM-agent approaches face substantial limitations because they must navigate between various programming languages and handle intricate generation, verification, and modification tasks. To address these challenges, this paper introduces MAGE, the first open-source multi-agent AI system designed for robust and accurate Verilog RTL code generation. We propose a novel high-temperature RTL candidate sampling and debugging system that effectively explores the space of code candidates and significantly improves the quality of the candidates. Furthermore, we design a novel Verilog-state checkpoint checking mechanism that enables early detection of functional errors and delivers precise feedback for targeted fixes, significantly enhancing the functional correctness of the generated RTL code. MAGE achieves a 95.7% rate of syntactically and functionally correct code generation on the VerilogEval-Human 2 benchmark, surpassing the state-of-the-art Claude-3.5-sonnet by 23.3%, demonstrating a robust and reliable approach for AI-driven RTL design workflows.
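The idea of sampling several candidates at high temperature and keeping the best-scoring one can be illustrated with a toy sketch. This is not MAGE's implementation: the candidate pool, the `generate` and `score` functions, and the XOR target are all hypothetical stand-ins for an LLM sampler and a testbench-based scorer.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical candidate pool standing in for LLM outputs: each Verilog
# snippet is paired with a Python model of its behavior so it can be scored.
CANDIDATES: Dict[str, Callable[[int, int], int]] = {
    "assign y = a & b;": lambda a, b: a & b,
    "assign y = a | b;": lambda a, b: a | b,
    "assign y = a ^ b;": lambda a, b: a ^ b,
}
GREEDY = "assign y = a & b;"  # what the toy "model" emits at temperature 0


def generate(temperature: float, rng: random.Random) -> str:
    """Toy sampler: a higher temperature deviates from GREEDY more often."""
    if rng.random() < temperature:
        return rng.choice(list(CANDIDATES))
    return GREEDY


def score(rtl: str) -> float:
    """Fraction of truth-table rows on which the candidate matches XOR."""
    model = CANDIDATES[rtl]
    rows = [(a, b) for a in (0, 1) for b in (0, 1)]
    return sum(model(a, b) == (a ^ b) for a, b in rows) / len(rows)


def sample_and_rank(n: int, temperature: float, seed: int = 0) -> List[Tuple[float, str]]:
    """Draw n candidates, score each, and return them best-first."""
    rng = random.Random(seed)
    scored = [(score(c), c) for c in (generate(temperature, rng) for _ in range(n))]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

In this toy setup, `sample_and_rank(5, 0.0)` only ever returns the greedy candidate, while a higher temperature lets other candidates enter the ranking. In a real flow the scorer would be a simulator run against a testbench rather than a Python truth table.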
Related papers
- IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraints generation to improve the instruction-following capabilities of Code LLMs. The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z) - Function-to-Style Guidance of LLMs for Code Translation [59.487054943812836]
We propose F2STrans, a function-to-style guiding paradigm designed to improve the performance of large language models in code translation. Our approach comprises two key stages: (1) Functional learning, which optimizes translation correctness using high-quality source-target code pairs. We introduce a novel code translation benchmark that includes up-to-date source code, extensive test cases, and manually annotated ground-truth translations.
arXiv Detail & Related papers (2025-07-15T08:25:02Z) - Grammar-Guided Evolutionary Search for Discrete Prompt Optimisation [63.97051732013936]
We propose an evolutionary search approach to automated discrete prompt optimisation consisting of two phases. In the first phase, grammar-guided genetic programming is invoked to synthesise prompt-creating programmes. In the second phase, local search is applied to explore the neighbourhoods of best-performing programmes.
arXiv Detail & Related papers (2025-07-14T14:34:15Z) - Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z) - RTL++: Graph-enhanced LLM for RTL Code Generation [0.0]
Traditional register transfer level (RTL) design methods are manual, time-consuming, and prone to errors. Open-source models offer alternatives; however, they frequently fall short in quality/correctness. This paper proposes RTL++, a first-of-its-kind LLM-assisted method for RTL code generation.
arXiv Detail & Related papers (2025-05-11T00:17:26Z) - VeriMind: Agentic LLM for Automated Verilog Generation with a Novel Evaluation Metric [4.590930025882158]
We propose VeriMind, an agentic LLM framework for Verilog code generation.
We introduce a novel evaluation metric, pass@ARC, which combines the conventional pass@k measure with Average Refinement Cycles (ARC) to capture both success rate and the efficiency of iterative refinement.
Experimental results on diverse hardware design tasks demonstrated that our approach achieved up to 8.3% improvement on the pass@k metric and 8.1% on the pass@ARC metric.
arXiv Detail & Related papers (2025-03-15T23:43:06Z) - RTLSquad: Multi-Agent Based Interpretable RTL Design [3.1734541757969463]
Large Language Models (LLMs) offer new approaches for automatic RTL code generation and optimization.
To address this, we propose RTLSquad, a novel LLM-Based Multi-Agent system for interpretable RTL code generation.
arXiv Detail & Related papers (2025-01-06T02:57:54Z) - EDA-Aware RTL Generation with Large Language Models [0.7831852829409273]
Large Language Models (LLMs) have become increasingly popular for generating RTL code.
However, producing error-free RTL code in a zero-shot setting remains highly challenging for even state-of-the-art LLMs.
We introduce AIvril2, a self-verifying, LLM-agnostic agentic framework aimed at enhancing RTL code generation through iterative corrections of both syntax and functional errors.
arXiv Detail & Related papers (2024-11-21T00:37:51Z) - RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance [0.6062751776009752]
Large Language Models (LLMs) have shown incredible potential in code generation tasks.
LLMs can generate code based on task descriptions, but accuracy remains limited.
We introduce a novel architecture of LLM-based agents for code generation and automatic debugging: Refinement and Guidance debugger (RGD).
RGD decomposes the code generation task into multiple steps, ensuring a clearer workflow and enabling iterative code refinement based on self-reflection and feedback.
arXiv Detail & Related papers (2024-10-02T05:07:02Z) - AIvril: AI-Driven RTL Generation With Verification In-The-Loop [0.7831852829409273]
Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks.
This paper introduces AIvril, a framework designed to enhance the accuracy and reliability of RTL-aware LLMs.
arXiv Detail & Related papers (2024-09-03T15:07:11Z) - VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool [4.027984601764008]
We propose VerilogCoder, a system of multiple Artificial Intelligence (AI) agents for Verilog code generation.
The proposed methodology successfully generates 94.2% syntactically and functionally correct Verilog code.
arXiv Detail & Related papers (2024-08-15T20:06:06Z) - OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection [54.775409528658486]
OriGen is a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology.
Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets.
arXiv Detail & Related papers (2024-07-23T07:22:25Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models [54.14602121129874]
We introduce AutoIF, the first scalable and reliable method for automatically generating instruction-following training data.
AutoIF transforms the validation of instruction-following data quality into code verification.
arXiv Detail & Related papers (2024-06-19T13:29:53Z) - SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents.
We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z) - Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS [20.135906487081453]
We present an automated transformer decoding algorithm that integrates Monte Carlo tree-search for lookahead.
For the largest design generated by the state-of-the-art LLM (16-bit adder), our technique can achieve a 31.8% improvement in the area-delay product.
arXiv Detail & Related papers (2024-02-05T18:47:04Z) - StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks.
FGO only optimizes the model by masking the unexecuted code segments to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.