Related papers: CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair

URL: http://arxiv.org/abs/2409.12993v1
Date: Thu, 19 Sep 2024 12:15:55 GMT
Title: CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Authors: Mingjie Liu, Yun-Da Tsai, Wenfei Zhou, Haoxing Ren
Abstract summary: This paper first presents an analysis of fine-tuned LLMs on Verilog coding, with synthetic data from prior methods. We identify two main issues: difficulties in handling non-textual representations and significant variability during training with models randomly making "minor" mistakes. Our fine-tuned Starcoder2-15B outperforms prior state-of-the-art results by 3.8%, 10.9%, 6.6% for pass@1 on VerilogEval-Machine, VerilogEval-Human, and RTLLM.
Score: 4.554742043916029
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the significant progress made in code generation with large language models, challenges persist, especially with hardware description languages such as Verilog. This paper first presents an analysis of fine-tuned LLMs on Verilog coding, with synthetic data from prior methods. We identify two main issues: difficulties in handling non-textual representations (Karnaugh maps, state-transition diagrams and waveforms) and significant variability during training with models randomly making "minor" mistakes. To address these limitations, we enhance data curation by creating correct-by-construction data targeting non-textual representations. Additionally, we introduce an automated framework that generates error reports from various model checkpoints and injects these errors into open-source code to create targeted code repair data. Our fine-tuned Starcoder2-15B outperforms prior state-of-the-art results by 3.8%, 10.9%, 6.6% for pass@1 on VerilogEval-Machine, VerilogEval-Human, and RTLLM.

Related papers

Infinite-Instruct: Synthesizing Scaling Code instruction Data with Bidirectional Synthesis and Static Verification [9.332807762710127]
We introduce Infinite-Instruct, an automated framework for synthesizing high-quality question-answer pairs. The framework focuses on improving the internal logic of synthesized problems. A cross-lingual static code analysis pipeline filters invalid samples to ensure data quality.
arXiv Detail & Related papers (2025-05-29T07:14:43Z)
VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation [9.07044866283158]
We introduce VeriReason, a framework integrating supervised fine-tuning with Guided Reward Proximal Optimization (GRPO) reinforcement learning for RTL generation. On the VerilogEval Benchmark, VeriReason delivers 83.1% functional correctness, substantially outperforming both comparable-sized models and much larger commercial systems like GPT-4 Turbo. VeriReason represents the first system to successfully integrate explicit reasoning capabilities with reinforcement learning for Verilog generation, establishing a new state-of-the-art for automated RTL synthesis.
arXiv Detail & Related papers (2025-05-17T05:25:01Z)
Speculative Decoding for Verilog: Speed and Quality, All in One [14.64921497909531]
We introduce a novel application of speculative decoding for Verilog code generation. Unlike standard tokenization schemes, our approach aligns decoding stops with syntactically significant tokens. Our experimental results show that our method achieves up to a 5.05x speedup in Verilog code generation.
arXiv Detail & Related papers (2025-03-18T11:21:53Z)
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding [49.56049319037421]
KodCode is a synthetic dataset that addresses the persistent challenge of acquiring high-quality, verifiable training data. It comprises question-solution-test triplets that are systematically validated via a self-verification procedure. This pipeline yields a large-scale, robust and diverse coding dataset.
arXiv Detail & Related papers (2025-03-04T19:17:36Z)
Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, improvement is plateauing due to the exhaustion of readily available high-quality data. We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model [13.532046953850902]
We present DeepRTL, a unified representation model that excels in both Verilog understanding and generation. Based on CodeT5+, DeepRTL is fine-tuned on a comprehensive dataset that aligns Verilog code with rich, multi-level natural language descriptions. We introduce the first benchmark for Verilog understanding and take the initiative to apply embedding similarity and GPT Score to evaluate the models' understanding capabilities.
arXiv Detail & Related papers (2025-02-20T11:07:55Z)
RIRO: Reshaping Inputs, Refining Outputs Unlocking the Potential of Large Language Models in Data-Scarce Contexts [0.0]
Large language models (LLMs) have significantly advanced natural language processing, excelling in areas like text generation, summarization, and question-answering. Despite their capabilities, these models face challenges when fine-tuned on small, domain-specific datasets. We introduce RIRO, a novel two-layer architecture designed to improve performance in data-scarce environments.
arXiv Detail & Related papers (2024-12-15T15:48:37Z)
Contextualized Data-Wrangling Code Generation in Computational Notebooks [131.26365849822932]
We propose an automated approach, CoCoMine, to mine data-wrangling code generation examples with clear multi-modal contextual dependency. We construct CoCoNote, a dataset containing 58,221 examples for Contextualized Data-wrangling Code generation in Notebooks. Experiment results demonstrate the significance of incorporating data context in data-wrangling code generation.
arXiv Detail & Related papers (2024-09-20T14:49:51Z)
VerilogCoder: Autonomous Verilog Coding Agents with Graph-based Planning and Abstract Syntax Tree (AST)-based Waveform Tracing Tool [4.027984601764008]
We propose VerilogCoder, a system of multiple Artificial Intelligence (AI) agents for Verilog code generation. The proposed methodology successfully generates 94.2% syntactically and functionally correct Verilog code.
arXiv Detail & Related papers (2024-08-15T20:06:06Z)
Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework [50.02710905062184]
This paper proposes an automated design-data augmentation framework, which generates high-volume and high-quality natural language aligned with Verilog and EDA scripts. The accuracy of Verilog generation surpasses that of the current state-of-the-art open-source Verilog generation model, increasing from 58.8% to 70.6% on the same benchmark.
arXiv Detail & Related papers (2024-03-17T13:01:03Z)
Leveraging Print Debugging to Improve Code Generation in Large Language Models [63.63160583432348]
Large language models (LLMs) have made significant progress in code generation tasks, but their performance in tackling programming problems with complex data structures and algorithms remains suboptimal. We propose an in-context learning approach that guides LLMs to debug by using a "print debug" method.
arXiv Detail & Related papers (2024-01-10T18:37:59Z)
Neuron Patching: Semantic-based Neuron-level Language Model Repair for Code Generation [32.178931149612644]
Model Improvement via Neuron Targeting (MINT) is a novel approach for repairing code Language Models (LMs). MINT is effective, efficient, and reliable, capable of correcting a neural model by patching a minimum number of neurons.
arXiv Detail & Related papers (2023-12-08T20:28:08Z)
LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system. We build a novel data-cleaning pipeline that uses these principles to transform existing programs. We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z)
VerilogEval: Evaluating Large Language Models for Verilog Code Generation [6.88526119890374]
We present a comprehensive evaluation dataset consisting of 156 problems from the Verilog instructional website HDLBits. The evaluation set consists of a diverse set of Verilog code generation tasks, ranging from simple combinational circuits to complex finite state machines.
arXiv Detail & Related papers (2023-09-14T09:15:34Z)
VeriGen: A Large Language Model for Verilog Code Generation [22.837558083876743]
We fine-tune pre-existing Large Language Models (LLMs) on Verilog datasets compiled from GitHub and Verilog textbooks. Here, our fine-tuned open-source CodeGen-16B model outperforms the commercial state-of-the-art GPT-3.5-turbo model with a 1.1% overall increase. Notably, it demonstrates a 41% improvement in generating syntactically correct Verilog code across various problem categories compared to its pre-trained counterpart.
arXiv Detail & Related papers (2023-07-28T02:57:14Z)
Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation. We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
Benchmarking Large Language Models for Automated Verilog RTL Code Generation [21.747037230069854]
We characterize the ability of large language models (LLMs) to generate useful Verilog. We construct an evaluation framework comprising test-benches for functional analysis and a flow to test the syntax of Verilog code. Our findings show that, across our problem scenarios, fine-tuning results in LLMs that are more capable of producing syntactically correct code.
arXiv Detail & Related papers (2022-12-13T16:34:39Z)
Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users. We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.