VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation
- URL: http://arxiv.org/abs/2406.04379v1
- Date: Thu, 6 Jun 2024 00:06:50 GMT
- Title: VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation
- Authors: Prashanth Vijayaraghavan, Luyao Shi, Stefano Ambrogio, Charles Mackin, Apoorva Nitsure, David Beymer, Ehsan Degan
- Abstract summary: This paper introduces a comprehensive evaluation framework designed specifically for assessing the VHDL code generation task.
This dataset is constructed by translating a collection of Verilog evaluation problems to VHDL and aggregating publicly available VHDL problems, resulting in a total of 202 problems.
To assess the functional correctness of the generated VHDL code, we utilize a curated set of self-verifying testbenches.
- Score: 4.700008016247411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the unprecedented advancements in Large Language Models (LLMs), their application domains have expanded to include code generation tasks across various programming languages. While significant progress has been made in enhancing LLMs for popular programming languages, there exists a notable gap in comprehensive evaluation frameworks tailored for Hardware Description Languages (HDLs), particularly VHDL. This paper addresses this gap by introducing a comprehensive evaluation framework designed specifically for assessing LLM performance on the VHDL code generation task. We construct a dataset for evaluating LLMs on this task by translating a collection of Verilog evaluation problems to VHDL and aggregating publicly available VHDL problems, resulting in a total of 202 problems. To assess the functional correctness of the generated VHDL code, we utilize a curated set of self-verifying testbenches specifically designed for the aggregated VHDL problem set. We conduct an initial evaluation of different LLMs and their variants, including zero-shot code generation, in-context learning (ICL), and parameter-efficient fine-tuning (PEFT) methods. Our findings underscore the considerable challenges faced by existing LLMs in VHDL code generation, revealing significant scope for improvement. This study emphasizes the necessity of supervised fine-tuning of code generation models specifically for VHDL, offering potential benefits to VHDL designers seeking efficient code generation solutions.
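To make the self-verifying testbench idea concrete, below is a minimal sketch of what such a testbench can look like, here for a hypothetical 2-to-1 multiplexer problem. The DUT interface (entity mux2 with ports a, b, sel, y) and the test patterns are illustrative assumptions for this example, not items taken from the VHDL-Eval dataset; the general pattern is that the testbench drives stimuli and checks outputs with assert statements, so a simulation run passes or fails on its own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Self-verifying testbench sketch for a hypothetical 2-to-1 mux.
-- The DUT interface (entity mux2 with ports a, b, sel, y) is an
-- assumed example, not a problem taken from the VHDL-Eval dataset.
entity tb_mux2 is
end entity tb_mux2;

architecture sim of tb_mux2 is
  signal a, b, sel, y : std_logic := '0';
begin
  -- Device under test: the LLM-generated design, compiled into "work".
  dut : entity work.mux2
    port map (a => a, b => b, sel => sel, y => y);

  stimulus : process
    type pattern_t is record
      a, b, sel, y_expect : std_logic;
    end record;
    type pattern_array is array (natural range <>) of pattern_t;
    -- Exhaustive truth table for y = a when sel = '0', else b.
    constant patterns : pattern_array := (
      ('0', '0', '0', '0'), ('0', '1', '0', '0'),
      ('1', '0', '0', '1'), ('1', '1', '0', '1'),
      ('0', '0', '1', '0'), ('0', '1', '1', '1'),
      ('1', '0', '1', '0'), ('1', '1', '1', '1'));
  begin
    for i in patterns'range loop
      a   <= patterns(i).a;
      b   <= patterns(i).b;
      sel <= patterns(i).sel;
      wait for 10 ns;
      -- Self-verification: any mismatch fails the simulation run,
      -- so no golden output file or manual inspection is needed.
      assert y = patterns(i).y_expect
        report "Mismatch at pattern " & integer'image(i)
        severity error;
    end loop;
    report "All patterns checked." severity note;
    wait;
  end process stimulus;
end architecture sim;
```

An evaluation harness can then simulate each generated design against its testbench (e.g., with GHDL) and count a problem as solved only when no assertion fires.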
Related papers
- Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis [49.998130983414924]
Large language models (LLMs) can be employed for programming languages such as Python and C++.
This paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware design.
arXiv Detail & Related papers (2025-02-19T17:53:59Z)
- Case2Code: Scalable Synthetic Data for Code Generation [105.89741089673575]
Large Language Models (LLMs) have shown outstanding breakthroughs in code generation.
Recent work improves code LLMs by training on synthetic data generated by some powerful LLMs.
We propose a Case2Code task by exploiting the expressiveness and correctness of programs.
arXiv Detail & Related papers (2024-07-17T11:35:00Z)
- CodeV: Empowering LLMs for Verilog Generation through Multi-Level Summarization [37.4446786461791]
This paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs.
We show that CodeV surpasses the previous open-source SOTA by a relative 14.4% (BetterV on VerilogEval) and 11.3% (RTLCoder on RTLLM), respectively.
arXiv Detail & Related papers (2024-07-15T03:57:20Z)
- What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated than canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z)
- Classification-Based Automatic HDL Code Generation Using LLMs [9.630310313347657]
Large language models (LLMs) have demonstrated the ability to generate hardware description language (HDL) code for digital circuits.
LLMs suffer from the hallucination problem, which leads to the generation of incorrect HDL code or misunderstanding of specifications.
We introduce a human-expert-inspired method to mitigate the hallucination of LLMs and improve the performance in HDL code generation.
arXiv Detail & Related papers (2024-07-04T09:00:13Z)
- VersiCode: Towards Version-controllable Code Generation [58.82709231906735]
Large Language Models (LLMs) have made tremendous strides in code generation, but existing research fails to account for the dynamic nature of software development.
We propose two novel tasks aimed at bridging this gap: version-specific code completion (VSCC) and version-aware code migration (VACM).
We conduct an extensive evaluation on VersiCode, which reveals that version-controllable code generation is indeed a significant challenge.
arXiv Detail & Related papers (2024-06-11T16:15:06Z)
- HDLdebugger: Streamlining HDL debugging with Large Language Models [20.09481664579469]
In the domain of chip design, Hardware Description Languages (HDLs) play a pivotal role.
Despite the strong capabilities of Large Language Models (LLMs) in generating, completing, and inspecting software code, their utilization in the specialized field of HDL debugging has been limited.
We propose a framework, namely HDLdebugger, which consists of HDL data generation via a reverse engineering approach, a search engine for retrieval-augmented generation, and a retrieval-augmented LLM fine-tuning approach.
Our experiments, conducted on an HDL code dataset sourced from Huawei, reveal that HDLdebugger outperforms 13 cutting-edge LLM baselines.
arXiv Detail & Related papers (2024-03-18T11:19:37Z)
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks.
FGO optimizes the model only on the executed code segments, masking out unexecuted ones to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- VerilogEval: Evaluating Large Language Models for Verilog Code Generation [6.88526119890374]
We present a comprehensive evaluation dataset consisting of 156 problems from the Verilog instructional website HDLBits.
The evaluation set consists of a diverse set of Verilog code generation tasks, ranging from simple combinational circuits to complex finite state machines.
arXiv Detail & Related papers (2023-09-14T09:15:34Z)
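As background on scoring: VerilogEval, like HumanEval-style benchmarks generally, reports functional correctness with the unbiased pass@k estimator, computed from n sampled completions per problem, of which c pass the tests. The formula below is a reminder of that standard metric, not an equation quoted from the papers above:

```latex
% Unbiased pass@k estimator: n sampled completions per problem,
% c of which pass the problem's tests.
\[
  \text{pass@}k \;=\;
  \mathbb{E}_{\text{problems}}\!\left[\, 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \,\right]
\]
```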