Related papers: Automatic High-quality Verilog Assertion Generation through Subtask-Focused Fine-Tuned LLMs and Iterative Prompting

Automatic High-quality Verilog Assertion Generation through Subtask-Focused Fine-Tuned LLMs and Iterative Prompting

URL: http://arxiv.org/abs/2411.15442v1
Date: Sat, 23 Nov 2024 03:52:32 GMT
Title: Automatic High-quality Verilog Assertion Generation through Subtask-Focused Fine-Tuned LLMs and Iterative Prompting
Authors: Mohammad Shahidzadeh, Behnam Ghavami, Steve Wilton, Lesley Shannon,
Abstract summary: We present a large language model (LLM) -based flow to automatically generate high-quality SystemVerilog Assertions (SVA) We introduce a novel sub-task-focused fine-tuning approach, leading to a remarkable 7.3-fold increase in the number of functionally correct assertions. Experiments demonstrate a 26% increase in the number of assertions free from syntax errors using this approach.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Formal Property Verification (FPV), using SystemVerilog Assertions (SVA), is crucial for ensuring the completeness of design with respect to the specification. However, writing SVA is a laborious task and has a steep learning curve. In this work, we present a large language model (LLM) -based flow to automatically generate high-quality SVA from the design specification documents, named \ToolName. We introduce a novel sub-task-focused fine-tuning approach that effectively addresses functionally incorrect assertions produced by baseline LLMs, leading to a remarkable 7.3-fold increase in the number of functionally correct assertions. Recognizing the prevalence of syntax and semantic errors, we also developed an iterative refinement method that enhances the LLM's initial outputs by systematically re-prompting it to correct identified issues. This process is further strengthened by a custom compiler that generates meaningful error messages, guiding the LLM towards improved accuracy. The experiments demonstrate a 26\% increase in the number of assertions free from syntax errors using this approach, showcasing its potential to streamline the FPV process.

Related papers

INPROVF: Leveraging Large Language Models to Repair High-level Robot Controllers from Assumption Violations [33.6861334936808]
INPROVF is an automatic framework that combines large language models (LLMs) and formal methods to speed up the repair process of high-level robot controllers.
arXiv Detail & Related papers (2025-03-17T19:08:36Z)
LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff: a novel framework for Automatic Prompt Engineering (APE) LLMs-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback-akin to textual gradients. It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z)
Real-time Verification and Refinement of Language Model Text Generation [60.04718679054704]
Large language models (LLMs) have shown remarkable performance across a wide range of natural language tasks. A critical challenge remains in that they sometimes generate factually incorrect answers. We propose Streaming-VR, a novel approach designed to enhance the efficiency of verification and refinement of LLM outputs.
arXiv Detail & Related papers (2025-01-14T03:59:48Z)
Eliciting Causal Abilities in Large Language Models for Reasoning Tasks [14.512834333917414]
We introduce the Self-Causal Instruction Enhancement (SCIE) method, which enables LLMs to generate high-quality, low-quantity observational data. In SCIE, the instructions are treated as the treatment, and textual features are used to process natural language. Our method effectively generates instructions that enhance reasoning performance with reduced training cost of prompts.
arXiv Detail & Related papers (2024-12-19T17:03:02Z)
FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware [4.480157114854711]
We present FVEval, the first comprehensive benchmark for characterizing large language models (LLMs) performance in tasks pertaining to formal verification (FV) The benchmark consists of three sub-tasks that measure LLM capabilities at different levels. We present both collections of expert-written verification collateral and methodologies to scalably generate synthetic examples aligned with FV.
arXiv Detail & Related papers (2024-10-15T21:48:57Z)
Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning. LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors. We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
AIvril: AI-Driven RTL Generation With Verification In-The-Loop [0.7831852829409273]
Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks. This paper introduces AIvril, a framework designed to enhance the accuracy and reliability of RTL-aware LLMs.
arXiv Detail & Related papers (2024-09-03T15:07:11Z)
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM. We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation [6.3585378855805725]
We present a novel benchmark to evaluate Large-Language Models' effectiveness for assertion generation. AssertioBench contains 100 curated Verilog hardware designs from OpenCores and formally verified assertions for each design generated from GoldMine and HARM.
arXiv Detail & Related papers (2024-06-26T14:47:28Z)
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models [54.14602121129874]
We introduce AutoIF, the first scalable and reliable method for automatically generating instruction-following training data. AutoIF transforms the validation of instruction-following data quality into code verification.
arXiv Detail & Related papers (2024-06-19T13:29:53Z)
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation [22.124234811959532]
Large language models (LLMs) exhibit significant drawbacks when processing long contexts. We propose a novel RAG prompting methodology, which can be directly applied to pre-trained transformer-based LLMs. We demonstrate the capability of our method to simultaneously enhance time efficiency across a variety of question-answering benchmarks.
arXiv Detail & Related papers (2024-04-10T11:03:17Z)
ChIRAAG: ChatGPT Informed Rapid and Automated Assertion Generation [10.503097140635374]
ChIRAAG, based on OpenAI GPT4, generates System Verilog Assertion (SVA) from natural language specifications of a design. In experiments, only 27% of LLM-generated raw assertions had errors, which was rectified in few iterations. Our results show that LLMs can streamline and assist engineers in the assertion generation process, reshaping verification.
arXiv Detail & Related papers (2024-01-31T12:41:27Z)
Prompt Optimization via Adversarial In-Context Learning [51.18075178593142]
adv-ICL is implemented as a two-player game between a generator and a discriminator. The generator tries to generate realistic enough output to fool the discriminator. We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
arXiv Detail & Related papers (2023-12-05T09:44:45Z)
Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization. We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs) We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SATLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.