Related papers: All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification

All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification

URL: http://arxiv.org/abs/2403.16750v1
Date: Mon, 25 Mar 2024 13:23:24 GMT
Title: All Artificial, Less Intelligence: GenAI through the Lens of Formal Verification
Authors: Deepak Narayan Gadde, Aman Kumar, Thomas Nalapat, Evgenii Rezunov, Fabio Cappellini,
Abstract summary: This paper focuses on the formal verification of Common Weaknessions (CWEs) in modern hardware designs. We apply formal verification to categorize each hardware design as vulnerable or CWE-free. We have associated the identified vulnerabilities with CWE numbers for a dataset of 60,000 generated SystemVerilog Register Transfer Level (RTL) code.
Score: 2.015768713390138
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern hardware designs have grown increasingly efficient and complex. However, they are often susceptible to Common Weakness Enumerations (CWEs). This paper is focused on the formal verification of CWEs in a dataset of hardware designs written in SystemVerilog from Regenerative Artificial Intelligence (AI) powered by Large Language Models (LLMs). We applied formal verification to categorize each hardware design as vulnerable or CWE-free. This dataset was generated by 4 different LLMs and features a unique set of designs for each of the 10 CWEs we target in our paper. We have associated the identified vulnerabilities with CWE numbers for a dataset of 60,000 generated SystemVerilog Register Transfer Level (RTL) code. It was also found that most LLMs are not aware of any hardware CWEs; hence they are usually not considered when generating the hardware code. Our study reveals that approximately 60% of the hardware designs generated by LLMs are prone to CWEs, posing potential safety and security risks. The dataset could be ideal for training LLMs and Machine Learning (ML) algorithms to abstain from generating CWE-prone hardware designs.

Related papers

SALAD: Systematic Assessment of Machine Unlearning on LLM-Aided Hardware Design [15.02451323284475]
Large Language Models (LLMs) offer transformative capabilities for hardware design automation.<n>LLMs pose significant data security challenges, including Verilog evaluation data contamination, intellectual property (IP) design leakage, and the risk of malicious Verilog generation.<n>We introduce SALAD, a comprehensive assessment that leverages machine unlearning to mitigate these threats.
arXiv Detail & Related papers (2025-06-02T13:59:08Z)
Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis [14.458529723566379]
Large language models (LLMs) can be employed for programming languages such as Python and C++. This paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware design.
arXiv Detail & Related papers (2025-02-19T17:53:59Z)
HiVeGen -- Hierarchical LLM-based Verilog Generation for Scalable Chip Design [55.54477725000291]
HiVeGen is a hierarchical Verilog generation framework that decomposes generation tasks into hierarchical submodules. automatic Design Space Exploration (DSE) into hierarchy-aware prompt generation, introducing weight-based retrieval to enhance code reuse. Real-time human-computer interaction to lower error-correction cost, significantly improving the quality of generated designs.
arXiv Detail & Related papers (2024-12-06T19:37:53Z)
C2HLSC: Leveraging Large Language Models to Bridge the Software-to-Hardware Design Gap [20.528640971583112]
This paper investigates Large Language Models for automatically C code into HLS-compatible formats. We present a case study using an LLM to rewrite C code for NIST 800-22 tests randomness, a QuickSort algorithm, and AES-128 into HLS-synthesizable C.
arXiv Detail & Related papers (2024-11-29T19:22:52Z)
RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation [17.53405545690049]
Large language models (LLMs) have demonstrated remarkable potential with code generation/completion tasks for hardware design. LLMs are susceptible to so-called data poisoning or backdoor attacks. Here, attackers inject malicious code for the training data, which can be carried over into the HDL code generated by LLMs.
arXiv Detail & Related papers (2024-11-26T16:31:18Z)
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems. While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited. We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data [60.75578581719921]
Large language models (LLMs) have shown great potential for automatic code generation. Recent studies highlight that many LLM-generated code contains serious security vulnerabilities. We introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure codes.
arXiv Detail & Related papers (2024-09-10T12:01:43Z)
An Exploratory Study on Fine-Tuning Large Language Models for Secure Code Generation [17.69409515806874]
We present an exploratory study on whether fine-tuning pre-trained LLMs on datasets of vulnerability-fixing commits can promote secure code generation. We crawled a fine-tuning dataset for secure code generation by collecting code fixes of confirmed vulnerabilities from open-source repositories. Our exploration reveals that fine-tuning LLMs can improve secure code generation by 6.4% in C language and 5.4% in C++ language.
arXiv Detail & Related papers (2024-08-17T02:51:27Z)
Harnessing the Power of LLMs in Source Code Vulnerability Detection [0.0]
Software vulnerabilities, caused by unintentional flaws in source code, are a primary root cause of cyberattacks. We harness Large Language Models' capabilities to analyze source code and detect known vulnerabilities.
arXiv Detail & Related papers (2024-08-07T00:48:49Z)
Can We Trust Large Language Models Generated Code? A Framework for In-Context Learning, Security Patterns, and Code Evaluations Across Diverse LLMs [2.7138982369416866]
Large Language Models (LLMs) have revolutionized automated code generation in software engineering. However, concerns have arisen regarding the security and quality of the generated code. Our research aims to tackle these issues by introducing a framework for secure behavioral learning of LLMs.
arXiv Detail & Related papers (2024-06-18T11:29:34Z)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate these demands by compressing and accelerating large language models. We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization. Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code [56.019447113206006]
Large Language Models (LLMs) have achieved remarkable progress in code generation. CodeIP is a novel multi-bit watermarking technique that embeds additional information to preserve provenance details. Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP.
arXiv Detail & Related papers (2024-04-24T04:25:04Z)
LLM4SecHW: Leveraging Domain Specific Large Language Model for Hardware Debugging [4.297043877989406]
This paper presents a novel framework for hardware debug that leverages domain specific Large Language Model (LLM) We propose a unique approach to compile a dataset of open source hardware design defects and their remediation steps. LLM4SecHW employs fine tuning of medium sized LLMs based on this dataset, enabling the identification and rectification of bugs in hardware designs.
arXiv Detail & Related papers (2024-01-28T19:45:25Z)
LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation [74.7163199054881]
Large Language Models (LLMs) have demonstrated their capability in context understanding, logic reasoning and answer generation. We present a systematic study on the application of LLMs in the EDA field. We highlight the future research direction, focusing on applying LLMs in logic synthesis, physical design, multi-modal feature extraction and alignment of circuits.
arXiv Detail & Related papers (2023-12-28T15:09:14Z)
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM [72.1638273937025]
We present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence. Our library supports a collection of pretrained Code LLM models and popular code benchmarks. We hope CodeTF is able to bridge the gap between machine learning/generative AI and software engineering.
arXiv Detail & Related papers (2023-05-31T05:24:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.