Related papers: ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols

URL: http://arxiv.org/abs/2506.07945v1
Date: Mon, 09 Jun 2025 17:10:47 GMT
Title: ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols
Authors: Arnav Sheth, Ivaxi Sheth, Mario Fritz
Abstract summary: Large Language Models (LLMs) have shown promising capabilities in generating code for general-purpose programming languages. HDLs such as SystemVerilog are logic-oriented and demand strict adherence to timing semantics, concurrency, and synthesizability constraints. This paper introduces the first benchmark suite targeting four widely used protocols: SPI, I2C, UART, and AXI.
Score: 45.66401695351214
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in Large Language Models (LLMs) have shown promising capabilities in generating code for general-purpose programming languages. In contrast, their applicability for hardware description languages, particularly for generating synthesizable and functionally correct designs, remains significantly underexplored. HDLs such as SystemVerilog are logic-oriented and demand strict adherence to timing semantics, concurrency, and synthesizability constraints. Moreover, HDL-based design flows encompass a broad set of tasks beyond structural code generation, including testbench development, assertion-based verification, timing closure, and protocol-level integration for on-chip communication. The objective of our paper is to analyze the capabilities of state-of-the-art LLMs in generating SystemVerilog implementations of standard communication protocols, a core component of embedded and System-on-Chip (SoC) architectures. This paper introduces the first benchmark suite targeting four widely used protocols: SPI, I2C, UART, and AXI. We define code generation tasks that capture varying levels of design abstraction and prompt specificity. The generated designs are assessed for syntactic correctness, synthesizability, and functional fidelity via waveform simulation and test benches.

Related papers

LLM-Assisted Model-Based Fuzzing of Protocol Implementations [9.512044399020514]
Faults in protocol behavior can lead to vulnerabilities and system failures. A common approach to protocol testing involves constructing Markovian models that capture the state transitions and expected behaviors of the protocol. We propose a novel method that leverages large language models (LLMs) to automatically generate sequences for testing network protocol implementations.
arXiv Detail & Related papers (2025-08-03T13:16:18Z)
IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraints generation to improve the instruction-following capabilities of Code LLMs. The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z)
DecoRTL: A Run-time Decoding Framework for RTL Code Generation with LLMs [0.0]
We show that large language models (LLMs) exhibit low confidence in regions of structural ambiguity or semantic complexity. We introduce DecoRTL, a novel run-time decoding strategy that is both syntax-aware and contrastive for RTL code generation. Our approach operates entirely at inference time without requiring any additional model fine-tuning.
arXiv Detail & Related papers (2025-07-03T01:17:44Z)
Execution Guided Line-by-Line Code Generation [49.1574468325115]
We present a novel approach to neural code generation that incorporates real-time execution signals into the language model generation process. Our method, Execution-Guided Classifier-Free Guidance (EG-CFG), incorporates execution signals as the model generates code.
arXiv Detail & Related papers (2025-06-12T17:50:05Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
SIMCOPILOT: Evaluating Large Language Models for Copilot-Style Code Generation [5.880496520248658]
SIMCOPILOT is a benchmark that simulates the role of large language models (LLMs) as interactive, "copilot"-style coding assistants. The benchmark comprises dedicated sub-benchmarks for Java (SIMCOPILOTJ) and Python.
arXiv Detail & Related papers (2025-05-21T04:59:44Z)
ComplexVCoder: An LLM-Driven Framework for Systematic Generation of Complex Verilog Code [9.68747119462712]
We present ComplexVCoder, an open-source framework that enhances the generation quality and efficiency of complex Verilog code. Specifically, we introduce a two-stage generation mechanism, which leverages an intermediate representation to enable a more structured transition from natural language descriptions to intricate Verilog designs. In addition, we introduce a rule-based alignment method and a domain-specific retrieval-augmented generation (RAG) to further improve the correctness of the synthesized code.
arXiv Detail & Related papers (2025-04-29T11:22:06Z)
VeriMind: Agentic LLM for Automated Verilog Generation with a Novel Evaluation Metric [4.590930025882158]
We propose VeriMind, an agentic LLM framework for Verilog code generation. We introduce a novel evaluation metric, pass@ARC, which combines the conventional pass@k measure with Average Refinement Cycles (ARC) to capture both success rate and the efficiency of iterative refinement. Experimental results on diverse hardware design tasks demonstrated that our approach achieved up to 8.3% improvement on the pass@k metric and 8.1% on the pass@ARC metric.
arXiv Detail & Related papers (2025-03-15T23:43:06Z)
Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis [14.458529723566379]
Large language models (LLMs) can be employed for programming languages such as Python and C++. This paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware design.
arXiv Detail & Related papers (2025-02-19T17:53:59Z)
EpiCoder: Encompassing Diversity and Complexity in Code Generation [49.170195362149386]
Existing methods for code generation use code snippets as seed data. We introduce a novel feature tree-based synthesis framework, which revolves around hierarchical code features. Our framework provides precise control over the complexity of the generated code, enabling functionalities that range from function-level operations to multi-file scenarios.
arXiv Detail & Related papers (2025-01-08T18:58:15Z)
HiVeGen -- Hierarchical LLM-based Verilog Generation for Scalable Chip Design [55.54477725000291]
HiVeGen is a hierarchical Verilog generation framework that decomposes generation tasks into hierarchical submodules. It integrates automatic Design Space Exploration (DSE) into hierarchy-aware prompt generation, introducing weight-based retrieval to enhance code reuse. It also enables real-time human-computer interaction to lower error-correction cost, significantly improving the quality of generated designs.
arXiv Detail & Related papers (2024-12-06T19:37:53Z)
Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows code-form plans: pseudocode that outlines high-level, structured reasoning processes. CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks. It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z)
Towards Auto-Modeling of Formal Verification for NextG Protocols: A Multimodal cross- and self-attention Large Language Model Approach [3.9155346446573502]
This paper introduces Auto-modeling of Formal Verification with Real-world Prompting for 5G and NextG protocols (AVRE). AVRE is a novel system designed for the formal verification of Next Generation (NextG) communication protocols.
arXiv Detail & Related papers (2023-12-28T20:41:24Z)
Towards Semantic Communication Protocols: A Probabilistic Logic Perspective [69.68769942563812]
We propose a semantic protocol model (SPM) constructed by transforming an NPM into an interpretable symbolic graph written in the probabilistic logic programming language (ProbLog). By leveraging its interpretability and memory-efficiency, we demonstrate several applications such as SPM reconfiguration for collision-avoidance.
arXiv Detail & Related papers (2022-07-08T14:19:36Z)
Synthetic Datasets for Neural Program Synthesis [66.20924952964117]
We propose a new methodology for controlling and evaluating the bias of synthetic data distributions over both programs and specifications. We demonstrate, using the Karel DSL and a small Calculator DSL, that training deep networks on these distributions leads to improved cross-distribution generalization performance.
arXiv Detail & Related papers (2019-12-27T21:28:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.