Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis
- URL: http://arxiv.org/abs/2502.13921v1
- Date: Wed, 19 Feb 2025 17:53:59 GMT
- Title: Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis
- Authors: Jiahao Gai, Hao Chen, Zhican Wang, Hongyu Zhou, Wanru Zhao, Nicholas Lane, Hongxiang Fan
- Abstract summary: Large language models (LLMs) can be employed for programming languages such as Python and C++.
This paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware design.
- Score: 49.998130983414924
- Abstract: Recent advances in code generation have illuminated the potential of employing large language models (LLMs) for general-purpose programming languages such as Python and C++, opening new opportunities for automating software development and enhancing programmer productivity. The potential of LLMs in software programming has sparked significant interest in exploring automated hardware generation. Although preliminary endeavors have been made to adopt LLMs in generating hardware description languages (HDLs), several challenges persist in this direction. First, the volume of available HDL training data is substantially smaller than that for software programming languages. Second, pre-trained LLMs, mainly tailored for software code, tend to produce HDL designs that are more error-prone. Third, generating HDL requires a significantly larger number of tokens than software programming, leading to inefficiencies in cost and energy consumption. To tackle these challenges, this paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware designs. Although code generation for domain-specific programming languages is not new in the literature, we aim to provide experimental results, insights, benchmarks, and evaluation infrastructure to investigate the suitability of HLS over low-level HDLs for LLM-assisted hardware design generation. To achieve this, we first fine-tune pre-trained models for HLS-based hardware generation, using a collected dataset of text prompts and corresponding reference HLS designs. An LLM-assisted framework is then proposed to automate end-to-end hardware code generation, which also investigates the impact of chain-of-thought prompting and feedback-loop techniques on HLS design generation. Limited by the timeframe of this research, we leave the evaluation of more advanced reasoning models to future work.
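The end-to-end flow described in the abstract can be pictured as a generate-synthesize-repair loop. Below is a minimal sketch of that idea, not the authors' implementation: the model call and the HLS tool invocation are stubbed out, and the function names (generate_hls, run_hls_synthesis) are hypothetical.

```python
# Minimal sketch of an LLM-driven HLS feedback loop (illustrative only).
# generate_hls and run_hls_synthesis are hypothetical stubs standing in for
# a real LLM API call and a real HLS tool (e.g. Vitis HLS) invocation.

def generate_hls(prompt: str, feedback: str = "") -> str:
    """Stub: ask an LLM for HLS C++ given a text prompt and prior tool feedback."""
    return "// HLS C++ design returned by the model\n"

def run_hls_synthesis(code: str) -> tuple[bool, str]:
    """Stub: run C simulation / synthesis and return (success, tool log)."""
    return True, ""

def generate_with_feedback(prompt: str, max_rounds: int = 3) -> str | None:
    feedback = ""
    for _ in range(max_rounds):
        code = generate_hls(prompt, feedback)
        ok, log = run_hls_synthesis(code)
        if ok:
            return code      # design passed the tool checks
        feedback = log       # feed tool errors back into the next attempt
    return None              # give up after max_rounds failed attempts

design = generate_with_feedback("Implement a 4-tap FIR filter with II=1.")
```

Chain-of-thought prompting would slot in at the generate_hls step, asking the model to reason about the microarchitecture before emitting code.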
Related papers
- HADES: Hardware Accelerated Decoding for Efficient Speculation in Large Language Models [1.2180334969164464]
Large Language Models (LLMs) have revolutionized natural language processing by understanding and generating human-like text.
This paper introduces Hardware Accelerated Decoding (HADES), a novel approach to enhance the performance and energy efficiency of LLMs.
arXiv Detail & Related papers (2024-12-27T21:19:01Z)
- HiVeGen -- Hierarchical LLM-based Verilog Generation for Scalable Chip Design [55.54477725000291]
HiVeGen is a Verilog generation framework that hierarchically decomposes generation tasks into submodules.
It integrates automatic Design Space Exploration (DSE) into hierarchy-aware prompt generation, introducing weight-based retrieval to enhance code reuse (a toy sketch of such retrieval follows this entry).
It also supports real-time human-computer interaction to lower error-correction cost, significantly improving the quality of generated designs.
arXiv Detail & Related papers (2024-12-06T19:37:53Z)
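Weight-based retrieval of reusable submodules can be pictured with a small sketch. This is an assumption about the general technique, not HiVeGen's actual code: each stored module carries a reuse weight, and candidates are ranked by a similarity score scaled by that weight. Module and score_similarity are hypothetical names.

```python
# Illustrative weight-based retrieval over a library of stored submodules.
# Not HiVeGen's implementation; Module and score_similarity are hypothetical.
from dataclasses import dataclass

@dataclass
class Module:
    name: str
    description: str
    weight: float  # e.g. past reuse frequency or verification confidence

def score_similarity(query: str, description: str) -> float:
    """Toy similarity: fraction of query words found in the description."""
    q = set(query.lower().split())
    d = set(description.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query: str, library: list[Module], k: int = 3) -> list[Module]:
    # Rank by similarity scaled by the module's reuse weight.
    ranked = sorted(library,
                    key=lambda m: score_similarity(query, m.description) * m.weight,
                    reverse=True)
    return ranked[:k]

lib = [Module("fifo", "parameterized synchronous fifo buffer", 0.9),
       Module("axi_lite", "axi lite slave register interface", 0.7)]
print([m.name for m in retrieve("synchronous fifo buffer", lib)])
```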
- Are LLMs Any Good for High-Level Synthesis? [1.3927943269211591]
Large Language Models (LLMs) have the potential to streamline or replace the High-Level Synthesis (HLS) process.
LLMs can understand natural language specifications and translate them, or existing C code, into HLS designs (an illustrative spec-to-HLS pairing follows this entry).
This study aims to illuminate the role of LLMs in HLS, identifying promising directions for optimized hardware design in applications such as AI acceleration, embedded systems, and high-performance computing.
arXiv Detail & Related papers (2024-08-19T21:40:28Z)
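As a concrete picture of what such a translation produces, the sketch below pairs a natural language specification with the kind of pragma-annotated HLS C++ a model might emit. The prompt wording and the kernel are illustrative assumptions, not taken from the paper.

```python
# Illustrative spec-to-HLS pairing (not from the paper under discussion).
spec = "Compute the element-wise sum of two length-n integer vectors."

# The kind of pragma-annotated HLS C++ an LLM might be asked to produce.
hls_design = """
void vadd(const int *a, const int *b, int *out, int n) {
#pragma HLS INTERFACE m_axi port=a bundle=gmem
#pragma HLS INTERFACE m_axi port=b bundle=gmem
#pragma HLS INTERFACE m_axi port=out bundle=gmem
    for (int i = 0; i < n; ++i) {
#pragma HLS PIPELINE II=1
        out[i] = a[i] + b[i];
    }
}
"""

prompt = f"Translate this specification into synthesizable HLS C++:\n{spec}"
```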
- Case2Code: Scalable Synthetic Data for Code Generation [105.89741089673575]
Large Language Models (LLMs) have shown outstanding breakthroughs in code generation.
Recent work improves code LLMs by training on synthetic data generated by some powerful LLMs.
We propose a Case2Code task by exploiting the expressiveness and correctness of programs (a toy input-output example follows this entry).
arXiv Detail & Related papers (2024-07-17T11:35:00Z)
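The Case2Code idea can be illustrated in a few lines: observed input-output cases of a hidden program become the signal for recovering the program. The example below is a toy assumption about the task format, not the paper's dataset.

```python
# Toy illustration of a Case2Code-style example: input-output cases of a
# hidden program, plus one candidate program that an LLM might recover.
cases = [((2, 3), 6), ((4, 5), 20), ((0, 7), 0)]

# Candidate program consistent with the observed cases.
def candidate(x: int, y: int) -> int:
    return x * y

# A recovered program is accepted if it reproduces every observed case.
assert all(candidate(*inputs) == output for inputs, output in cases)
```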
- Digital ASIC Design with Ongoing LLMs: Strategies and Prospects [0.0]
Large Language Models (LLMs) have been seen as a promising development, with the potential to automate the generation of Hardware Description Language (HDL) code.
This paper presents targeted strategies to harness the capabilities of LLMs for digital ASIC design.
arXiv Detail & Related papers (2024-04-25T05:16:57Z)
- From English to ASIC: Hardware Implementation with Large Language Model [0.210674772139335]
This paper focuses on fine-tuning a leading-edge natural language model and reshuffling the HDL code dataset.
The fine-tuning aims to enhance the model's proficiency in generating precise and efficient ASIC designs.
The dataset reshuffling is intended to broaden the scope and improve the quality of the training material.
arXiv Detail & Related papers (2024-03-11T09:57:16Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation [74.7163199054881]
Large Language Models (LLMs) have demonstrated their capability in context understanding, logic reasoning and answer generation.
We present a systematic study on the application of LLMs in the EDA field.
We highlight future research directions, focusing on applying LLMs in logic synthesis, physical design, multi-modal feature extraction, and circuit alignment.
arXiv Detail & Related papers (2023-12-28T15:09:14Z)
- CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
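To make the inference-time idea concrete, here is a minimal sketch of sampling several candidate programs and keeping the one with the highest unit-test pass rate. This is a generic stand-in for test-driven candidate selection, not CodeRL's actual code; sample_program is a hypothetical stub.

```python
# Generic sketch of test-driven candidate ranking at inference time.
# Not CodeRL's implementation; sample_program is a hypothetical LLM stub.
from typing import Callable

def sample_program(prompt: str) -> Callable[[int], int]:
    """Stub: one sampled candidate program from a code LLM."""
    return lambda x: x * 2

def pass_rate(program: Callable[[int], int], tests: list[tuple[int, int]]) -> float:
    """Fraction of (input, expected_output) tests the program passes."""
    passed = sum(1 for x, y in tests if program(x) == y)
    return passed / len(tests)

tests = [(1, 2), (3, 6), (5, 10)]
candidates = [sample_program("double the input") for _ in range(4)]
best = max(candidates, key=lambda p: pass_rate(p, tests))  # keep the best-scoring sample
```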
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.