ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model
- URL: http://arxiv.org/abs/2408.00804v1
- Date: Fri, 26 Jul 2024 11:00:08 GMT
- Title: ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model
- Authors: Ning Xu, Zhaoyang Zhang, Lei Qi, Wensuo Wang, Chao Zhang, Zihao Ren, Huaiyuan Zhang, Xin Cheng, Yanqi Zhang, Zhichao Liu, Qingwen Wei, Shiyang Wu, Lanlan Yang, Qianfeng Lu, Yiqun Ma, Mengyao Zhao, Junbo Liu, Yufan Song, Xin Geng, Jun Yang,
- Abstract summary: ChipExpert is the first open-source, instructional LLM specifically tailored for the IC design field.
ChipExpert is trained on one of the best current open-source base models (Llama-3 8B)
To mitigate the hallucinations of ChipExpert, we have developed a Retrieval-Augmented Generation (RAG) system.
- Score: 40.91684362807029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The field of integrated circuit (IC) design is highly specialized, presenting significant barriers to entry and challenges for research and development. Although large language models (LLMs) have achieved remarkable success in various domains, existing LLMs often fail to meet the specific needs of students, engineers, and researchers. Consequently, the potential of LLMs in the IC design domain remains largely unexplored. To address these issues, we introduce ChipExpert, the first open-source, instructional LLM specifically tailored for the IC design field. ChipExpert is trained on one of the best current open-source base models (Llama-3 8B). The training process encompasses several key stages: data preparation, continual pre-training, instruction-guided supervised fine-tuning, preference alignment, and evaluation. In the data preparation stage, we construct multiple high-quality custom datasets through manual selection and data synthesis techniques. In the subsequent two stages, ChipExpert acquires a vast amount of IC design knowledge and learns to respond to user queries professionally. ChipExpert also undergoes an alignment phase, using Direct Preference Optimization (DPO), to achieve a high standard of ethical performance. Finally, to mitigate ChipExpert's hallucinations, we developed a Retrieval-Augmented Generation (RAG) system based on the IC design knowledge base. We also release ChipICD-Bench, the first IC design benchmark, to evaluate the capabilities of LLMs across multiple IC design sub-domains. In comprehensive experiments on this benchmark, ChipExpert demonstrated a high level of expertise in IC design question-and-answer tasks.
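The abstract describes mitigating hallucinations by retrieving passages from an IC design knowledge base before answering. As an illustration only (not ChipExpert's actual pipeline, which would use dense embeddings and a vector index), a minimal retrieve-then-prompt loop might look like:

```python
# Minimal RAG sketch: retrieve the best-matching knowledge-base passage by
# token overlap, then prepend it to the user query as context.
# Illustrative only; function names and the toy knowledge base are assumptions.

def tokenize(text: str) -> set[str]:
    return set(text.lower().split())

def retrieve(query: str, knowledge_base: list[str], k: int = 1) -> list[str]:
    """Rank knowledge-base passages by token overlap with the query."""
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(tokenize(doc) & tokenize(query)),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, knowledge_base: list[str]) -> str:
    """Assemble a context-augmented prompt for the LLM."""
    context = "\n".join(retrieve(query, knowledge_base))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

kb = [
    "Setup time is the interval before the clock edge during which data must be stable.",
    "A current mirror copies a reference current to another branch of a circuit.",
]
prompt = build_prompt("What is setup time in timing analysis?", kb)
```

The retrieved passage grounds the model's answer in the knowledge base, which is what reduces hallucination on domain-specific queries.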
Related papers
- EvolVE: Evolutionary Search for LLM-based Verilog Generation and Optimization [0.2796197251957245]
We present EvolVE, the first framework to analyze multiple evolution strategies on chip design tasks. We also introduce IC-RTL, targeting industry-scale problems derived from the National Integrated Circuit Contest.
arXiv Detail & Related papers (2026-01-26T01:53:54Z)
- ChipMind: Retrieval-Augmented Reasoning for Long-Context Circuit Design Specifications [22.508372519635543]
We introduce ChipMind, a knowledge graph-augmented reasoning framework specifically designed for lengthy IC specifications. ChipMind first transforms circuit specifications into a domain-specific knowledge graph, ChipKG, through the Circuit Semantic-Aware Knowledge Graph Construction methodology. It then leverages the ChipKG-Augmented Reasoning mechanism, combining information-theoretic adaptive retrieval to dynamically trace logical dependencies with intent-aware semantic filtering to prune irrelevant noise.
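The core idea of graph-augmented retrieval, tracing logical dependencies outward from entities mentioned in a query, can be sketched with a toy graph. This is an illustration of the general technique only; ChipMind's ChipKG construction and information-theoretic retrieval are far more elaborate, and the entity names below are invented:

```python
from collections import deque

# Toy knowledge graph: each entity maps to entities it depends on.
# Entity names are hypothetical, for illustration only.
GRAPH = {
    "SPI_controller": ["clock_divider", "shift_register"],
    "clock_divider": ["system_clock"],
    "shift_register": [],
    "system_clock": [],
}

def trace_dependencies(seed: str, graph: dict[str, list[str]],
                       max_hops: int = 2) -> set[str]:
    """Breadth-first traversal up to max_hops, returning reachable entities."""
    seen, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted along this path
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

deps = trace_dependencies("SPI_controller", GRAPH)
```

Bounding the hop count is one simple way to keep retrieval focused; ChipMind's intent-aware semantic filtering plays an analogous noise-pruning role.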
arXiv Detail & Related papers (2025-12-05T02:09:49Z)
- Benchmarking Chinese Commonsense Reasoning with a Multi-hop Reasoning Perspective [53.594353527056775]
We propose Chinese Commonsense Multi-hop Reasoning (CCMOR) to evaluate Large Language Models (LLMs). CCMOR is designed to evaluate LLMs' ability to integrate Chinese-specific factual knowledge with multi-step logical reasoning. We implement a human-in-the-loop verification system, where domain experts systematically validate and refine the generated questions.
arXiv Detail & Related papers (2025-10-09T20:29:00Z)
- A Survey on Code Generation with LLM-based Agents [61.474191493322415]
Code generation agents powered by large language models (LLMs) are revolutionizing the software development paradigm. These agents are characterized by three core features. This paper presents a systematic survey of the field of LLM-based code generation agents.
arXiv Detail & Related papers (2025-07-31T18:17:36Z)
- MMCircuitEval: A Comprehensive Multimodal Circuit-Focused Benchmark for Evaluating LLMs [25.945493464645548]
Multimodal large language models (MLLMs) present promising opportunities for automation and enhancement in Electronic Design Automation (EDA). We introduce MMCircuitEval, the first multimodal benchmark specifically designed to assess MLLM performance across diverse EDA tasks. MMCircuitEval comprises 3,614 meticulously curated question-answer (QA) pairs spanning digital and analog circuits across critical EDA stages.
arXiv Detail & Related papers (2025-07-20T05:46:32Z)
- LLM-based AI Agent for Sizing of Analog and Mixed Signal Circuit [2.979579757819132]
Large Language Models (LLMs) have demonstrated significant potential across various fields.
In this work, we propose an LLM-based AI agent for AMS circuit design to assist in the sizing process.
arXiv Detail & Related papers (2025-04-14T22:18:16Z)
- ECM: A Unified Electronic Circuit Model for Explaining the Emergence of In-Context Learning and Chain-of-Thought in Large Language Model [64.22300168242221]
In-Context Learning (ICL) and Chain-of-Thought (CoT) are emerging capabilities in large language models.
We propose the Electronic Circuit Model (ECM) to better understand ICL and CoT.
We show that ECM effectively predicts and explains LLM performance across a variety of prompting strategies.
arXiv Detail & Related papers (2025-02-05T16:22:33Z)
- ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation [7.660954005766763]
ChipAlign combines the strengths of a general instruction-aligned LLM with a chip-specific LLM.
ChipAlign significantly enhances instruction-following capabilities of existing chip LLMs.
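Geodesic interpolation between two models' weights is commonly realized as spherical linear interpolation (slerp) on the unit sphere. The sketch below shows slerp on plain vectors under that assumption; it is illustrative of the general technique, not ChipAlign's actual merging procedure or API:

```python
import math

def slerp(w_a: list[float], w_b: list[float], t: float) -> list[float]:
    """Spherical linear interpolation between two weight vectors.

    Illustrative sketch: normalize both vectors, then walk along the
    great-circle arc between them by fraction t.
    """
    norm_a = math.sqrt(sum(x * x for x in w_a))
    norm_b = math.sqrt(sum(x * x for x in w_b))
    a = [x / norm_a for x in w_a]
    b = [x / norm_b for x in w_b]
    # Clamp the dot product so acos never sees a value outside [-1, 1].
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    theta = math.acos(dot)
    if theta < 1e-8:  # vectors nearly parallel: fall back to a linear mix
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    return [
        (math.sin((1 - t) * theta) / s) * x + (math.sin(t * theta) / s) * y
        for x, y in zip(a, b)
    ]

mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
```

Unlike a naive linear average, slerp keeps the interpolated vector on the sphere, which is the sense in which the path is a geodesic.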
arXiv Detail & Related papers (2024-12-15T04:21:24Z)
- AMSnet-KG: A Netlist Dataset for LLM-based AMS Circuit Auto-Design Using Knowledge Graph RAG [15.61553255884534]
Large language models (LLMs) have emerged as powerful tools for Electronic Design Automation (EDA) applications.
This paper introduces AMSnet-KG, a dataset encompassing various AMS circuit schematics and netlists.
We propose an automated AMS circuit generation framework that utilizes the comprehensive knowledge embedded in LLMs.
arXiv Detail & Related papers (2024-11-07T02:49:53Z)
- Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms [77.71341200638416]
ChiPBench is a benchmark designed to evaluate the effectiveness of AI-based chip placement algorithms.
We have gathered 20 circuits from various domains (e.g., CPU, GPU, and microcontrollers) for evaluation.
Results show that even when the intermediate metric of a single-point algorithm is dominant, the final PPA (power, performance, area) results can be unsatisfactory.
arXiv Detail & Related papers (2024-07-03T03:29:23Z)
- Large Language Model Agent as a Mechanical Designer [7.136205674624813]
In this study, we present a novel approach that integrates pre-trained LLMs with an FEM (finite element method) module.
The FEM module evaluates each design and provides essential feedback, guiding the LLMs to continuously learn, plan, generate, and optimize designs without the need for domain-specific training.
Our results reveal that these LLM-based agents can successfully generate truss designs that comply with natural-language specifications, with a success rate of up to 90% that varies according to the applied constraints.
arXiv Detail & Related papers (2024-04-26T16:41:24Z)
- LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation [74.7163199054881]
Large Language Models (LLMs) have demonstrated their capability in context understanding, logic reasoning and answer generation.
We present a systematic study on the application of LLMs in the EDA field.
We highlight the future research direction, focusing on applying LLMs in logic synthesis, physical design, multi-modal feature extraction and alignment of circuits.
arXiv Detail & Related papers (2023-12-28T15:09:14Z)
- EDALearn: A Comprehensive RTL-to-Signoff EDA Benchmark for Democratized and Reproducible ML for EDA Research [5.093676641214663]
We introduce EDALearn, the first holistic, open-source benchmark suite specifically for Machine Learning tasks in EDA.
This benchmark suite presents an end-to-end flow from synthesis to physical implementation, enriching data collection across various stages.
Our contributions aim to encourage further advances in the ML-EDA domain.
arXiv Detail & Related papers (2023-12-04T06:51:46Z)
- Vision-Language Instruction Tuning: A Review and Analysis [52.218690619616474]
Vision-Language Instruction Tuning (VLIT) presents more complex characteristics compared to pure text instruction tuning.
We offer a detailed categorization for existing VLIT datasets and identify the characteristics that high-quality VLIT data should possess.
By incorporating these characteristics as guiding principles into the existing VLIT data construction process, we conduct extensive experiments and verify their positive impact on the performance of tuned multi-modal LLMs.
arXiv Detail & Related papers (2023-11-14T14:02:32Z)
- Iterative Forward Tuning Boosts In-Context Learning in Language Models [88.25013390669845]
In this study, we introduce a novel two-stage framework to boost in-context learning (ICL) in large language models (LLMs).
Specifically, our framework delineates the ICL process into two distinct stages: Deep-Thinking and test stages.
The Deep-Thinking stage incorporates a unique attention mechanism, i.e., iterative enhanced attention, which enables multiple rounds of information accumulation.
arXiv Detail & Related papers (2023-05-22T13:18:17Z)
- On Joint Learning for Solving Placement and Routing in Chip Design [70.30640973026415]
We propose a joint learning method, DeepPlace, for the placement of macros and standard cells.
We also develop a joint learning approach via reinforcement learning to fulfill both macro placement and routing, which is called DeepPR.
Our method can effectively learn from experience and also provides an intermediate placement for the subsequent standard-cell placement, within a few hours of training.
arXiv Detail & Related papers (2021-10-30T11:41:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.