Towards LLM-based Root Cause Analysis of Hardware Design Failures
- URL: http://arxiv.org/abs/2507.06512v1
- Date: Wed, 09 Jul 2025 03:25:52 GMT
- Title: Towards LLM-based Root Cause Analysis of Hardware Design Failures
- Authors: Siyu Qiu, Muzhi Wang, Raheel Afsharmazayejani, Mohammad Moradi Shahmiri, Benjamin Tan, Hammond Pearce
- Abstract summary: Large language models (LLMs) can explain the root cause of design issues and bugs revealed during synthesis and simulation. OpenAI's o3-mini reasoning model reached a correct determination 100% of the time under pass@5 scoring.
- Score: 8.588085004917476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With advances in large language models (LLMs), new opportunities have emerged to develop tools that support the digital hardware design process. In this work, we explore how LLMs can assist with explaining the root cause of design issues and bugs that are revealed during synthesis and simulation, a necessary milestone on the pathway towards widespread use of LLMs in the hardware design process and for hardware security analysis. We find promising results: for our corpus of 34 different buggy scenarios, OpenAI's o3-mini reasoning model reached a correct determination 100% of the time under pass@5 scoring, with other state-of-the-art models and configurations usually achieving more than 80% accuracy, rising above 90% when assisted with retrieval-augmented generation.
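The pass@5 scoring referenced in the abstract follows the standard pass@k convention: a scenario counts as solved if at least one of k sampled responses is correct. A minimal sketch of the usual unbiased pass@k estimator (the example numbers are hypothetical, not results from the paper):

```python
# Sketch of the standard unbiased pass@k estimator:
# given n sampled attempts of which c are correct, estimate the
# probability that at least one of k draws (without replacement) succeeds.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """pass@k = 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct attempt
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 2 correct root-cause explanations out of 5 samples,
# scored at k = 5 (one correct attempt anywhere in the 5 counts as a pass).
print(pass_at_k(5, 2, 5))   # 1.0
print(round(pass_at_k(10, 3, 5), 3))
```

Under pass@5 with n = 5 samples, any scenario with at least one correct explanation scores 1.0, which is the sense in which o3-mini "reached a correct determination 100% of the time."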
Related papers
- MeltRTL: Multi-Expert LLMs with Inference-time Intervention for RTL Code Generation [0.0]
MeltRTL is a novel framework that integrates multi-expert attention with inference-time intervention. MeltRTL significantly improves the accuracy of large language models (LLMs) without retraining the base model. We evaluate MeltRTL on the VerilogEval benchmark, achieving 96% synthesizability and 60% functional correctness.
arXiv Detail & Related papers (2026-01-19T12:49:39Z) - Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development. Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models. This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z) - A Serverless Architecture for Real-Time Stock Analysis using Large Language Models: An Iterative Development and Debugging Case Study [0.0]
This paper documents the design, implementation, and iterative debugging of a novel, serverless system for real-time stock analysis. We detail the architectural evolution of the system, from initial concepts to a robust, event-driven pipeline. The final architecture operates at a near-zero cost, demonstrating a viable model for individuals to build sophisticated AI-powered financial tools.
arXiv Detail & Related papers (2025-07-13T11:29:51Z) - Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study [55.09905978813599]
We evaluate models across three dimensions: data understanding, code generation, and strategic planning. We leverage these insights to develop a data synthesis methodology, demonstrating significant improvements in open-source LLMs' analytical reasoning capabilities.
arXiv Detail & Related papers (2025-06-24T17:04:23Z) - LLM-based AI Agent for Sizing of Analog and Mixed Signal Circuit [2.979579757819132]
Large Language Models (LLMs) have demonstrated significant potential across various fields. In this work, we propose an LLM-based AI agent for AMS circuit design to assist in the sizing process.
arXiv Detail & Related papers (2025-04-14T22:18:16Z) - Integrating Large Language Models for Automated Structural Analysis [0.7373617024876725]
We propose a framework that integrates Large Language Models (LLMs) with structural analysis software. LLMs parse structural descriptions from text and translate them into Python scripts. It employs domain-specific prompt design and in-context learning strategies to enhance the LLM's problem-solving capabilities and generative stability.
arXiv Detail & Related papers (2025-04-13T23:10:33Z) - Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time compute instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z) - VACT: A Video Automatic Causal Testing System and a Benchmark [55.53300306960048]
VACT is an automated framework for modeling, evaluating, and measuring the causal understanding of VGMs in real-world scenarios. We introduce multi-level causal evaluation metrics to provide a detailed analysis of the causal performance of VGMs.
arXiv Detail & Related papers (2025-03-08T10:54:42Z) - AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models [86.83875864328984]
We propose an automated method for synthesizing open-ended logic puzzles, and use it to develop a bilingual benchmark, AutoLogi. Our approach features program-based verification and controllable difficulty levels, enabling more reliable evaluation that better distinguishes models' reasoning abilities.
arXiv Detail & Related papers (2025-02-24T07:02:31Z) - Automatically Improving LLM-based Verilog Generation using EDA Tool Feedback [25.596711210493172]
Large Language Models (LLMs) are emerging as a potential tool to help generate fully functioning HDL code. We evaluate the ability of LLMs to leverage feedback from electronic design automation (EDA) tools to fix mistakes in their own generated Verilog.
arXiv Detail & Related papers (2024-11-01T17:33:28Z) - Designing Algorithms Empowered by Language Models: An Analytical Framework, Case Studies, and Insights [86.06371692309972]
This work presents an analytical framework for the design and analysis of large language models (LLMs)-based algorithms. Our proposed framework serves as an attempt to mitigate such headaches.
arXiv Detail & Related papers (2024-07-20T07:39:07Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models [18.026567399243]
Large Language Models (LLMs) offer a promising alternative to static analysis.
In this paper, we take a deep dive into the open space of LLM-assisted static analysis.
We develop LLift, a fully automated framework that interfaces with both a static analysis tool and an LLM.
arXiv Detail & Related papers (2023-08-01T02:57:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.