Related papers: RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts

RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts

URL: http://arxiv.org/abs/2503.21971v3
Date: Tue, 10 Jun 2025 22:20:25 GMT
Title: RocketPPA: Code-Level Power, Performance, and Area Prediction via LLM and Mixture of Experts
Authors: Armin Abdollahi, Mehdi Kamal, Massoud Pedram,
Abstract summary: This paper presents RocketPPA, a novel ultra-fast power, performance (delay), and area (PPA) estimator.<n>It operates directly at the code-level abstraction using HDL code as input.<n>It achieves significant improvements in the accuracy of PPA estimation compared to previous state-of-the-art methods.
Score: 4.825037489691159
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents RocketPPA, a novel ultra-fast power, performance (delay), and area (PPA) estimator operating directly at the code-level abstraction using HDL code as input. The key technical innovation is its LLM-based regression model, which uniquely integrates a large language model (LLM) with a mixture-of-experts (MoE) architecture composed of multilayer perceptrons (MLPs). The LLM interprets the input HDL code and then utilizes its final hidden-layer representations to predict PPA metrics. Low-rank adaptation (LoRA) is used for parameter-efficient fine-tuning to enable efficient LLM training. Furthermore, the work includes the development of an LLM-based HDL code repair framework to generate a large and synthesizable training dataset. Experimental results on the VerilogEval benchmark demonstrate that RocketPPA achieves significant improvements in the accuracy of PPA estimation compared to previous state-of-the-art methods like Llama3-MetRex-8B. Specifically, at a 10% relative error threshold, RocketPPA enhances the pass rate for area prediction by 13.6%, delay by 9.4%, and power by 14.7%. At a 20% threshold, the improvements are 9.6% for area, 10.8% for delay, and 18.5% for power. Moreover, RocketPPA achieves a speedup of over 20x compared to MetRex and 30x over MasterRTL in processing the test set. The impact of RocketPPA is the potential to substantially accelerate the hardware design process by providing accurate PPA estimations early in the design cycle, thus avoiding the overhead of manual feature engineering and time-consuming synthesis flows.

Related papers

Machine Learning-Based Basil Yield Prediction in IoT-Enabled Indoor Vertical Hydroponic Farms [0.08388908302793013]
This work explores the integration of indoor vertical hydroponics with Machine Learning (ML) techniques to optimize basil yield while saving water.<n>This research develops a prediction system that uses different ML models and assesses their performance.
arXiv Detail & Related papers (2025-12-15T11:00:34Z)
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs [80.76334908639745]
We propose QeRL, a Quantization-enhanced Reinforcement Learning framework for large language models (LLMs)<n>QeRL addresses issues by combining NVFP4 quantization with Low-Rank Adaptation (LoRA)<n>Experiments demonstrate that QeRL delivers over 1.5 times speedup in the rollout phase.
arXiv Detail & Related papers (2025-10-13T17:55:09Z)
IPA: An Information-Preserving Input Projection Framework for Efficient Foundation Model Adaptation [56.72132739364876]
We propose IPA, a feature-aware projection framework that explicitly preserves information in the reduced hidden space.<n> IPA consistently improves over LoRA and DoRA, achieving on average 1.5 points higher accuracy on commonsense reasoning.
arXiv Detail & Related papers (2025-09-04T17:10:01Z)
VeriOpt: PPA-Aware High-Quality Verilog Generation via Multi-Role LLMs [41.94295877935867]
VeriOpt is a novel framework that leverages role-based prompting and PPA-aware optimization to produce high-quality, synthesizable Verilog.<n>Our work advances the state-of-the-art AI-driven hardware design by addressing the critical gap between correctness and quality.
arXiv Detail & Related papers (2025-07-20T00:28:55Z)
The carbon cost of materials discovery: Can machine learning really accelerate the discovery of new photovoltaics? [0.05524804393257919]
Computational screening has become a powerful complement to experimental efforts in the discovery of high-performance photovoltaic (PV) materials.<n>Most rely on density functional theory (DFT) to estimate electronic and optical properties relevant to solar energy conversion.<n>Machine learning (ML) models have recently gained attention as surrogates for DFT, offering drastic reductions in resource use with competitive predictive performance.
arXiv Detail & Related papers (2025-07-17T15:55:02Z)
EfficientLLM: Efficiency in Large Language Models [64.3537131208038]
Large Language Models (LLMs) have driven significant progress, yet their growing counts and context windows incur prohibitive compute, energy, and monetary costs.<n>We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale.
arXiv Detail & Related papers (2025-05-20T02:27:08Z)
GOLLuM: Gaussian Process Optimized LLMs -- Reframing LLM Finetuning through Bayesian Optimization [0.4037357056611557]
Large Language Models (LLMs) can encode complex relationships in their latent spaces. We introduce LLM-based deep kernels, jointly optimized with GPs to preserve the benefits of both. Our method nearly doubles the discovery rate of high-performing reactions compared to static LLM embeddings.
arXiv Detail & Related papers (2025-04-08T17:59:57Z)
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal [55.13854171147104]
Large Language Models (LLMs) have revolutionized various domains, including natural language processing, data analysis, and software development.<n>We present Dynamic Action Re-Sampling (DARS), a novel inference time compute scaling approach for coding agents.<n>We evaluate our approach on SWE-Bench Lite benchmark, demonstrating that this scaling strategy achieves a pass@k score of 55% with Claude 3.5 Sonnet V2.
arXiv Detail & Related papers (2025-03-18T14:02:59Z)
Streaming Looking Ahead with Token-level Self-reward [50.699168440048716]
We propose a policy model with token-level self-reward modeling (TRM) capability to eliminate the need for external models and extra communication.<n>In addition, we propose a streaming-looking-ahead (SLA) algorithm to further boost search efficiency with better parallelization.<n>If we combine SLA with reinforcement fine-tuning techniques such as DPO, SLA achieves an overall win rate of 89.4%.
arXiv Detail & Related papers (2025-02-24T22:35:53Z)
Dynamic Noise Preference Optimization for LLM Self-Improvement via Synthetic Data [51.62162460809116]
We introduce Dynamic Noise Preference Optimization (DNPO) to ensure consistent improvements across iterations. In experiments with Zephyr-7B, DNPO consistently outperforms existing methods, showing an average performance boost of 2.6%. DNPO shows a significant improvement in model-generated data quality, with a 29.4% win-loss rate gap compared to the baseline in GPT-4 evaluations.
arXiv Detail & Related papers (2025-02-08T01:20:09Z)
Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks. Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptations and its theoretical optimum. We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
Scaling Laws for Predicting Downstream Performance in LLMs [75.28559015477137]
This work focuses on the pre-training loss as a more computation-efficient metric for performance estimation.<n>We present FLP-M, a fundamental approach for performance prediction that addresses the practical need to integrate datasets from multiple sources during pre-training.
arXiv Detail & Related papers (2024-10-11T04:57:48Z)
VinePPO: Refining Credit Assignment in RL Training of LLMs [66.80143024475635]
We propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates.<n>Our method consistently outperforms PPO and other baselines across MATH and GSM8K datasets in less wall-clock time.
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
Applying RLAIF for Code Generation with API-usage in Lightweight LLMs [15.366324461797582]
Reinforcement Learning from AI Feedback (RLAIF) has demonstrated significant potential across various domains. This paper introduces an RLAIF framework for improving the code generation abilities of lightweight (1B parameters) LLMs.
arXiv Detail & Related papers (2024-06-28T17:16:03Z)
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs [54.05511925104712]
We propose a simple, effective, and data-efficient method called Step-DPO. Step-DPO treats individual reasoning steps as units for preference optimization rather than evaluating answers holistically. Our findings demonstrate that as few as 10K preference data pairs and fewer than 500 Step-DPO training steps can yield a nearly 3% gain in accuracy on MATH for models with over 70B parameters.
arXiv Detail & Related papers (2024-06-26T17:43:06Z)
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process. We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals. The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
LLM Performance Predictors are good initializers for Architecture Search [28.251129134057035]
We construct Performance Predictors (PP) that estimate the performance of specific deep neural network architectures on downstream tasks. In machine translation (MT) tasks, GPT-4 with our PP prompts (LLM-PP) achieves a SoTA mean absolute error and a slight degradation in rank correlation coefficient compared to baseline predictors. For Neural Architecture Search (NAS), we introduce a Hybrid-Search algorithm (HS-NAS) employing LLM-Distill-PP for the initial search stages and reverting to the baseline predictor later.
arXiv Detail & Related papers (2023-10-25T15:34:30Z)
Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF [47.960563851948514]
We investigate an efficient implementation of RLHF using low-rank adaptation (LoRA) Our implementation achieves better performance than the publicly-released AlpacaFarm checkpoint with full model fine-tuning. We release our code and pretrained checkpoints to facilitate future research on more efficient RLHF.
arXiv Detail & Related papers (2023-09-16T17:31:36Z)
Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization [14.23697277904244]
We present Reweighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample re-weighting. We demonstrate the effectiveness of RGD on various learning tasks, including supervised learning, meta-learning, and out-of-domain generalization.
arXiv Detail & Related papers (2023-06-15T15:58:04Z)
A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance. We find that the power law, the de facto principle to estimate model performance, leads to large error when using a small dataset. We introduce a novel piecewise power law (PPL) that handles the two data differently.
arXiv Detail & Related papers (2023-03-02T21:48:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.