Related papers: Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models

Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models

URL: http://arxiv.org/abs/2504.19649v3
Date: Wed, 15 Oct 2025 11:24:32 GMT
Title: Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models
Authors: Lei Xu, Shanshan Wang, Emmanuel Casseau, Chenglong Xiao,
Abstract summary: We propose ECoGNNs-LLMMHs, a framework that integrates graph neural networks with task-adaptive message passing and large language model-enhanced meta-heuristic algorithms.<n>Compared with state-of-the-art works, ECoGNN exhibits lower prediction error in the post-HLS prediction task, with the error reduced by 57.27%.<n>For post-implementation prediction tasks, ECoGNN demonstrates the lowest prediction errors, with average reductions of 17.6% for flip-flop (FF) usage, 33.7% for critical path (CP)
Score: 6.711674863088882
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: High-Level Synthesis (HLS) Design Space Exploration (DSE) is essential for generating hardware designs that balance performance, power, and area (PPA). To optimize this process, existing works often employs message-passing neural networks (MPNNs) to predict quality of results (QoR). These predictors serve as evaluators in the DSE process, effectively bypassing the time-consuming estimations traditionally required by HLS tools. However, existing models based on MPNNs struggle with over-smoothing and limited expressiveness. Additionally, while meta-heuristic algorithms are widely used in DSE, they typically require extensive domain-specific knowledge to design operators and time-consuming tuning. To address these limitations, we propose ECoGNNs-LLMMHs, a framework that integrates graph neural networks with task-adaptive message passing and large language model-enhanced meta-heuristic algorithms. Compared with state-of-the-art works, ECoGNN exhibits lower prediction error in the post-HLS prediction task, with the error reduced by 57.27\%. For post-implementation prediction tasks, ECoGNN demonstrates the lowest prediction errors, with average reductions of 17.6\% for flip-flop (FF) usage, 33.7\% for critical path (CP) delay, 26.3\% for power consumption, 38.3\% for digital signal processor (DSP) utilization, and 40.8\% for BRAM usage. LLMMH variants can generate superior Pareto fronts compared to meta-heuristic algorithms in terms of average distance from the reference set (ADRS) with average improvements of 87.47\%, respectively. Compared with the SOTA DSE approaches GNN-DSE and IRONMAN-PRO, LLMMH can reduce the ADRS by 68.17\% and 63.07\% respectively.

Related papers

MPM-LLM4DSE: Reaching the Pareto Frontier in HLS with Multimodal Learning and LLM-Driven Exploration [7.33202262448994]
This paper proposes the MPM-LLM4DSE framework, which incorporates a multimodal prediction model (MPM) that fuses behavioral descriptions and control and data flow graphs.<n> Experimental results demonstrate that our multimodal predictive model significantly outperforms state-of-the-art work ProgSG by up to 10.25$times$.
arXiv Detail & Related papers (2026-01-08T10:32:49Z)
Exploring Spiking Neural Networks for Binary Classification in Multivariate Time Series at the Edge [0.9282545044546486]
We present a general framework for training spiking neural networks (SNNs) to perform binary classification on multivariate time series.<n>We apply it to the task of detecting low signal-to-noise ratio radioactive sources in gamma-ray spectral data.<n>The resulting SNNs, with as few as 49 neurons and 66 synapses, achieve a 51.8% true positive rate (TPR) at a false alarm rate of 1/hr.<n> Hardware deployment on the microCaspian neuromorphic platform demonstrates 2mW power consumption and 20.2ms latency.
arXiv Detail & Related papers (2025-10-23T20:52:11Z)
Optimization of DNN-based HSI Segmentation FPGA-based SoC for ADS: A Practical Approach [1.474723404975345]
This work presents a set of optimization techniques for the practical co-design of a DNN-based HSI segmentation processor deployed on a FPGA-based SOC.<n> applied compression techniques significantly reduce the complexity of the designed DNN to 24.34% of the original operations and to 1.02% of the original number of parameters, achieving a 2.86x speed-up in the inference task without noticeable degradation of the segmentation accuracy.
arXiv Detail & Related papers (2025-07-22T13:09:04Z)
Efficient Traffic Classification using HW-NAS: Advanced Analysis and Optimization for Cybersecurity on Resource-Constrained Devices [1.3124513975412255]
This paper presents a hardware-efficient deep neural network (DNN) optimized through hardware-aware neural architecture search (HW-NAS)<n>It supports the classification of session-level encrypted traffic on resource-constrained Internet of Things (IoT) and edge devices.<n>The optimized model attains an accuracy of 96.59% with just 88.26K parameters, 10.08M FLOPs, and a maximum tensor size of 20.12K.
arXiv Detail & Related papers (2025-06-12T21:37:45Z)
Performance Analysis of Convolutional Neural Network By Applying Unconstrained Binary Quadratic Programming [0.0]
Convolutional Neural Networks (CNNs) are pivotal in computer vision and Big Data analytics but demand significant computational resources when trained on large-scale datasets.<n>We propose a hybrid optimization method that combines Unconstrained Binary Quadratic Programming (UBQP) with Gradient Descent (SGD) to accelerate CNN training.<n>Our approach achieves a 10--15% accuracy improvement over a standard BP-CNN baseline while maintaining similar execution times.
arXiv Detail & Related papers (2025-05-30T21:25:31Z)
Efficient training for large-scale optical neural network using an evolutionary strategy and attention pruning [14.20309603187239]
MZI-based block optical neural networks (BONNs) can achieve large-scale network models.<n>We propose an attention-based pruning (CAP) algorithm for large-scale BONNs.<n>Our proposed CAP algorithm show excellent potential for larger-scale network models and more complex tasks.
arXiv Detail & Related papers (2025-05-19T09:41:11Z)
Application-oriented automatic hyperparameter optimization for spiking neural network prototyping [0.0]
This document uses the Neural Network Intelligence (NNI) toolkit as reference framework to present one such solution.<n>A summary of published works employing the presented pipeline is reported as possible source of insights into application-oriented HPO experiments for SNN prototyping.
arXiv Detail & Related papers (2025-02-13T14:49:44Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Joint Pruning and Channel-wise Mixed-Precision Quantization for Efficient Deep Neural Networks [10.229120811024162]
deep neural networks (DNNs) pose significant challenges to their deployment on edge devices. Common approaches to address this issue are pruning and mixed-precision quantization. We propose a novel methodology to apply them jointly via a lightweight gradient-based search.
arXiv Detail & Related papers (2024-07-01T08:07:02Z)
Hybrid-Task Meta-Learning: A GNN Approach for Scalable and Transferable Bandwidth Allocation [50.96751567777229]
We develop a deep learning-based bandwidth allocation policy that is scalable with the number of users and transferable to different communication scenarios.<n>To support scalability, the bandwidth allocation policy is represented by a graph neural network (GNN)<n>We develop a hybrid-task meta-learning (HML) algorithm that trains the initial parameters of the GNN with different communication scenarios.
arXiv Detail & Related papers (2023-12-23T04:25:12Z)
Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.<n>Our approach is capable of encoding neural networks in a model zoo of mixed architecture.<n>We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks. The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex. In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
Hybrid Graph Models for Logic Optimization via Spatio-Temporal Information [15.850413267830522]
Two major concerns that may impede production-ready ML applications in EDA are accuracy requirements and generalization capability. We propose hybrid graph neural network (GNN) based approaches towards highly accurate quality-of-result (QoR) estimations. Evaluation on 3.3 million data points shows that the testing mean absolute percentage error (MAPE) on designs seen unseen during training are no more than 1.2% and 3.1%.
arXiv Detail & Related papers (2022-01-20T21:12:22Z)
High-Level Synthesis Performance Prediction using GNNs: Benchmarking, Modeling, and Advancing [21.8349113634555]
Agile hardware development requires fast and accurate circuit quality evaluation from early design stages. We propose a rapid and accurate performance modeling, exploiting the representation power of graph neural networks (GNNs) by representing C/C++ programs as graphs. Our proposed predictor largely outperforms HLS by up to 40X and excels existing predictors by 2X to 5X in terms of resource usage and timing prediction.
arXiv Detail & Related papers (2022-01-18T09:53:48Z)
Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames. Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks. We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
A Graph Deep Learning Framework for High-Level Synthesis Design Space Exploration [11.154086943903696]
High-Level Synthesis is a solution for fast prototyping application-specific hardware. We propose HLS, for the first time in the literature, graph neural networks that jointly predict acceleration performance and hardware costs. We show that our approach achieves prediction accuracy comparable with that of commonly used simulators.
arXiv Detail & Related papers (2021-11-29T18:17:45Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC) We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware. The proposed methodology extracts a set of models from micro- kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation. We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach. The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyper- parameters of state-of-the-art factored time delay neural networks (TDNNs) These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training. Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking. We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints. FBNetV3 makes up a family of state-of-the-art compact neural networks that outperform both automatically and manually-designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
Error-feedback stochastic modeling strategy for time series forecasting with convolutional neural networks [11.162185201961174]
We propose a novel Error-feedback Modeling (ESM) strategy to construct a random Convolutional Network (ESM-CNN) Neural time series forecasting task. The proposed ESM-CNN not only outperforms the state-of-art random neural networks, but also exhibits stronger predictive power and less computing overhead in comparison to trained state-of-art deep neural network models.
arXiv Detail & Related papers (2020-02-03T13:30:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.