Related papers: Multi-objective Optimization in CPU Design Space Exploration: Attention is All You Need

URL: http://arxiv.org/abs/2410.18368v2
Date: Thu, 14 Aug 2025 03:32:45 GMT
Title: Multi-objective Optimization in CPU Design Space Exploration: Attention is All You Need
Authors: Runzhen Xue, Hao Wu, Mingyu Yan, Ziheng Xiao, Guangyu Sun, Xiaochun Ye, Dongrui Fan
Abstract summary: Design Space Exploration (DSE) is essential to modern CPU design, yet current frameworks struggle to scale and generalize in high-dimensional architectural spaces. We present AttentionDSE, the first end-to-end DSE framework that integrates performance prediction and design guidance through an attention-based neural architecture. Key innovations include a Perception-Driven Attention mechanism that exploits architectural hierarchy and locality, scaling attention complexity from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$ via sliding windows.
Score: 11.931035726174906
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Design Space Exploration (DSE) is essential to modern CPU design, yet current frameworks struggle to scale and generalize in high-dimensional architectural spaces. As the dimensionality of design spaces continues to grow, existing DSE frameworks face three fundamental challenges: (1) reduced accuracy and poor scalability of surrogate models in large design spaces; (2) inefficient acquisition guided by hand-crafted heuristics or exhaustive search; (3) limited interpretability, making it hard to pinpoint architectural bottlenecks. In this work, we present \textbf{AttentionDSE}, the first end-to-end DSE framework that \emph{natively integrates} performance prediction and design guidance through an attention-based neural architecture. Unlike traditional DSE workflows that separate surrogate modeling from acquisition and rely heavily on hand-crafted heuristics, AttentionDSE establishes a unified, learning-driven optimization loop, in which attention weights serve a dual role: enabling accurate performance estimation and simultaneously exposing the performance bottleneck. This paradigm shift elevates attention from a passive representation mechanism to an active, interpretable driver of design decision-making. Key innovations include: (1) a \textbf{Perception-Driven Attention} mechanism that exploits architectural hierarchy and locality, scaling attention complexity from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$ via sliding windows; (2) an \textbf{Attention-aware Bottleneck Analysis} that automatically surfaces critical parameters for targeted optimization, eliminating the need for domain-specific heuristics. Evaluated on high-dimensional CPU design space using the SPEC CPU2017 benchmark suite, AttentionDSE achieves up to \textbf{3.9\% higher Pareto Hypervolume} and over \textbf{80\% reduction in exploration time} compared to state-of-the-art baselines.
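
The abstract's complexity claim (sliding windows reducing attention from $\mathcal{O}(n^2)$ to $\mathcal{O}(n)$) can be illustrated with a minimal sketch. This is a generic fixed-window attention in pure Python, not the paper's Perception-Driven Attention; the function name `sliding_window_attention` and the `window` parameter are assumptions made for illustration. Each position attends only to neighbours within `window` steps, so the work per position is constant for a fixed window and the total grows linearly with the number of parameters `n`.

```python
import math

def sliding_window_attention(q, k, v, window):
    """Generic sliding-window attention (illustrative, not the paper's method).

    q, k, v: lists of n vectors (lists of floats).
    Position i attends only to positions j with |i - j| <= window, so the
    cost is O(n * window * d) instead of O(n^2 * d) for full attention.
    Returns (outputs, weights); weights[i] maps attended index j -> probability.
    """
    n, d = len(q), len(q[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        idx = range(lo, hi)
        # Scaled dot-product scores restricted to the local window.
        scores = [scale * sum(a * b for a, b in zip(q[i], k[j])) for j in idx]
        # Numerically stable softmax over the window only.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        probs = [e / z for e in exps]
        # Weighted sum of the value vectors inside the window.
        outputs.append(
            [sum(p * v[j][c] for p, j in zip(probs, idx)) for c in range(len(v[0]))]
        )
        weights.append(dict(zip(idx, probs)))
    return outputs, weights
```

The returned per-position weights are the kind of quantity one could inspect to see which neighbouring parameters a prediction relies on; how AttentionDSE actually turns attention weights into bottleneck analysis is not specified in this listing.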

Related papers

SSR: Pushing the Limit of Spatial Intelligence with Structured Scene Reasoning [30.87517633729756]
SSR is a framework designed for Structured Scene Reasoning. It seamlessly integrates 2D and 3D representations via a lightweight alignment mechanism. It achieves state-of-the-art performance on multiple spatial intelligence benchmarks.
arXiv Detail & Related papers (2026-02-28T02:05:35Z)
A General Neural Backbone for Mixed-Integer Linear Optimization via Dual Attention [33.27281529953169]
Mixed-integer linear programming (MILP) is a widely used modeling framework for optimization. Recent advances in deep learning address this challenge by representing MILP instances as variable-constraint bipartite graphs. We present an attention-driven neural architecture that learns expressive representations beyond the pure graph view.
arXiv Detail & Related papers (2026-01-08T02:23:47Z)
The Curious Case of In-Training Compression of State Space Models [49.819321766705514]
State Space Models (SSMs) tackle long sequence modeling tasks efficiently, offering both parallelizable training and fast inference. A key design challenge is striking the right balance between maximizing expressivity and limiting computational burden. Our approach, CompreSSM, applies to Linear Time-Invariant SSMs such as Linear Recurrent Units, but is also extendable to selective models.
arXiv Detail & Related papers (2025-10-03T09:02:33Z)
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention [54.15345846343084]
We propose Ultra3D, an efficient 3D generation framework that significantly accelerates sparse voxel modeling without compromising quality. Part Attention is a geometry-aware localized attention mechanism that restricts attention computation within semantically consistent part regions. Experiments demonstrate that Ultra3D supports high-resolution 3D generation at 1024 resolution and achieves state-of-the-art performance in both visual fidelity and user preference.
arXiv Detail & Related papers (2025-07-23T17:57:16Z)
Optimization of DNN-based HSI Segmentation FPGA-based SoC for ADS: A Practical Approach [1.474723404975345]
This work presents a set of optimization techniques for the practical co-design of a DNN-based HSI segmentation processor deployed on an FPGA-based SoC. The applied compression techniques significantly reduce the complexity of the designed DNN to 24.34% of the original operations and to 1.02% of the original number of parameters, achieving a 2.86x speed-up in the inference task without noticeable degradation of the segmentation accuracy.
arXiv Detail & Related papers (2025-07-22T13:09:04Z)
Offline Model-Based Optimization: Comprehensive Review [61.91350077539443]
Offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets. Recent advances in model-based optimization have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models. Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review.
arXiv Detail & Related papers (2025-03-21T16:35:02Z)
Advanced Chain-of-Thought Reasoning for Parameter Extraction from Documents Using Large Language Models [3.7324910012003656]
Current methods struggle to handle high-dimensional design data and meet the demands of real-time processing. We propose an innovative framework that automates the extraction of parameters and the generation of PySpice models. Experimental results show that applying all three methods together improves retrieval precision by 47.69% and reduces processing latency by 37.84%.
arXiv Detail & Related papers (2025-02-23T11:19:44Z)
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention [32.48360534726024]
We present NSA, a Natively trainable Sparse Attention mechanism that integrates algorithmic innovations with hardware-aligned optimizations. NSA employs a dynamic hierarchical sparse strategy, combining coarse-grained token compression with fine-grained token selection to preserve both global context awareness and local precision.
arXiv Detail & Related papers (2025-02-16T11:53:44Z)
AIRCHITECT v2: Learning the Hardware Accelerator Design Space through Unified Representations [3.6231171463908938]
Design space exploration plays a crucial role in enabling custom hardware architectures. Recently, AIrchitect v1, the first attempt to address the limitations of DSE into a search-time classification problem.
arXiv Detail & Related papers (2025-01-17T04:57:42Z)
A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation. Deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency. This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Efficient High-Resolution Visual Representation Learning with State Space Model for Human Pose Estimation [60.80423207808076]
Capturing long-range dependencies while preserving high-resolution visual representations is crucial for dense prediction tasks such as human pose estimation. We propose the Dynamic Visual State Space (DVSS) block, which augments visual state space models with multi-scale convolutional operations. We build HRVMamba, a novel model for efficient high-resolution representation learning.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning [65.31677646659895]
This paper focuses on the concept of task-specific directions (TSDs), which are critical for transitioning large models from pretrained states to task-specific enhancements in PEFT. We introduce a novel approach, LoRA-Dash, which aims to maximize the impact of TSDs during the fine-tuning process, thereby enhancing model performance on targeted tasks.
arXiv Detail & Related papers (2024-09-02T08:10:51Z)
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles. Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query. Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z)
Large Language Model Agent as a Mechanical Designer [7.136205674624813]
We propose a framework that leverages a pretrained Large Language Model (LLM) in conjunction with an FEM module to autonomously generate, evaluate, and refine structural designs. The LLM operates without domain-specific fine-tuning, using general reasoning to propose design candidates, interpret FEM-derived performance metrics, and apply structurally sound modifications. Compared to the Non-dominated Sorting Genetic Algorithm II (NSGA-II), our method achieves faster convergence and fewer FEM evaluations.
arXiv Detail & Related papers (2024-04-26T16:41:24Z)
Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization. We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons. Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z)
OTOv3: Automatic Architecture-Agnostic Neural Network Training and Compression from Structured Pruning to Erasing Operators [57.145175475579315]
This topic spans various techniques, from structured pruning to neural architecture search, encompassing both pruning and erasing operators perspectives. We introduce the third-generation Only-Train-Once (OTOv3), which first automatically trains and compresses a general DNN through pruning and erasing operations. Our empirical results demonstrate the efficacy of OTOv3 across various benchmarks in structured pruning and neural architecture search.
arXiv Detail & Related papers (2023-12-15T00:22:55Z)
A Data-driven Recommendation Framework for Optimal Walker Designs [0.0]
This paper focuses on leveraging statistical modeling and machine learning to optimize a medical walker. To achieve the desirable qualities of a walker, we train a predictive machine-learning model to identify trade-offs between performance objectives. This paper presents potential walker designs that demonstrate up to a 30% mass reduction while increasing structural stability and integrity.
arXiv Detail & Related papers (2023-10-28T18:04:38Z)
Fairer and More Accurate Tabular Models Through NAS [14.147928131445852]
We propose using multi-objective Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) in the first application to the very challenging domain of tabular data. We show that models optimized solely for accuracy with NAS often fail to inherently address fairness concerns. We produce architectures that consistently dominate state-of-the-art bias mitigation methods either in fairness, accuracy or both.
arXiv Detail & Related papers (2023-10-18T17:56:24Z)
Multi-task Learning with 3D-Aware Regularization [55.97507478913053]
We propose a structured 3D-aware regularizer which interfaces multiple tasks through the projection of features extracted from an image encoder to a shared 3D feature space. We show that the proposed method is architecture agnostic and can be plugged into various prior multi-task backbones to improve their performance.
arXiv Detail & Related papers (2023-10-02T08:49:56Z)
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive. We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation. Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z)
Multi-Agent Reinforcement Learning for Microprocessor Design Space Exploration [71.95914457415624]
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency. We propose an alternative formulation that leverages Multi-Agent RL (MARL) to tackle this problem. Our evaluation shows that the MARL formulation consistently outperforms single-agent RL baselines.
arXiv Detail & Related papers (2022-11-29T17:10:24Z)
Targeted Adaptive Design [0.0]
Modern manufacturing and advanced materials design often require searches of relatively high-dimensional process control parameter spaces. We describe targeted adaptive design (TAD), a new algorithm that performs this sampling task efficiently. TAD embodies the exploration-exploitation tension in a manner that recalls, but is essentially different from, Bayesian optimization and optimal experimental design.
arXiv Detail & Related papers (2022-05-27T19:29:24Z)
Consolidated learning -- a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV [4.370097023410272]
This paper proposes a new formulation of the tuning problem, called consolidated learning. In such settings, we are interested in the total optimization time rather than tuning for a single task. We demonstrate the effectiveness of this approach through an empirical study of the XGBoost algorithm and a collection of predictive tasks extracted from the MIMIC-IV medical database.
arXiv Detail & Related papers (2022-01-27T21:38:53Z)
Twins: Revisiting Spatial Attention Design in Vision Transformers [81.02454258677714]
In this work, we demonstrate that a carefully-devised yet simple spatial attention mechanism performs favourably against the state-of-the-art schemes. We propose two vision transformer architectures, namely, Twins-PCPVT and Twins-SVT. Our proposed architectures are highly-efficient and easy to implement, only involving matrix multiplications that are highly optimized in modern deep learning frameworks.
arXiv Detail & Related papers (2021-04-28T15:42:31Z)
MO-PaDGAN: Reparameterizing Engineering Designs for Augmented Multi-objective Optimization [13.866787416457454]
Multi-objective optimization is key to solving many engineering design problems. Deep generative models can learn compact design representations. MO-PaDGAN adds a Determinantal Point Process-based loss function to the generative adversarial network.
arXiv Detail & Related papers (2020-09-15T13:58:31Z)
