Related papers: ESM: A Framework for Building Effective Surrogate Models for Hardware-Aware Neural Architecture Search

ESM: A Framework for Building Effective Surrogate Models for Hardware-Aware Neural Architecture Search

URL: http://arxiv.org/abs/2508.01505v1
Date: Sat, 02 Aug 2025 22:06:39 GMT
Title: ESM: A Framework for Building Effective Surrogate Models for Hardware-Aware Neural Architecture Search
Authors: Azaz-Ur-Rehman Nasir, Samroz Ahmad Shoaib, Muhammad Abdullah Hanif, Muhammad Shafique,
Abstract summary: Hardware-aware Neural Architecture Search (NAS) is one of the most promising techniques for designing efficient Deep Neural Networks (DNNs) for resource-constrained devices.<n>We study different types of surrogate models and highlight their strengths and weaknesses.<n>We present a holistic framework that enables reliable dataset generation and efficient model generation, considering the overall costs of different stages of the model generation pipeline.
Score: 4.9276746621153285
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hardware-aware Neural Architecture Search (NAS) is one of the most promising techniques for designing efficient Deep Neural Networks (DNNs) for resource-constrained devices. Surrogate models play a crucial role in hardware-aware NAS as they enable efficient prediction of performance characteristics (e.g., inference latency and energy consumption) of different candidate models on the target hardware device. In this paper, we focus on building hardware-aware latency prediction models. We study different types of surrogate models and highlight their strengths and weaknesses. We perform a systematic analysis to understand the impact of different factors that can influence the prediction accuracy of these models, aiming to assess the importance of each stage involved in the model designing process and identify methods and policies necessary for designing/training an effective estimation model, specifically for GPU-powered devices. Based on the insights gained from the analysis, we present a holistic framework that enables reliable dataset generation and efficient model generation, considering the overall costs of different stages of the model generation pipeline.

Related papers

Machine-Learning-Assisted Photonic Device Development: A Multiscale Approach from Theory to Characterization [80.82828320306464]
Photonic device development (PDD) has achieved remarkable success in designing and implementing new devices for controlling light across various wavelengths, scales, and applications.<n>PDD is an iterative, five-step process that consists of: i.e. deriving device behavior from design parameters, ii. simulating device performance, iv. fabricating the optimal device, and v. measuring device performance.<n>PDD suffers from large optimization landscapes, uncertainties in structural or optical characterization, and difficulties in implementing robust fabrication processes.<n>In this review, we present a comprehensive perspective on these methods to enable machine-learning-assisted PDD
arXiv Detail & Related papers (2025-06-24T23:32:54Z)
A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation.<n> deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency.<n>This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation [2.3636539018632616]
This work empirically investigates the optimization of complex deep learning models to analyze their functionality on an embedded device. It evaluates the effectiveness of the optimized models in terms of their inference speed for image classification and video action detection.
arXiv Detail & Related papers (2024-06-25T17:34:52Z)
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity. Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
Enhanced LFTSformer: A Novel Long-Term Financial Time Series Prediction Model Using Advanced Feature Engineering and the DS Encoder Informer Architecture [0.8532753451809455]
This study presents a groundbreaking model for forecasting long-term financial time series, termed the Enhanced LFTSformer. The model distinguishes itself through several significant innovations. Systematic experimentation on a range of benchmark stock market datasets demonstrates that the Enhanced LFTSformer outperforms traditional machine learning models.
arXiv Detail & Related papers (2023-10-03T08:37:21Z)
Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
Investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise. We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
Statistical Hardware Design With Multi-model Active Learning [1.7596501992526474]
We propose a model-based active learning approach to solve the problem of designing efficient hardware. Our proposed method provides hardware models that are sufficiently accurate to perform design space exploration as well as performance prediction simultaneously.
arXiv Detail & Related papers (2023-03-14T16:37:38Z)
An Intelligent End-to-End Neural Architecture Search Framework for Electricity Forecasting Model Development [4.940941112226529]
We propose an intelligent automated architecture search (IAAS) framework for the development of time-series electricity forecasting models. The proposed framework contains three primary components, i.e., network function-preserving transformation operation, reinforcement learning (RL)-based network transformation control, and network screening. We demonstrate that the proposed IAAS framework significantly outperforms the ten existing models or methods in terms of forecasting accuracy and stability.
arXiv Detail & Related papers (2022-03-25T10:36:27Z)
Balancing Accuracy and Latency in Multipath Neural Networks [0.09668407688201358]
We use a one-shot neural architecture search model to implicitly evaluate the performance of an intractable number of neural networks. We show that our method can accurately model the relative performance between models with different latencies and predict the performance of unseen models with good precision across different datasets.
arXiv Detail & Related papers (2021-04-25T00:05:48Z)
Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models. We propose a VAE architecture in which a part of the latent space is grounded by physics. We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.