Related papers: MAx-DNN: Multi-Level Arithmetic Approximation for Energy-Efficient DNN Hardware Accelerators

MAx-DNN: Multi-Level Arithmetic Approximation for Energy-Efficient DNN Hardware Accelerators

URL: http://arxiv.org/abs/2506.21371v1
Date: Thu, 26 Jun 2025 15:21:12 GMT
Title: MAx-DNN: Multi-Level Arithmetic Approximation for Energy-Efficient DNN Hardware Accelerators
Authors: Vasileios Leon, Georgios Makris, Sotirios Xydis, Kiamal Pekmestzi, Dimitrios Soudris,
Abstract summary: This paper examines the interplay of fine-grained error resilience of DNN workloads to achieve higher levels of energy efficiency.<n>We use the ResNet-8 model on the CIFAR-10 dataset to evaluate our approximations.<n>The proposed solution delivers up to 54% energy gains in exchange for up to 4% accuracy loss, compared to the baseline quantized model.
Score: 5.5348061557491794
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Nowadays, the rapid growth of Deep Neural Network (DNN) architectures has established them as the defacto approach for providing advanced Machine Learning tasks with excellent accuracy. Targeting low-power DNN computing, this paper examines the interplay of fine-grained error resilience of DNN workloads in collaboration with hardware approximation techniques, to achieve higher levels of energy efficiency. Utilizing the state-of-the-art ROUP approximate multipliers, we systematically explore their fine-grained distribution across the network according to our layer-, filter-, and kernel-level approaches, and examine their impact on accuracy and energy. We use the ResNet-8 model on the CIFAR-10 dataset to evaluate our approximations. The proposed solution delivers up to 54% energy gains in exchange for up to 4% accuracy loss, compared to the baseline quantized model, while it provides 2x energy gains with better accuracy versus the state-of-the-art DNN approximations.

Related papers

Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.<n>A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.<n>The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs.<n>We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections.<n>Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing. Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time. We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
LitE-SNN: Designing Lightweight and Efficient Spiking Neural Network through Spatial-Temporal Compressive Network Search and Joint Optimization [48.41286573672824]
Spiking Neural Networks (SNNs) mimic the information-processing mechanisms of the human brain and are highly energy-efficient. We propose a new approach named LitE-SNN that incorporates both spatial and temporal compression into the automated network design process.
arXiv Detail & Related papers (2024-01-26T05:23:11Z)
Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization. Our framework achieves $39%$ average energy reduction for datasets $1.7%$ average accuracy loss and outperforms significantly the state-of-the-art approaches.
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
Energy-Efficient On-Board Radio Resource Management for Satellite Communications via Neuromorphic Computing [59.40731173370976]
We investigate the application of energy-efficient brain-inspired machine learning models for on-board radio resource management. For relevant workloads, spiking neural networks (SNNs) implemented on Loihi 2 yield higher accuracy, while reducing power consumption by more than 100$times$ as compared to the CNN-based reference platform.
arXiv Detail & Related papers (2023-08-22T03:13:57Z)
The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks [0.368986335765876]
quantization and pruning of parameters can both compress the model size, reduce memory footprints, and facilitate low-latency execution. We study various combinations of pruning and quantization in isolation, cumulatively, and simultaneously to a state-of-the-art SNN targeting gesture recognition. We show that this state-of-the-art model is amenable to aggressive parameter quantization, not suffering from any loss in accuracy down to ternary weights.
arXiv Detail & Related papers (2023-02-08T16:25:20Z)
Fast Exploration of the Impact of Precision Reduction on Spiking Neural Networks [63.614519238823206]
Spiking Neural Networks (SNNs) are a practical choice when the target hardware reaches the edge of computing. We employ an Interval Arithmetic (IA) model to develop an exploration methodology that takes advantage of the capability of such a model to propagate the approximation error.
arXiv Detail & Related papers (2022-11-22T15:08:05Z)
Energy-efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration [1.0323063834827415]
We present an automated framework for weight-to-approximation mapping for approximate Deep Neural Networks (DNNs) At the MAC unit level, our evaluation surpassed already energy-efficient mappings by more than $times2$ in terms of energy gains.
arXiv Detail & Related papers (2022-07-25T17:07:00Z)
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey [4.856755747052137]
Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive tasks in Machine Learning (ML) Recent advancements in DNNs have brought beyond human accuracy in many tasks, but at the cost of high computational complexity. This article provides a comprehensive survey and analysis of hardware approximation techniques for DNN accelerators.
arXiv Detail & Related papers (2022-03-16T16:33:13Z)
Enhanced physics-constrained deep neural networks for modeling vanadium redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions. The ePCDNN can accurately capture the voltage response throughout the charge--discharge cycle, including the tail region of the voltage discharge curve.
arXiv Detail & Related papers (2022-03-03T19:56:24Z)
Positive/Negative Approximate Multipliers for DNN Accelerators [3.1921317895626493]
We present a filter-oriented approximation method to map the weights to the appropriate modes of the approximate multiplier. Our approach achieves 18.33% energy gains on average across 7 NNs on 4 different datasets for a maximum accuracy drop of only 1%.
arXiv Detail & Related papers (2021-07-20T09:36:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.