Minimizing Energy Consumption of Deep Learning Models by Energy-Aware
Training
- URL: http://arxiv.org/abs/2307.00368v1
- Date: Sat, 1 Jul 2023 15:44:01 GMT
- Title: Minimizing Energy Consumption of Deep Learning Models by Energy-Aware
Training
- Authors: Dario Lazzaro, Antonio Emanuele Cin\`a, Maura Pintor, Ambra Demontis,
Battista Biggio, Fabio Roli, Marcello Pelillo
- Abstract summary: We propose EAT, a gradient-based algorithm that aims to reduce energy consumption during model training.
We demonstrate that our energy-aware training algorithm EAT is able to train networks with a better trade-off between classification performance and energy efficiency.
- Score: 26.438415753870917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models undergo a significant increase in the number of
parameters they possess, leading to the execution of a larger number of
operations during inference. This expansion significantly contributes to higher
energy consumption and prediction latency. In this work, we propose EAT, a
gradient-based algorithm that aims to reduce energy consumption during model
training. To this end, we leverage a differentiable approximation of the
$\ell_0$ norm, and use it as a sparse penalty over the training loss. Through
our experimental analysis conducted on three datasets and two deep neural
networks, we demonstrate that our energy-aware training algorithm EAT is able
to train networks with a better trade-off between classification performance
and energy efficiency.
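The mechanism described in the abstract, a differentiable surrogate of the $\ell_0$ norm added as a penalty term on top of the standard training loss, can be illustrated with a short PyTorch sketch. The surrogate $\hat{\ell}_0(z) = \sum_i z_i^2 / (z_i^2 + \sigma)$, its application to hidden activations, and the penalty weight $\lambda$ are assumptions made here for illustration; the exact EAT formulation, penalty target, and hyperparameters are those reported in the paper.

```python
# Hedged sketch of energy-aware training with a differentiable l0-style penalty.
# The surrogate l0_hat(z) = sum_i z_i^2 / (z_i^2 + sigma), its application to ReLU
# activations, and the weight lam are illustrative assumptions, not the exact EAT recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

def l0_surrogate(z: torch.Tensor, sigma: float = 1e-4) -> torch.Tensor:
    """Differentiable approximation of the l0 norm (count of non-zero entries)."""
    z2 = z.pow(2)
    return (z2 / (z2 + sigma)).sum()

class SmallNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, x):
        h = F.relu(self.fc1(x))   # sparse activations mean fewer effective operations
        return self.fc2(h), h     # return activations so the penalty can see them

def energy_aware_step(model, optimizer, x, y, lam: float = 1e-3):
    """One training step: cross-entropy plus the normalized sparsity penalty."""
    optimizer.zero_grad()
    logits, acts = model(x)
    loss = F.cross_entropy(logits, y) + lam * l0_surrogate(acts) / acts.numel()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = SmallNet()
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(32, 784)              # toy batch standing in for a real dataset
    y = torch.randint(0, 10, (32,))
    print(energy_aware_step(model, opt, x, y))
```

Increasing the penalty weight pushes more activations toward zero at some cost in accuracy, which is the classification-performance/energy-efficiency trade-off the abstract refers to.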
Related papers
- Revisiting DNN Training for Intermittently Powered Energy Harvesting Micro Computers [0.6721767679705013]
This study introduces and evaluates a novel training methodology tailored for Deep Neural Networks in energy-constrained environments.
We propose a dynamic dropout technique that adapts to both the architecture of the device and the variability in energy availability.
Preliminary results demonstrate that this strategy provides 6 to 22 percent accuracy improvements compared to the state of the art with less than 5 percent additional compute.
arXiv Detail & Related papers (2024-08-25T01:13:00Z)
- Towards Physical Plausibility in Neuroevolution Systems [0.276240219662896]
The increasing use of Artificial Intelligence (AI) models, especially Deep Neural Networks (DNNs), raises power consumption during both training and inference.
This work addresses the growing energy consumption problem in Machine Learning (ML).
Even a slight reduction in power usage can lead to significant energy savings, benefiting users, companies, and the environment.
arXiv Detail & Related papers (2024-01-31T10:54:34Z)
- Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI [8.025202812165412]
We treat energy consumption as a metric of equal importance to accuracy and aim to eliminate irrelevant tasks and unnecessary energy usage.
We examine the training stage of the deep learning pipeline from a sustainability perspective.
We highlight innovative and promising energy-efficient practices for training deep learning models.
arXiv Detail & Related papers (2023-03-24T12:48:21Z)
- Energy Efficiency of Training Neural Network Architectures: An Empirical Study [11.325530936177493]
The evaluation of Deep Learning models has traditionally focused on criteria such as accuracy, F1 score, and related measures.
The computations needed to train such models entail a large carbon footprint.
We study the relations between DL model architectures and their environmental impact in terms of energy consumed and CO$_2$ emissions produced during training.
arXiv Detail & Related papers (2023-02-02T09:20:54Z)
- Energy-based Latent Aligner for Incremental Learning [83.0135278697976]
Deep learning models tend to forget their earlier knowledge while incrementally learning new tasks.
This behavior emerges because the parameter updates optimized for the new tasks may not align well with the updates suitable for older tasks.
We propose ELI: Energy-based Latent Aligner for Incremental Learning.
arXiv Detail & Related papers (2022-03-28T17:57:25Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight reparameterisation for neural networks that leads to inherently sparse models (a brief sketch of this idea appears after this list).
Models trained in this manner exhibit similar performance, but their weight distribution has markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Compute and Energy Consumption Trends in Deep Learning Inference [67.32875669386488]
We study relevant models in the areas of computer vision and natural language processing.
For a sustained increase in performance we see a much softer growth in energy consumption than previously anticipated.
arXiv Detail & Related papers (2021-09-12T09:40:18Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks to fit mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- Compute, Time and Energy Characterization of Encoder-Decoder Networks with Automatic Mixed Precision Training [6.761235154230549]
We show that it is possible to achieve a significant improvement in training time by leveraging mixed-precision training without sacrificing model performance.
We find that a 1549% increase in the number of trainable parameters for a network comes at a relatively smaller 63.22% increase in energy usage for a UNet with 4 encoding layers.
arXiv Detail & Related papers (2020-08-18T17:44:24Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
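As referenced in the Powerpropagation entry above, the following minimal sketch illustrates a power reparameterisation of the weights: $w = \mathrm{sign}(\theta)\,|\theta|^{\alpha}$ with $\alpha > 1$, so gradients with respect to $\theta$ are scaled by the current weight magnitude and the weight distribution concentrates mass at zero. The module name, the choice $\alpha = 2$, and the initialisation are assumptions for illustration rather than details taken from the cited paper.

```python
# Hedged sketch of a Powerpropagation-style weight reparameterisation:
# w = sign(theta) * |theta|**alpha with alpha > 1, so smaller weights receive
# proportionally smaller updates and the weight distribution densifies at zero.
# Names, alpha = 2.0, and the init scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PowerpropLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, alpha: float = 2.0):
        super().__init__()
        self.alpha = alpha
        self.theta = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.kaiming_uniform_(self.theta, a=5 ** 0.5)

    def effective_weight(self) -> torch.Tensor:
        # Reparameterised weight actually used in the forward pass.
        return torch.sign(self.theta) * self.theta.abs().pow(self.alpha)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.effective_weight(), self.bias)

if __name__ == "__main__":
    layer = PowerpropLinear(784, 10)
    x = torch.randn(8, 784)
    out = layer(x)
    out.sum().backward()   # gradients w.r.t. theta scale with |theta|**(alpha - 1)
    print(out.shape, layer.theta.grad.abs().mean().item())
```

After training, small-magnitude entries of the effective weight can be pruned with little accuracy loss, which is the property the entry above highlights.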