Estimating Deep Learning energy consumption based on model architecture and training environment
- URL: http://arxiv.org/abs/2307.05520v5
- Date: Thu, 25 Sep 2025 08:25:25 GMT
- Title: Estimating Deep Learning energy consumption based on model architecture and training environment
- Authors: Santiago del Rey, Luís Cruz, Xavier Franch, Silverio Martínez-Fernández
- Abstract summary: We investigate how model architecture and training environment affect energy consumption. We find that selecting the right model-training environment combination can reduce training energy consumption by up to 80.68%. We propose the Stable Training Epoch Projection (STEP) and the Pre-training Regression-based Estimation (PRE) methods.
- Score: 5.465797591588829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To raise awareness of the environmental impact of deep learning (DL), many studies estimate the energy use of DL systems. However, energy estimates during DL training often rely on unverified assumptions. This work addresses that gap by investigating how model architecture and training environment affect energy consumption. We train a variety of computer vision models and collect energy consumption and accuracy metrics to analyze their trade-offs across configurations. Our results show that selecting the right model-training environment combination can reduce training energy consumption by up to 80.68% with less than 2% loss in $F_1$ score. We find a significant interaction effect between model and training environment: energy efficiency improves when GPU computational power scales with model complexity. Moreover, we demonstrate that common estimation practices, such as using FLOPs or GPU TDP, fail to capture these dynamics and can lead to substantial errors. To address these shortcomings, we propose the Stable Training Epoch Projection (STEP) and the Pre-training Regression-based Estimation (PRE) methods. Across evaluations, our methods outperform existing tools by a factor of two or more in estimation accuracy.
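To make the contrast concrete, the sketch below compares the common TDP-based practice criticized in the abstract with an epoch-projection approach in the spirit of STEP. This is an illustrative simplification, not the paper's implementation; all function names and numbers are hypothetical.

```python
# Illustrative sketch (not the paper's method): two ways to estimate total
# training energy. All numbers below are hypothetical.

def tdp_estimate(tdp_watts: float, training_hours: float) -> float:
    """Common practice: assume the GPU draws its full TDP for the whole run."""
    return tdp_watts * training_hours  # watt-hours

def epoch_projection(measured_epoch_wh: list[float], total_epochs: int) -> float:
    """Project total energy from a few measured epochs, assuming per-epoch
    energy is roughly constant once training stabilizes."""
    stable = measured_epoch_wh[1:]          # drop the warm-up epoch
    per_epoch = sum(stable) / len(stable)   # mean stable-epoch energy
    return measured_epoch_wh[0] + per_epoch * (total_epochs - 1)

naive = tdp_estimate(tdp_watts=300, training_hours=10)
projected = epoch_projection([52.0, 31.0, 30.0, 29.5], total_epochs=40)
print(f"TDP-based: {naive:.0f} Wh, epoch projection: {projected:.1f} Wh")
```

Because real GPUs rarely sustain full TDP, the naive estimate tends to overshoot, which is one of the errors the abstract attributes to TDP-based practices.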
Related papers
- The Environmental Impacts of Machine Learning Training Keep Rising Evidencing Rebound Effect [3.255583064724235]
We estimate the environmental impacts associated with training notable AI systems over the last decade. Our analysis reveals two critical trends: first, the impacts of graphics card production have increased steadily over this period. We show that the impacts of hardware must be considered over the entire life cycle, rather than the use phase alone, in order to avoid impact shifting.
arXiv Detail & Related papers (2025-10-10T05:49:37Z)
- Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining. This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning). We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
- Energy Efficiency in AI for 5G and Beyond: A DeepRx Case Study [0.276240219662896]
We evaluate the energy consumption of DeepRX, a deep learning receiver based on a fully convolutional ResNet architecture. The key contribution is the application of knowledge distillation to train a compact DeepRX student model that emulates the performance of the teacher model with reduced energy consumption. Performance is measured by comparing the Bit Error Rate (BER) versus Signal-to-Interference & Noise Ratio (SINR) curves of the distilled model and a model trained from scratch.
arXiv Detail & Related papers (2025-07-14T15:54:06Z)
- Energy Considerations for Large Pretrained Neural Networks [1.3812010983144798]
Complex neural network architectures require massive computational resources that consume substantial amounts of electricity. Previous work has primarily focused on compressing models while retaining comparable model performance. By quantifying the energy usage of both compressed and uncompressed models, we investigate compression as a means of reducing electricity consumption. We find that pruning and low-rank factorization offer no significant improvements in energy usage or related statistics, while steganographic capacity reduction provides major benefits in almost every case.
arXiv Detail & Related papers (2025-06-02T04:39:24Z) - Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations [2.2765705959685234]
This study investigates the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines.
We employ software-based power measurements to ensure ease of replication across diverse configurations, models, and datasets.
arXiv Detail & Related papers (2025-03-31T10:28:04Z) - Impact of ML Optimization Tactics on Greener Pre-Trained ML Models [46.78148962732881]
This study aims to (i) analyze image classification datasets and pre-trained models, (ii) improve inference efficiency by comparing optimized and non-optimized models, and (iii) assess the economic impact of the optimizations.
We conduct a controlled experiment to evaluate the impact of various PyTorch optimization techniques (dynamic quantization, torch.compile, local pruning, and global pruning) applied to 42 Hugging Face models for image classification.
Dynamic quantization demonstrates significant reductions in inference time and energy consumption, making it highly suitable for large-scale systems.
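The dynamic-quantization tactic highlighted above can be illustrated with a minimal sketch of the underlying idea: store weights as int8 with a per-tensor scale and dequantize on the fly. This is a pure-Python toy, not PyTorch's actual `quantize_dynamic` implementation, and the weight values are hypothetical.

```python
# Toy illustration of the int8 dynamic-quantization idea: weights are kept
# as int8 plus one float scale, and dequantized when needed for computation.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0  # map max magnitude to 127
    q = [round(w / scale) for w in weights]       # int8 values in [-127, 127]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.03, 1.0]                       # hypothetical fp32 weights
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, max_err)
```

The 4x smaller weight storage and cheaper integer arithmetic are where the reported inference-time and energy savings come from; activations remain in floating point, which is why the tactic needs no calibration data.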
arXiv Detail & Related papers (2024-09-19T16:23:03Z) - Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference [2.553456266022126]
Machine learning (ML) has seen tremendous advancements, but its environmental footprint remains a concern.
Acknowledging the growing environmental impact of ML, this paper investigates Green ML.
arXiv Detail & Related papers (2024-06-20T13:59:34Z) - Watt For What: Rethinking Deep Learning's Energy-Performance Relationship [13.505163099299025]
We study the trade-off between model accuracy and electricity consumption of deep learning models.
By evaluating accuracy per unit of electricity consumed, we demonstrate how smaller, more energy-efficient models can significantly expedite research.
This research contributes to a more equitable research landscape, where smaller entities can compete effectively with larger counterparts.
arXiv Detail & Related papers (2023-10-10T11:08:31Z) - Scaling Laws for Sparsely-Connected Foundation Models [70.41266138010657]
We explore the impact of parameter sparsity on the scaling behavior of Transformers trained on massive datasets.
We identify the first scaling law describing the relationship between weight sparsity, number of non-zero parameters, and amount of training data.
arXiv Detail & Related papers (2023-09-15T16:29:27Z) - A Comparative Study of Machine Learning Algorithms for Anomaly Detection
in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to reconcile the demand for high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z) - Robustness and Generalization Performance of Deep Learning Models on
Cyber-Physical Systems: A Comparative Study [71.84852429039881]
The investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise.
We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z) - Accuracy is not the only Metric that matters: Estimating the Energy
Consumption of Deep Learning Models [33.45069308137142]
We have created an energy estimation pipeline, which allows practitioners to estimate the energy needs of their models in advance, without actually running or training them.
We accomplished this by collecting high-quality energy data and building a first baseline model, capable of predicting the energy consumption of DL models by accumulating their estimated layer-wise energies.
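The layer-wise accumulation idea described above can be sketched as a simple lookup-and-sum: per-layer energy estimates come from previously collected measurements, and a model's total is their sum. The layer names and joule values below are hypothetical placeholders, not data from the paper.

```python
# Hedged sketch of layer-wise energy accumulation: predict a model's energy
# by summing per-layer estimates from a pre-built lookup table.

LAYER_ENERGY_J = {           # hypothetical joules per forward pass
    "conv3x3_64": 1.8,
    "conv3x3_128": 3.5,
    "dense_1000": 0.9,
}

def estimate_model_energy(layers: list[str]) -> float:
    """Accumulate estimated layer-wise energies for one forward pass."""
    return sum(LAYER_ENERGY_J[name] for name in layers)

toy_model = ["conv3x3_64", "conv3x3_64", "conv3x3_128", "dense_1000"]
total_j = estimate_model_energy(toy_model)
print(total_j)
```

The appeal of this scheme is that the lookup is built once from measured data, after which new architectures can be scored without running them.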
arXiv Detail & Related papers (2023-04-03T11:35:10Z) - Uncovering Energy-Efficient Practices in Deep Learning Training:
Preliminary Steps Towards Green AI [8.025202812165412]
We consider energy consumption a metric of equal importance to accuracy and aim to eliminate irrelevant tasks and unnecessary energy usage.
We examine the training stage of the deep learning pipeline from a sustainability perspective.
We highlight innovative and promising energy-efficient practices for training deep learning models.
arXiv Detail & Related papers (2023-03-24T12:48:21Z) - Counting Carbon: A Survey of Factors Influencing the Emissions of
Machine Learning [77.62876532784759]
Machine learning (ML) requires using energy to carry out computations during the model training process.
The generation of this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on the quantity used and the energy source.
We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z) - Energy Efficiency of Training Neural Network Architectures: An Empirical
Study [11.325530936177493]
The evaluation of Deep Learning models has traditionally focused on criteria such as accuracy, F1 score, and related measures.
The computations needed to train such models entail a large carbon footprint.
We study the relations between DL model architectures and their environmental impact in terms of energy consumed and CO$_2$ emissions produced during training.
arXiv Detail & Related papers (2023-02-02T09:20:54Z) - Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight reparameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z) - Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision [31.781943982148025]
We present the first large-scale energy consumption benchmark for efficient computer vision models.
A new metric is proposed to explicitly evaluate the full-cycle energy consumption under different model usage intensity.
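A full-cycle metric of the kind described above can be approximated as a one-off training cost plus a per-inference cost scaled by usage intensity, so which model is "greener" depends on how often it is served. The formula and all numbers below are an illustrative assumption, not the benchmark's actual metric.

```python
# Illustrative full-cycle energy accounting (hypothetical numbers):
# total = training energy + per-inference energy * number of inferences.

def full_cycle_wh(train_wh: float, infer_wh: float, n_inferences: int) -> float:
    return train_wh + infer_wh * n_inferences

# Model A: cheap to train, expensive to serve; Model B: the reverse.
model_a = full_cycle_wh(train_wh=50_000, infer_wh=0.010, n_inferences=10**6)
model_b = full_cycle_wh(train_wh=80_000, infer_wh=0.004, n_inferences=10**6)
print(model_a, model_b)
```

At higher usage intensities the ranking flips in favour of the serving-efficient model, which is exactly why the benchmark evaluates energy under different model usage intensities rather than a single fixed workload.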
arXiv Detail & Related papers (2021-08-30T18:22:36Z) - HULK: An Energy Efficiency Benchmark Platform for Responsible Natural
Language Processing [76.38975568873765]
We introduce HULK, a multi-task energy efficiency benchmarking platform for responsible natural language processing.
We compare pretrained models' energy efficiency from the perspectives of time and cost.
arXiv Detail & Related papers (2020-02-14T01:04:19Z) - NeurOpt: Neural network based optimization for building energy
management and climate control [58.06411999767069]
We propose a data-driven control algorithm based on neural networks to reduce this cost of model identification.
We validate our learning and control algorithms on a two-story building with ten independently controlled zones, located in Italy.
arXiv Detail & Related papers (2020-01-22T00:51:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.