Do DL models and training environments have an impact on energy
consumption?
- URL: http://arxiv.org/abs/2307.05520v3
- Date: Wed, 3 Jan 2024 15:20:31 GMT
- Title: Do DL models and training environments have an impact on energy
consumption?
- Authors: Santiago del Rey, Silverio Martínez-Fernández, Luís Cruz, Xavier
Franch
- Abstract summary: This study aims to analyze the impact of the model architecture and training environment when training greener computer vision models.
We show that selecting the proper model architecture and training environment can reduce energy consumption dramatically.
- Score: 9.373015995433589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current research in the computer vision field mainly focuses on improving
Deep Learning (DL) correctness and inference time performance. However, there
is still little work on the huge carbon footprint of training DL models.
This study aims to analyze the impact of the model architecture and training
environment when training greener computer vision models. We divide this goal
into two research questions. First, we analyze the effects of model
architecture on achieving greener models while keeping correctness at optimal
levels. Second, we study the influence of the training environment on producing
greener models. To investigate these relationships, we collect multiple metrics
related to energy efficiency and model correctness during the models' training.
Then, we outline the trade-offs between the measured energy efficiency and the
models' correctness across model architectures, and their relationship with
the training environment. We conduct this research in the context of a computer
vision system for image classification. In conclusion, we show that selecting
the proper model architecture and training environment can reduce energy
consumption dramatically (up to 81.38%) at the cost of negligible decreases in
correctness. Also, we find evidence that GPUs should scale with the models'
computational complexity for better energy efficiency.
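As an illustration of the kind of measurement the abstract describes, the sketch below polls instantaneous GPU power through NVML and integrates it over time to estimate training energy. This is a minimal sketch, not the paper's published setup: pynvml is a real NVIDIA binding, but the workload stand-in and sampling interval are placeholders.

```python
import time
import threading

import pynvml  # NVIDIA Management Library bindings

def sample_gpu_power(stop_event, readings, interval_s=0.1, gpu_index=0):
    """Poll instantaneous GPU power draw (watts) until stop_event is set."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    while not stop_event.is_set():
        watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
        readings.append((time.time(), watts))
        time.sleep(interval_s)
    pynvml.nvmlShutdown()

stop = threading.Event()
samples = []
sampler = threading.Thread(target=sample_gpu_power, args=(stop, samples))
sampler.start()

time.sleep(10)  # stand-in for the training loop under measurement

stop.set()
sampler.join()

# Approximate energy as the time integral of sampled power.
energy_joules = sum(
    w1 * (t2 - t1) for (t1, w1), (t2, _) in zip(samples, samples[1:])
)
print(f"Estimated GPU energy: {energy_joules / 3600.0:.2f} Wh")
```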
Related papers
- Impact of ML Optimization Tactics on Greener Pre-Trained ML Models [46.78148962732881]
This study aims to (i) analyze image classification datasets and pre-trained models, (ii) improve inference efficiency by comparing optimized and non-optimized models, and (iii) assess the economic impact of the optimizations.
We conduct a controlled experiment to evaluate the impact of various PyTorch optimization techniques (dynamic quantization, torch.compile, local pruning, and global pruning) on 42 Hugging Face models for image classification.
Dynamic quantization demonstrates significant reductions in inference time and energy consumption, making it highly suitable for large-scale systems.
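A minimal sketch of the dynamic quantization tactic named above, using PyTorch's built-in quantize_dynamic API; the choice of resnet18 is illustrative, not taken from the paper.

```python
import torch
from torchvision import models

model = models.resnet18(weights=None)  # illustrative model choice
model.eval()

# Convert matching layer types to int8 weights; activations are
# quantized dynamically at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},  # layer types to convert
    dtype=torch.qint8,
)

# Inference proceeds as usual on the quantized model.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = quantized(x)
print(out.shape)
```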
arXiv Detail & Related papers (2024-09-19T16:23:03Z)
- Green AI: Exploring Carbon Footprints, Mitigation Strategies, and Trade Offs in Large Language Model Training [9.182429523979598]
We evaluate the CO2 emissions of well-known large language models, which have an especially high carbon footprint due to their large number of model parameters.
We argue for the training of LLMs in a way that is responsible and sustainable by suggesting measures for reducing carbon emissions.
arXiv Detail & Related papers (2024-04-01T15:01:45Z)
- The Case for Co-Designing Model Architectures with Hardware [13.022505733049597]
We provide a set of guidelines for users to maximize the runtime performance of their transformer models.
We find the throughput of models with efficient model shapes is up to 39% higher.
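One widely cited shape guideline in this vein is to keep layer dimensions at multiples of the GPU's matrix-tile size (64 or 128 are common choices). The helper below is an illustrative sketch of that rule of thumb, not the paper's published procedure.

```python
def round_up_to_multiple(dim: int, multiple: int = 64) -> int:
    """Round a layer dimension up to the nearest hardware-friendly multiple."""
    return ((dim + multiple - 1) // multiple) * multiple

# e.g. a hidden size of 1000 becomes 1024, and the feed-forward width
# derived from it stays aligned as well.
hidden = round_up_to_multiple(1000)     # -> 1024
ffn = round_up_to_multiple(4 * hidden)  # -> 4096
print(hidden, ffn)
```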
arXiv Detail & Related papers (2024-01-25T19:50:31Z)
- Asymmetric Masked Distillation for Pre-Training Small Foundation Models [52.56257450614992]
Self-supervised foundation models have shown great potential in computer vision thanks to the pre-training paradigm of masked autoencoding.
This paper focuses on pre-training relatively small vision transformer models that could be efficiently adapted to downstream tasks.
We propose a new asymmetric masked distillation (AMD) framework for pre-training relatively small models with autoencoding.
arXiv Detail & Related papers (2023-11-06T14:44:34Z)
- A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to address the demands of high-performance machine learning models with environmental sustainability.
Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance.
However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
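A toy sketch of that trade-off using scikit-learn: a default Random Forest against a heavier tuned configuration, with wall-clock fit time as a crude proxy for resource consumption. The synthetic dataset and hyperparameters are illustrative, not the study's setup.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for an imbalanced industrial anomaly-detection dataset.
X, y = make_classification(n_samples=20_000, n_features=30, weights=[0.95],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [
    ("default", RandomForestClassifier(random_state=0)),
    ("tuned", RandomForestClassifier(n_estimators=500, min_samples_leaf=1,
                                     random_state=0)),
]:
    start = time.perf_counter()
    clf.fit(X_tr, y_tr)
    elapsed = time.perf_counter() - start
    print(f"{name}: accuracy={clf.score(X_te, y_te):.3f}, "
          f"fit time={elapsed:.1f}s")
```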
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
- Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z)
- Uncovering Energy-Efficient Practices in Deep Learning Training: Preliminary Steps Towards Green AI [8.025202812165412]
We consider energy consumption a metric of equal importance to accuracy and aim to reduce any irrelevant tasks or energy usage.
We examine the training stage of the deep learning pipeline from a sustainability perspective.
We highlight innovative and promising energy-efficient practices for training deep learning models.
arXiv Detail & Related papers (2023-03-24T12:48:21Z)
- Energy Efficiency of Training Neural Network Architectures: An Empirical Study [11.325530936177493]
The evaluation of Deep Learning models has traditionally focused on criteria such as accuracy, F1 score, and related measures.
The computations needed to train such models entail a large carbon footprint.
We study the relationships between DL model architectures and their environmental impact in terms of energy consumed and CO$_2$ emissions produced during training.
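A minimal sketch of collecting such metrics with the codecarbon package, which estimates energy use and CO$_2$-equivalent emissions around a training run; the project name and the stand-in workload are placeholders, not the study's code.

```python
import time

from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="architecture-energy-study")
tracker.start()
try:
    time.sleep(5)  # stand-in for training the architecture under test
finally:
    emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2-eq")
```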
arXiv Detail & Related papers (2023-02-02T09:20:54Z)
- Knowledge distillation: A good teacher is patient and consistent [71.14922743774864]
There is a growing discrepancy in computer vision between large-scale models that achieve state-of-the-art performance and models that are affordable in practical applications.
We identify certain implicit design choices, which may drastically affect the effectiveness of distillation.
We obtain a state-of-the-art ResNet-50 model for ImageNet, which achieves 82.8% top-1 accuracy.
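The core objective behind this line of work is for the student to match the teacher's softened output distribution. Below is a minimal sketch of that distillation loss; the temperature value and the usage pattern are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradients keep comparable magnitude across temperatures.
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2

# Usage: the same (augmented) batch is fed to both networks, and the
# teacher's outputs are detached so only the student receives gradients.
# loss = distillation_loss(student(images), teacher(images).detach())
```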
arXiv Detail & Related papers (2021-06-09T17:20:40Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.