Great Power, Great Responsibility: Recommendations for Reducing Energy
for Training Language Models
- URL: http://arxiv.org/abs/2205.09646v1
- Date: Thu, 19 May 2022 16:03:55 GMT
- Title: Great Power, Great Responsibility: Recommendations for Reducing Energy
for Training Language Models
- Authors: Joseph McDonald, Baolin Li, Nathan Frey, Devesh Tiwari, Vijay
Gadepally, Siddharth Samsi
- Abstract summary: We investigate techniques that can be used to reduce the energy consumption of common NLP applications.
These techniques can lead to a significant reduction in energy consumption when training language models or using them for inference.
- Score: 8.927248087602942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The energy requirements of current natural language processing models
continue to grow at a rapid, unsustainable pace. Recent works highlighting this
problem conclude there is an urgent need for methods that reduce the energy
needs of NLP and machine learning more broadly. In this article, we investigate
techniques that can be used to reduce the energy consumption of common NLP
applications. In particular, we focus on techniques for measuring energy usage and on
hardware- and datacenter-oriented settings that can be tuned to reduce energy
consumption for training and inference with language models. We characterize
the impact of these settings on metrics such as computational performance and
energy consumption through experiments conducted on a high-performance
computing system as well as on popular cloud computing platforms. These
techniques can lead to a significant reduction in energy consumption when
training language models or using them for inference. For example,
power-capping, which limits the maximum power a GPU can consume, can enable a
15% decrease in energy usage with only a marginal increase in overall
computation time when training a transformer-based language model.
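As a concrete illustration of the two levers the abstract highlights, measuring energy usage and capping GPU power, the sketch below samples GPU power draw to approximate training energy and applies a power cap through NVML. This is a minimal sketch under assumed conditions, not code released with the paper: it assumes the nvidia-ml-py (pynvml) bindings, an NVIDIA GPU, and administrative rights to change the power limit; the 225 W cap, the GPU index, and the 0.5 s sampling interval are illustrative values.

```python
# Minimal sketch (not from the paper): cap GPU power via NVML and estimate
# energy for a workload by sampling power draw in a background thread.
# Assumes the nvidia-ml-py ("pynvml") bindings and an NVIDIA GPU; changing the
# power limit typically requires admin rights.
import threading
import time

import pynvml


def set_power_cap(gpu_index: int, watts: float) -> None:
    """Clamp the requested cap to the card's supported range and apply it."""
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    min_mw, max_mw = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(handle)
    cap_mw = int(min(max(watts * 1000, min_mw), max_mw))
    pynvml.nvmlDeviceSetPowerManagementLimit(handle, cap_mw)  # needs admin rights


class EnergyMeter:
    """Integrate sampled power draw over time to approximate energy in joules."""

    def __init__(self, gpu_index: int = 0, interval_s: float = 0.5):
        self.handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        self.interval_s = interval_s
        self.joules = 0.0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._sample, daemon=True)

    def _sample(self) -> None:
        while not self._stop.is_set():
            watts = pynvml.nvmlDeviceGetPowerUsage(self.handle) / 1000.0  # mW -> W
            self.joules += watts * self.interval_s  # rectangle-rule integration
            time.sleep(self.interval_s)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()


if __name__ == "__main__":
    pynvml.nvmlInit()
    set_power_cap(gpu_index=0, watts=225)   # illustrative: cap a 300 W card at 225 W
    with EnergyMeter(gpu_index=0) as meter:
        time.sleep(10)                       # placeholder for a training loop
    print(f"Approximate GPU energy: {meter.joules / 3600:.3f} Wh")
    pynvml.nvmlShutdown()
```

The same cap can also be set from the shell with `nvidia-smi -i 0 -pl 225`; the 15% energy saving quoted above comes from the paper's own transformer training experiments, not from this sketch.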
Related papers
- Towards Physical Plausibility in Neuroevolution Systems [0.276240219662896]
The increasing use of Artificial Intelligence (AI) models, especially Deep Neural Networks (DNNs), is driving up power consumption during training and inference.
This work addresses the growing energy consumption problem in Machine Learning (ML).
Even a slight reduction in power usage can lead to significant energy savings, benefiting users, companies, and the environment.
arXiv Detail & Related papers (2024-01-31T10:54:34Z) - Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z) - Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A
Multi-Agent Reinforcement Learning Approach [48.18355658448509]
Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption.
Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy.
We propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities.
arXiv Detail & Related papers (2023-04-17T02:12:30Z) - Energy Transformer [64.22957136952725]
Our work combines aspects of three promising paradigms in machine learning, namely the attention mechanism, energy-based models, and associative memory.
We propose a novel architecture, called the Energy Transformer (or ET for short), that uses a sequence of attention layers that are purposely designed to minimize a specifically engineered energy function.
arXiv Detail & Related papers (2023-02-14T18:51:22Z) - Precise Energy Consumption Measurements of Heterogeneous Artificial
Intelligence Workloads [0.534434568021034]
We present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes.
One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer.
arXiv Detail & Related papers (2022-12-03T21:40:55Z) - Compute and Energy Consumption Trends in Deep Learning Inference [67.32875669386488]
We study relevant models in the areas of computer vision and natural language processing.
For a sustained increase in performance, we see a much softer growth in energy consumption than previously anticipated.
arXiv Detail & Related papers (2021-09-12T09:40:18Z) - Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision [31.781943982148025]
We present the first large-scale energy consumption benchmark for efficient computer vision models.
A new metric is proposed to explicitly evaluate the full-cycle energy consumption under different model usage intensities.
arXiv Detail & Related papers (2021-08-30T18:22:36Z) - EnergyVis: Interactively Tracking and Exploring Energy Consumption for
ML Models [8.939420322774243]
EnergyVis is an interactive energy consumption tracker for machine learning (ML) models.
It enables researchers to interactively track, visualize, and compare model energy consumption across key energy and carbon footprint metrics.
EnergyVis aims to raise awareness concerning computational sustainability by interactively highlighting excessive energy usage during model training.
arXiv Detail & Related papers (2021-03-30T15:33:43Z) - Resource-Constrained On-Device Learning by Dynamic Averaging [7.720999661966942]
Communication between data-generating devices is partially responsible for a growing portion of the world's power consumption.
For machine learning, on-device learning avoids sending raw data, which can reduce communication substantially.
This paper investigates an approach to communication-efficient on-device learning of integer exponential families executed on low-power processors.
arXiv Detail & Related papers (2020-09-25T09:29:10Z) - HULK: An Energy Efficiency Benchmark Platform for Responsible Natural
Language Processing [76.38975568873765]
We introduce HULK, a multi-task energy efficiency benchmarking platform for responsible natural language processing.
We compare pretrained models' energy efficiency from the perspectives of time and cost.
arXiv Detail & Related papers (2020-02-14T01:04:19Z) - Towards the Systematic Reporting of the Energy and Carbon Footprints of
Machine Learning [68.37641996188133]
We introduce a framework for tracking realtime energy consumption and carbon emissions.
We create a leaderboard for energy efficient reinforcement learning algorithms.
We propose strategies for mitigation of carbon emissions and reduction of energy consumption.
arXiv Detail & Related papers (2020-01-31T05:12:59Z)