Power Hungry Processing: Watts Driving the Cost of AI Deployment?
- URL: http://arxiv.org/abs/2311.16863v3
- Date: Tue, 15 Oct 2024 20:54:08 GMT
- Title: Power Hungry Processing: Watts Driving the Cost of AI Deployment?
- Authors: Alexandra Sasha Luccioni, Yacine Jernite, Emma Strubell
- Abstract summary: Generative, multi-purpose AI systems promise a unified approach to building machine learning (ML) models into technology.
This ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon they emit.
We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models.
We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions.
- Abstract: Recent years have seen a surge in the popularity of commercial AI products based on generative, multi-purpose AI systems promising a unified approach to building machine learning (ML) models into technology. However, this ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon they emit. In this work, we propose the first systematic comparison of the ongoing inference cost of various categories of ML systems, covering both task-specific (i.e., finetuned models that carry out a single task) and "general-purpose" models (i.e., those trained for multiple tasks). We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models. We find that multi-purpose, generative architectures are orders of magnitude more expensive than task-specific systems for a variety of tasks, even when controlling for the number of model parameters. We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions. All the data from our study can be accessed via an interactive demo for further exploration and analysis.
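The paper's cost measure (energy and carbon per 1,000 inferences) can be sketched in a few lines. The per-inference energy values and carbon-intensity figure below are illustrative placeholders, not numbers from the study:

```python
# Illustrative sketch (not the paper's code): estimate the energy and
# carbon cost of 1,000 inferences from a per-inference energy measurement.
# All input numbers are hypothetical placeholders.

def deployment_cost(joules_per_inference: float,
                    carbon_intensity_g_per_kwh: float,
                    n_inferences: int = 1000):
    """Return (kWh, grams CO2eq) for n_inferences."""
    kwh = joules_per_inference * n_inferences / 3.6e6  # 1 kWh = 3.6e6 J
    grams_co2 = kwh * carbon_intensity_g_per_kwh
    return kwh, grams_co2

# Hypothetical measurements: a small task-specific model vs. a
# multi-purpose generative model, on a 430 gCO2eq/kWh grid.
kwh_small, co2_small = deployment_cost(2.0, 430)    # 2 J per inference
kwh_gen, co2_gen = deployment_cost(500.0, 430)      # 500 J per inference
print(f"task-specific: {kwh_small:.5f} kWh, {co2_small:.2f} g CO2eq")
print(f"generative:    {kwh_gen:.5f} kWh, {co2_gen:.2f} g CO2eq")
```

With these made-up inputs, the generative model is 250x more costly per 1,000 inferences, which is the kind of order-of-magnitude gap the abstract describes.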
Related papers
- Pricing and Competition for Generative AI
We explore the problem of how developers of new generative AI software can release and price their technology.
We first develop a comparison of two different models for a specific task with respect to user cost-effectiveness.
We then model the pricing problem of generative AI software as a game between two different companies.
arXiv Detail & Related papers (2024-11-04T22:52:45Z)
- The Energy Cost of Artificial Intelligence of Things Lifecycle
We propose eCAL, a new metric that captures the overall energy cost of inference over the lifecycle of an AIoT system.
With eCAL, we show that the better a model is and the more it is used, the more energy efficient each inference becomes.
We also evaluate the Carbon Footprint of the AIoT system by calculating the equivalent CO$_2$ emissions based on the energy consumption and the Carbon Intensity (CI) across different countries.
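A minimal sketch of this kind of lifecycle accounting, assuming a simple amortization of one-time (training/data) energy over inferences served; the carbon-intensity values are illustrative, not official grid figures:

```python
# Hypothetical sketch in the spirit of a lifecycle metric like eCAL:
# amortize one-time energy over the inferences served, then convert to
# CO2eq with a country-specific carbon intensity (CI).

def energy_per_inference(e_lifecycle_kwh: float,
                         e_inference_kwh: float,
                         n_inferences: int) -> float:
    """Total lifecycle energy divided across inferences served (kWh)."""
    return e_lifecycle_kwh / n_inferences + e_inference_kwh

# Illustrative grid carbon intensities in gCO2eq/kWh (placeholder values).
CI_G_PER_KWH = {"FR": 56, "US": 380, "PL": 660}

# 1,000 kWh of one-time cost amortized over one million inference calls.
e = energy_per_inference(1000.0, 0.001, 1_000_000)
for country, ci in CI_G_PER_KWH.items():
    print(f"{country}: {e * ci:.3f} g CO2eq per inference")
```

The amortized term shrinks as usage grows, which is why heavier use makes each inference cheaper in energy terms, while the same energy maps to very different emissions depending on the country's grid.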
arXiv Detail & Related papers (2024-08-01T13:23:15Z)
- Toward Cross-Layer Energy Optimizations in AI Systems
With the pervasive usage of artificial intelligence (AI) and machine learning (ML) tools and techniques, their energy efficiency is likely to become the gating factor toward adoption.
This is because generative AI (GenAI) models consume massive amounts of energy.
Inference consumes even more energy than training in aggregate, because a model trained once serves millions of requests.
arXiv Detail & Related papers (2024-04-10T01:35:17Z)
- From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference
Large language models (LLMs) have exploded in popularity due to their new generative capabilities that go far beyond prior state-of-the-art.
These models pose significant computational challenges, especially in the compute and energy costs required for inference.
arXiv Detail & Related papers (2023-10-04T17:41:59Z)
- Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
- eP-ALM: Efficient Perceptual Augmentation of Language Models
We propose directing effort toward efficient adaptation of existing models, augmenting Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
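That parameter budget can be checked with back-of-envelope arithmetic; the hidden sizes and backbone scale below are assumed for illustration, not eP-ALM's published dimensions:

```python
# Back-of-envelope check (hypothetical dimensions): training only a
# linear projection plus one prepended token leaves well under 1% of a
# multi-billion-parameter model trainable.

d_vision, d_lm = 1024, 4096           # assumed encoder / LM hidden sizes
total_params = 1_300_000_000          # assumed frozen backbone size

proj_params = d_vision * d_lm + d_lm  # projection weight matrix + bias
token_params = d_lm                   # one trainable prepended token
trainable = proj_params + token_params

fraction = trainable / total_params
print(f"trainable: {trainable:,} params ({fraction:.4%} of total)")
```

Even with a generous bias term, roughly four million trainable parameters against a 1.3B backbone is about 0.3%, consistent with the "more than 99% frozen" claim.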
arXiv Detail & Related papers (2023-03-20T19:20:34Z)
- Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning
Machine learning (ML) requires energy to carry out computations during model training.
Generating this energy comes with an environmental cost in terms of greenhouse gas emissions, depending on the quantity of energy used and the energy source.
We present a survey of the carbon emissions of 95 ML models across time and different tasks in natural language processing and computer vision.
arXiv Detail & Related papers (2023-02-16T18:35:00Z)
- A Generative Approach for Production-Aware Industrial Network Traffic Modeling
We investigate the network traffic data generated from a laser cutting machine deployed in a Trumpf factory in Germany.
We analyze the traffic statistics, capture the dependencies between the internal states of the machine, and model the network traffic as a production state dependent process.
We compare the performance of various generative models, including the variational autoencoder (VAE), conditional variational autoencoder (CVAE), and generative adversarial network (GAN).
arXiv Detail & Related papers (2022-11-11T09:46:58Z)
- EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System
Energy harvesting technology, which draws energy from the ambient environment, is a promising alternative to batteries for powering IoT devices.
This paper proposes EVE, an automated machine learning framework to search for desired multi-models with shared weights for energy harvesting IoT devices.
Experimental results show that the neural network models generated by EVE are on average 2.5x faster than baseline models without pruning and shared weights.
arXiv Detail & Related papers (2022-07-14T20:53:46Z)
- Machine learning applications for electricity market agent-based models: A systematic literature review
Agent-based simulations are used to better understand the dynamics of the electricity market.
Agent-based models provide the opportunity to integrate machine learning and artificial intelligence.
We review 55 papers published between 2016 and 2021 which focus on machine learning applied to agent-based electricity market models.
arXiv Detail & Related papers (2022-06-05T14:52:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.