The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
- URL: http://arxiv.org/abs/2505.06371v1
- Date: Fri, 09 May 2025 18:27:32 GMT
- Title: The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
- Authors: Jae-Won Chung, Jiachen Liu, Jeff J. Ma, Ruofan Wu, Oh Jun Kweon, Yuxuan Xia, Zhiyu Wu, Mosharaf Chowdhury,
- Abstract summary: Energy remains a metric that is often overlooked, under-explored, or poorly understood in the context of building ML systems. We present the ML.ENERGY Benchmark, a benchmark suite and tool for measuring inference energy consumption under realistic service environments.
- Score: 18.675499212393785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the adoption of Generative AI in real-world services grows explosively, energy has emerged as a critical bottleneck resource. However, energy remains a metric that is often overlooked, under-explored, or poorly understood in the context of building ML systems. We present the ML.ENERGY Benchmark, a benchmark suite and tool for measuring inference energy consumption under realistic service environments, and the corresponding ML.ENERGY Leaderboard, which have served as a valuable resource for those hoping to understand and optimize the energy consumption of their generative AI services. In this paper, we explain four key design principles for benchmarking ML energy we have acquired over time, and then describe how they are implemented in the ML.ENERGY Benchmark. We then highlight results from the latest iteration of the benchmark, including energy measurements of 40 widely used model architectures across 6 different tasks, case studies of how ML design choices impact energy consumption, and how automated optimization recommendations can lead to significant (sometimes more than 40%) energy savings without changing what is being computed by the model. The ML.ENERGY Benchmark is open-source and can be easily extended to various customized models and application scenarios.
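The kind of measurement the benchmark automates can be illustrated with a minimal, hypothetical sketch: sample device power at a fixed interval during an inference window, then integrate the samples with the trapezoidal rule to obtain energy in joules. The `trapezoid_energy` helper and the power trace below are illustrative assumptions, not part of the ML.ENERGY Benchmark API.

```python
def trapezoid_energy(power_samples, interval_s):
    """Integrate power samples (watts) taken every `interval_s` seconds
    using the trapezoidal rule; returns energy in joules."""
    if len(power_samples) < 2:
        return 0.0
    energy = 0.0
    for p0, p1 in zip(power_samples, power_samples[1:]):
        energy += (p0 + p1) / 2.0 * interval_s
    return energy

# Hypothetical GPU power trace around one inference request:
# idle, ramp-up, peak, ramp-down, idle (sampled every 0.1 s).
samples_w = [60.0, 180.0, 250.0, 240.0, 90.0]
energy_j = trapezoid_energy(samples_w, interval_s=0.1)  # ≈ 74.5 J
```

In practice the power samples would come from a hardware counter (e.g., NVML on NVIDIA GPUs); the integration step is the same regardless of the source.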
Related papers
- EfficientLLM: Efficiency in Large Language Models [64.3537131208038]
Large Language Models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs. We introduce EfficientLLM, a novel benchmark and the first comprehensive empirical study evaluating efficiency techniques for LLMs at scale.
arXiv Detail & Related papers (2025-05-20T02:27:08Z) - Sustainable LLM Inference for Edge AI: Evaluating Quantized LLMs for Energy Efficiency, Output Accuracy, and Inference Latency [6.306413686006502]
We conduct a comprehensive analysis of 28 quantized Large Language Models (LLMs) from the Ollama library. We evaluate energy efficiency, inference performance, and output accuracy across multiple quantization levels and task types. Our findings reveal the trade-offs between energy efficiency, inference speed, and accuracy in different quantization settings.
arXiv Detail & Related papers (2025-04-04T11:29:30Z) - Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations [2.2765705959685234]
This study investigates the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. We employ software-based power measurements to ensure ease of replication across diverse configurations, models, and datasets.
arXiv Detail & Related papers (2025-03-31T10:28:04Z) - MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI [5.50579824344998]
Machine learning (ML) technologies have led to a surge in power consumption across diverse systems. This paper introduces MLPerf Power, a comprehensive benchmarking methodology to evaluate the energy efficiency of ML systems at power levels ranging from microwatts to megawatts.
arXiv Detail & Related papers (2024-10-15T20:06:33Z) - Impact of ML Optimization Tactics on Greener Pre-Trained ML Models [46.78148962732881]
This study aims to (i) analyze image classification datasets and pre-trained models, (ii) improve inference efficiency by comparing optimized and non-optimized models, and (iii) assess the economic impact of the optimizations.
We conduct a controlled experiment to evaluate the impact of various PyTorch optimization techniques (dynamic quantization, torch.compile, local pruning, and global pruning) applied to 42 Hugging Face models for image classification.
Dynamic quantization demonstrates significant reductions in inference time and energy consumption, making it highly suitable for large-scale systems.
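The core idea behind dynamic quantization can be sketched without any framework: store weights as int8 values plus a per-tensor scale, and dequantize at compute time. This toy symmetric-quantization example is an assumed illustration, not PyTorch's actual `quantize_dynamic` implementation.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-127, 127] with one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.03, 1.0]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Each weight now fits in one byte instead of four (float32), at the cost
# of a reconstruction error bounded by half the quantization step.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

The 4x memory reduction (and cheaper int8 arithmetic) is what drives the inference-time and energy savings the study reports.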
arXiv Detail & Related papers (2024-09-19T16:23:03Z) - Normalizing Energy Consumption for Hardware-Independent Evaluation [9.658615045493734]
We present a novel methodology for normalizing energy consumption across different hardware platforms.
Our approach shows that the number of reference points, the type of regression and the inclusion of computational metrics significantly influences the normalization process.
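Since the paper's exact methodology is not given here, the following is only an assumed illustration of regression-based normalization: run shared reference workloads on both a target device and a reference device, fit a linear map between their measured energies, and use it to express new measurements in reference-device terms. The device names and energy values are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y = a*x + b for paired measurements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b

# Energies (J) of three shared reference workloads on device A and reference R.
device_a = [10.0, 20.0, 40.0]
device_r = [25.0, 45.0, 85.0]
a, b = fit_linear(device_a, device_r)

# Express a new device-A measurement (30 J) in reference-device terms.
normalized = a * 30.0 + b
```

As the summary notes, the number of reference points and the choice of regression model both affect how well such a mapping generalizes.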
arXiv Detail & Related papers (2024-09-09T13:38:00Z) - Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference [2.553456266022126]
Machine learning (ML) has seen tremendous advancements, but its environmental footprint remains a concern.
Acknowledging the growing environmental impact of ML, this paper investigates Green ML.
arXiv Detail & Related papers (2024-06-20T13:59:34Z) - Power Hungry Processing: Watts Driving the Cost of AI Deployment? [74.19749699665216]
Generative, multi-purpose AI systems promise a unified approach to building machine learning (ML) models into technology.
This ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit.
We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models.
We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions.
arXiv Detail & Related papers (2023-11-28T15:09:36Z) - LEAF + AIO: Edge-Assisted Energy-Aware Object Detection for Mobile Augmented Reality [77.00418462388525]
Mobile augmented reality (MAR) applications are significantly energy-guzzling.
We design an edge-based energy-aware MAR system that enables MAR devices to dynamically change their configurations.
Our proposed dynamic MAR configuration adaptations can minimize the per frame energy consumption of multiple MAR clients.
arXiv Detail & Related papers (2022-05-27T06:11:50Z) - Automated Machine Learning: A Case Study on Non-Intrusive Appliance Load Monitoring [81.06807079998117]
We propose a novel approach to enable Automated Machine Learning (AutoML) for Non-Intrusive Appliance Load Monitoring (NIALM). NIALM offers a cost-effective alternative to smart meters for measuring the energy consumption of electric devices and appliances.
arXiv Detail & Related papers (2022-03-06T10:12:56Z) - Multi-Agent Meta-Reinforcement Learning for Self-Powered and Sustainable Edge Computing Systems [87.4519172058185]
An effective energy dispatch mechanism for self-powered wireless networks with edge computing capabilities is studied.
A novel multi-agent meta-reinforcement learning (MAMRL) framework is proposed to solve the formulated problem.
Experimental results show that the proposed MAMRL model can reduce non-renewable energy usage by up to 11% and energy cost by 22.4%.
arXiv Detail & Related papers (2020-02-20T04:58:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.