Magneton: Optimizing Energy Efficiency of ML Systems via Differential Energy Debugging
- URL: http://arxiv.org/abs/2512.08365v1
- Date: Tue, 09 Dec 2025 08:41:16 GMT
- Title: Magneton: Optimizing Energy Efficiency of ML Systems via Differential Energy Debugging
- Authors: Yi Pan, Wenbo Qian, Dedong Xie, Ruiyan Hu, Yigong Hu, Baris Kasikci
- Abstract summary: A significant but overlooked source of inefficiency is software energy waste caused by poor software design. These inefficiencies arise in widely used ML frameworks and applications, yet developers often lack the visibility and tools to detect and diagnose them. We propose differential energy debugging, a novel approach that leverages the observation that competing ML systems often implement similar functionality with vastly different energy consumption. Building on this insight, we design and implement Magneton, an energy profiler that compares energy consumption between similar ML systems at the operator level and automatically pinpoints code regions and configuration choices responsible for excessive energy use.
- Score: 8.58416976020519
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The training and deployment of machine learning (ML) models have become extremely energy-intensive. While existing optimization efforts focus primarily on hardware energy efficiency, a significant but overlooked source of inefficiency is software energy waste caused by poor software design. This often includes redundant or poorly designed operations that consume more energy without improving performance. These inefficiencies arise in widely used ML frameworks and applications, yet developers often lack the visibility and tools to detect and diagnose them. We propose differential energy debugging, a novel approach that leverages the observation that competing ML systems often implement similar functionality with vastly different energy consumption. Building on this insight, we design and implement Magneton, an energy profiler that compares energy consumption between similar ML systems at the operator level and automatically pinpoints code regions and configuration choices responsible for excessive energy use. Applied to 9 popular ML systems spanning LLM inference, general ML frameworks, and image generation, Magneton detects and diagnoses 16 known cases of software energy inefficiency and further discovers 8 previously unknown cases, 7 of which have been confirmed by developers.
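The core idea of differential energy debugging — comparing per-operator energy between two systems that implement similar functionality and flagging outliers — can be sketched in a few lines. This is a minimal illustration of the concept, not Magneton's actual interface; the operator names, energy figures, and the 2x ratio threshold are illustrative assumptions.

```python
# Hypothetical sketch of differential energy debugging: align per-operator
# energy measurements from two comparable ML systems, then flag operators
# whose energy consumption differs sharply between them.

def diff_energy_debug(profile_a, profile_b, ratio_threshold=2.0):
    """Return (operator, ratio) pairs where system A uses at least
    ratio_threshold times the energy of system B (energies in joules)."""
    suspects = []
    for op in sorted(profile_a.keys() & profile_b.keys()):
        e_a, e_b = profile_a[op], profile_b[op]
        if e_b > 0 and e_a / e_b >= ratio_threshold:
            suspects.append((op, round(e_a / e_b, 2)))
    return suspects

# Toy per-operator energy profiles (joules) for two hypothetical
# LLM inference engines implementing the same functionality.
engine_a = {"attention": 12.0, "matmul": 30.0, "sampling": 9.0}
engine_b = {"attention": 11.5, "matmul": 29.0, "sampling": 3.0}

print(diff_energy_debug(engine_a, engine_b))  # [('sampling', 3.0)]
```

In this toy example, `attention` and `matmul` consume comparable energy in both engines, so only `sampling` is reported — pointing the developer at the code region most likely responsible for the excess energy use.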
Related papers
- Towards Green AI: Decoding the Energy of LLM Inference in Software Development [46.879983975894135]
AI-assisted tools are increasingly integrated into software development, but their reliance on large language models (LLMs) introduces substantial computational and energy costs. We conduct a phase-level analysis of LLM inference energy consumption, distinguishing between (1) the prefill phase, where the model processes the input and builds internal representations, and (2) the decoding phase, where output tokens are generated using the stored state.
arXiv Detail & Related papers (2026-02-05T14:38:19Z) - Leveraging LLMs to Automate Energy-Aware Refactoring of Parallel Scientific Codes [1.2178992475191555]
Large language models (LLMs) are increasingly used for generating parallel scientific codes. We propose LASSI-EE, an automated screening framework that generates energy-efficient parallel codes. We introduce energy-reduction@k, a novel metric that quantifies expected energy reduction when generating k code candidates.
arXiv Detail & Related papers (2025-05-04T17:05:34Z) - Can We Make Code Green? Understanding Trade-Offs in LLMs vs. Human Code Optimizations [45.243401722182554]
Large language models (LLMs) claim to assist developers in optimizing code for performance and energy efficiency. This work focuses on software written in Matlab, which is widely used in both academia and industry for scientific and engineering applications. We analyze energy-focused optimization on 400 scripts across 100 top GitHub repositories.
arXiv Detail & Related papers (2025-03-26T00:27:29Z) - AI-Powered, But Power-Hungry? Energy Efficiency of LLM-Generated Code [45.77395425799378]
This paper presents the first study analyzing the energy efficiency and performance of LLM-generated code for three programming languages: Python, Java, and C++. Our results show that the models are much more successful in generating Python and Java code than C++ code.
arXiv Detail & Related papers (2025-02-04T15:32:34Z) - Unveiling the Energy Vampires: A Methodology for Debugging Software Energy Consumption [5.602876058122268]
This paper presents an energy debugging methodology for identifying and isolating energy consumption hotspots in software systems. Our analysis reveals significant energy consumption differences between Alpine and Ubuntu distributions. By isolating and benchmarking memcpy, we confirm it as the primary cause of the energy discrepancy.
arXiv Detail & Related papers (2024-12-13T11:49:19Z) - Can Large-Language Models Help us Better Understand and Teach the Development of Energy-Efficient Software? [2.8812501020074968]
Energy-efficient software engineering techniques are often absent from undergraduate curricula.
We propose to develop a learning module for energy-efficient software, suitable for incorporation into an undergraduate software engineering class.
arXiv Detail & Related papers (2024-10-30T01:09:32Z) - Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models.
Our approach employs activation sparsity to extract experts.
Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z) - MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI [5.50579824344998]
Machine learning (ML) technologies have led to a surge in power consumption across diverse systems. This paper introduces MLPerf Power, a comprehensive benchmarking methodology to evaluate the energy efficiency of ML systems at power levels ranging from microwatts to megawatts.
arXiv Detail & Related papers (2024-10-15T20:06:33Z) - Impact of ML Optimization Tactics on Greener Pre-Trained ML Models [46.78148962732881]
This study aims to (i) analyze image classification datasets and pre-trained models, (ii) improve inference efficiency by comparing optimized and non-optimized models, and (iii) assess the economic impact of the optimizations.
We conduct a controlled experiment to evaluate the impact of various PyTorch optimization techniques (dynamic quantization, torch.compile, local pruning, and global pruning) on 42 Hugging Face models for image classification.
Dynamic quantization demonstrates significant reductions in inference time and energy consumption, making it highly suitable for large-scale systems.
arXiv Detail & Related papers (2024-09-19T16:23:03Z) - FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, retaining up to 85% of model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Toward Cross-Layer Energy Optimizations in AI Systems [4.871463967255196]
With the pervasive usage of artificial intelligence (AI) and machine learning (ML) tools and techniques, their energy efficiency is likely to become the gating factor toward adoption. This is because generative AI (GenAI) models are massive energy hogs. Inference consumes even more energy, because a model trained once serves millions.
arXiv Detail & Related papers (2024-04-10T01:35:17Z) - Multiagent Reinforcement Learning with an Attention Mechanism for Improving Energy Efficiency in LoRa Networks [52.96907334080273]
As the network scale increases, the energy efficiency of LoRa networks decreases sharply due to severe packet collisions.
We propose a transmission parameter allocation algorithm based on multiagent reinforcement learning (MALoRa).
Simulation results demonstrate that MALoRa significantly improves the system EE compared with baseline algorithms.
arXiv Detail & Related papers (2023-09-16T11:37:23Z) - FELARE: Fair Scheduling of Machine Learning Applications on Heterogeneous Edge Systems [5.165692107696155]
Edge computing enables smart IoT-based systems via concurrent and continuous execution of latency-sensitive machine learning (ML) applications.
We study and analyze resource allocation solutions that can increase the on-time task completion rate while considering the energy constraint.
We observed an 8.9% improvement in the on-time task completion rate and a 12.6% improvement in energy saving, without imposing any significant overhead on the edge system.
arXiv Detail & Related papers (2022-05-31T19:19:40Z) - Automated Machine Learning: A Case Study on Non-Intrusive Appliance Load Monitoring [81.06807079998117]
We propose a novel approach to enable Automated Machine Learning (AutoML) for Non-Intrusive Appliance Load Monitoring (NIALM). NIALM offers a cost-effective alternative to smart meters for measuring the energy consumption of electric devices and appliances.
arXiv Detail & Related papers (2022-03-06T10:12:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.