Related papers: AIMeter: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads

URL: http://arxiv.org/abs/2506.20535v2
Date: Thu, 30 Oct 2025 10:14:59 GMT
Title: AIMeter: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads
Authors: Hongzhen Huang, Kunming Zhang, Hanlong Liao, Kui Wu, Guoming Tang
Abstract summary: AIMeter is a comprehensive software toolkit for the measurement, analysis, and visualization of energy use, power draw, hardware performance, and carbon emissions across AI workloads. By seamlessly integrating with existing AI frameworks, AIMeter offers standardized reports and exports fine-grained time-series data. It further enables in-depth correlation analysis between hardware metrics and model performance and thus facilitates bottleneck identification and performance enhancement.
Score: 7.7878942091873755
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid advancement of AI, particularly large language models (LLMs), has raised significant concerns about the energy use and carbon emissions associated with model training and inference. However, existing tools for measuring and reporting such impacts are often fragmented, lacking systematic metric integration and offering limited support for correlation analysis among them. This paper presents AIMeter, a comprehensive software toolkit for the measurement, analysis, and visualization of energy use, power draw, hardware performance, and carbon emissions across AI workloads. By seamlessly integrating with existing AI frameworks, AIMeter offers standardized reports and exports fine-grained time-series data to support benchmarking and reproducibility in a lightweight manner. It further enables in-depth correlation analysis between hardware metrics and model performance and thus facilitates bottleneck identification and performance enhancement. By addressing critical limitations in existing tools, AIMeter encourages the research community to weigh environmental impact alongside raw performance of AI workloads and advances the shift toward more sustainable "Green AI" practices. The code is available at https://github.com/SusCom-Lab/AIMeter.

Related papers

AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models [2.7946918847372277]
We propose AI-CARE, an evaluation tool for reporting the energy consumption and carbon emissions of machine learning models. We demonstrate, through theoretical analysis and empirical validation, that carbon-aware benchmarking changes the relative ranking of models. Our proposal aims to shift the research community toward transparent, multi-objective evaluation and align ML progress with global sustainability goals.
arXiv Detail & Related papers (2026-02-17T21:52:48Z)
TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models [56.94569090844015]
TokaMark is a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST). TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy.
arXiv Detail & Related papers (2026-02-05T16:49:44Z)
ML-EcoLyzer: Quantifying the Environmental Cost of Machine Learning Inference Across Frameworks and Hardware [0.0]
We present ML-EcoLyzer, a tool for measuring the carbon, energy, thermal, and water costs of machine learning inference. The tool supports both classical and modern models, applying adaptive monitoring and hardware-aware evaluation.
arXiv Detail & Related papers (2025-11-10T04:30:29Z)
Metrics and evaluations for computational and sustainable AI efficiency [26.52588349722099]
Current approaches fail to provide a holistic view, making it difficult to compare and optimise systems. We propose a unified and reproducible methodology for AI model inference that integrates computational and environmental metrics. Our framework provides pragmatic, carbon-aware evaluation by systematically measuring latency and throughput distributions, energy consumption, and location-adjusted carbon emissions.
arXiv Detail & Related papers (2025-10-18T03:30:15Z)
Ground-Truthing AI Energy Consumption: Validating CodeCarbon Against External Measurements [2.538209532048867]
This study systematically evaluates the reliability of static and dynamic energy estimation approaches. The established estimation approaches are shown to consistently make errors of up to 40%. This study establishes transparency and validates widely used tools for sustainable AI development.
arXiv Detail & Related papers (2025-09-26T09:12:21Z)
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments [70.42705564227548]
We propose an automated environment construction pipeline for large language models (LLMs). This enables the creation of high-quality training environments that provide detailed and measurable feedback without relying on external tools. We also introduce a verifiable reward mechanism that evaluates both the precision of tool use and the completeness of task execution.
arXiv Detail & Related papers (2025-08-12T09:45:19Z)
Calculating Software's Energy Use and Carbon Emissions: A Survey of the State of Art, Challenges, and the Way Ahead [8.377809633825196]
The proliferation of software and AI comes with a hidden risk: its growing energy and carbon footprint. We present a state-of-the-art review of methods and tools that enable the measurement of software and AI-related energy and/or carbon emissions.
arXiv Detail & Related papers (2025-06-11T13:02:00Z)
Breaking the ICE: Exploring promises and challenges of benchmarks for Inference Carbon & Energy estimation for LLMs [8.377809633825196]
We discuss the challenges of current approaches and present our evolving framework, R-ICE, which estimates prompt-level inference carbon emissions. Our promising validation results suggest that benchmark-based modelling holds great potential for inference emission estimation.
arXiv Detail & Related papers (2025-06-10T12:23:02Z)
Green MLOps to Green GenOps: An Empirical Study of Energy Consumption in Discriminative and Generative AI Operations [2.2765705959685234]
This study investigates the energy consumption of Discriminative and Generative AI models within real-world MLOps pipelines. We employ software-based power measurements to ensure ease of replication across diverse configurations, models, and datasets.
arXiv Detail & Related papers (2025-03-31T10:28:04Z)
General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems. We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure. Our fully-automated methodology builds on 18 newly-crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z)
A Beginner's Guide to Power and Energy Measurement and Estimation for Computing and Machine Learning [0.5224038339798622]
This paper introduces the main considerations necessary for making sound use of energy measurement tools. It includes the use of at-the-wall versus on-device measurements, sampling strategies and best practices, common sources of error, and proxy measures. It concludes with a call to action for improving the state of the art of measurement methods.
arXiv Detail & Related papers (2024-12-11T19:00:00Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Computing Within Limits: An Empirical Study of Energy Consumption in ML Training and Inference [2.553456266022126]
Machine learning (ML) has seen tremendous advancements, but its environmental footprint remains a concern. Acknowledging the growing environmental impact of ML, this paper investigates Green ML.
arXiv Detail & Related papers (2024-06-20T13:59:34Z)
QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement. QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights. We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
On the Opportunities of Green Computing: A Survey [80.21955522431168]
Artificial Intelligence (AI) has achieved significant advancements in technology and research over several decades of development. The need for high computing power brings higher carbon emissions and undermines research fairness. To tackle the challenges of computing resources and the environmental impact of AI, Green Computing has become a hot research topic.
arXiv Detail & Related papers (2023-11-01T11:16:41Z)
Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation [82.85015548989223]
Pentathlon is a benchmark for holistic and realistic evaluation of model efficiency. Pentathlon focuses on inference, which accounts for a majority of the compute in a model's lifecycle. It incorporates a suite of metrics that target different aspects of efficiency, including latency, throughput, memory overhead, and energy consumption.
arXiv Detail & Related papers (2023-07-19T01:05:33Z)
A Comparative Study of Machine Learning Algorithms for Anomaly Detection in Industrial Environments: Performance and Environmental Impact [62.997667081978825]
This study seeks to reconcile the demands of high-performance machine learning models with environmental sustainability. Traditional machine learning algorithms, such as Decision Trees and Random Forests, demonstrate robust efficiency and performance. However, superior outcomes were obtained with optimised configurations, albeit with a commensurate increase in resource consumption.
arXiv Detail & Related papers (2023-07-01T15:18:00Z)
Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks [2.0315147707806283]
Mystique is an accurate and scalable framework for production AI benchmark generation. Mystique is scalable, due to its lightweight data collection, in terms of runtime overhead and instrumentation effort. We evaluate our methodology on several production AI models, and show that benchmarks generated with Mystique closely resemble original AI models.
arXiv Detail & Related papers (2022-12-16T18:46:37Z)
Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today. The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.