Related papers: Towards Sustainability Model Cards

Towards Sustainability Model Cards

URL: http://arxiv.org/abs/2507.19559v1
Date: Fri, 25 Jul 2025 08:26:53 GMT
Title: Towards Sustainability Model Cards
Authors: Gwendal Jouneaux, Jordi Cabot,
Abstract summary: We propose a new Domain-Specific Language to define the sustainability aspects of an ML model.<n>This information can then be exported as an extended version of the well-known Model Cards initiative.
Score: 1.961305559606562
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The growth of machine learning (ML) models and associated datasets triggers a consequent dramatic increase in energy costs for the use and training of these models. In the current context of environmental awareness and global sustainability concerns involving ICT, Green AI is becoming an important research topic. Initiatives like the AI Energy Score Ratings are a good example. Nevertheless, these benchmarking attempts are still to be integrated with existing work on Quality Models and Service-Level Agreements common in other, more mature, ICT subfields. This limits the (automatic) analysis of this model energy descriptions and their use in (semi)automatic model comparison, selection, and certification processes. We aim to leverage the concept of quality models and merge it with existing ML model reporting initiatives and Green/Frugal AI proposals to formalize a Sustainable Quality Model for AI/ML models. As a first step, we propose a new Domain-Specific Language to precisely define the sustainability aspects of an ML model (including the energy costs for its different tasks). This information can then be exported as an extended version of the well-known Model Cards initiative while, at the same time, being formal enough to be input of any other model description automatic process.

Related papers

Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment. To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z)
Science based AI model certification for new operational environments with application in traffic state estimation [1.2186759689780324]
The expanding role of Artificial Intelligence (AI) in diverse engineering domains highlights the challenges associated with deploying AI models in new operational environments. This paper proposes a science-based certification methodology to assess the viability of employing pre-trained data-driven models in new operational environments.
arXiv Detail & Related papers (2024-05-13T16:28:00Z)
AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios. We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
Energy-based Automated Model Evaluation [19.90797626200033]
We propose a novel measure -- Meta-Distribution Energy (MDE) -- that allows the AutoEval framework to be both more efficient and effective. MDE establishes a meta-distribution statistic, on the information (energy) associated with individual samples, then offer a smoother representation enabled by energy-based learning. We provide extensive experiments across modalities, datasets and different architectural backbones to validate MDE's validity.
arXiv Detail & Related papers (2024-01-23T11:54:09Z)
Power Hungry Processing: Watts Driving the Cost of AI Deployment? [74.19749699665216]
generative, multi-purpose AI systems promise a unified approach to building machine learning (ML) models into technology. This ambition of generality'' comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit. We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on representative benchmark dataset using these models. We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions
arXiv Detail & Related papers (2023-11-28T15:09:36Z)
QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement. QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights. We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
Stable and Interpretable Deep Learning for Tabular Data: Introducing InterpreTabNet with the Novel InterpreStability Metric [4.362293468843233]
We introduce InterpreTabNet, a model designed to enhance both classification accuracy and interpretability. We also present a novel evaluation metric, InterpreStability, which quantifies the stability of a model's interpretability.
arXiv Detail & Related papers (2023-10-04T15:04:13Z)
Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost. Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy. We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.