Criteria for Credible AI-assisted Carbon Footprinting Systems: The Cases of Mapping and Lifecycle Modeling
- URL: http://arxiv.org/abs/2509.00240v1
- Date: Fri, 29 Aug 2025 21:05:19 GMT
- Title: Criteria for Credible AI-assisted Carbon Footprinting Systems: The Cases of Mapping and Lifecycle Modeling
- Authors: Shaena Ulissi, Andrew Dumit, P. James Joyce, Krishna Rao, Steven Watson, Sangwon Suh
- Abstract summary: We present a set of criteria to validate AI-assisted systems that calculate greenhouse gas (GHG) emissions for products and materials. This approach may be used as a foundation for practitioners, auditors, and standards bodies to evaluate AI-assisted environmental assessment tools.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: As organizations face increasing pressure to understand their corporate and products' carbon footprints, artificial intelligence (AI)-assisted calculation systems for footprinting are proliferating, but with widely varying levels of rigor and transparency. Standards and guidance have not kept pace with the technology; evaluation datasets are nascent; and statistical approaches to uncertainty analysis are not yet practical to apply to scaled systems. We present a set of criteria to validate AI-assisted systems that calculate greenhouse gas (GHG) emissions for products and materials. We implement a three-step approach: (1) Identification of needs and constraints, (2) Draft criteria development and (3) Refinements through pilots. The process identifies three use cases of AI applications: Case 1 focuses on AI-assisted mapping to existing datasets for corporate GHG accounting and product hotspotting, automating repetitive manual tasks while maintaining mapping quality. Case 2 addresses AI systems that generate complete product models for corporate decision-making, which require comprehensive validation of both component tasks and end-to-end performance. We discuss the outlook for Case 3 applications, systems that generate standards-compliant models. We find that credible AI systems can be built and that they should be validated using system-level evaluations rather than line-item review, with metrics such as benchmark performance, indications of data quality and uncertainty, and transparent documentation. This approach may be used as a foundation for practitioners, auditors, and standards bodies to evaluate AI-assisted environmental assessment tools. By establishing evaluation criteria that balance scalability with credibility requirements, our approach contributes to the field's efforts to develop appropriate standards for AI-assisted carbon footprinting systems.
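The abstract argues that credible AI-assisted footprinting systems should be validated with system-level evaluations against benchmarks rather than line-item review. As a purely illustrative sketch of what such a benchmark evaluation could look like for the Case 1 mapping task, the following computes accuracy and coverage of predicted emission-factor mappings against an expert-labeled gold set. All function names, identifiers, and data here are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of a system-level benchmark evaluation for
# AI-assisted emission-factor mapping (Case 1). All names and data
# are illustrative assumptions, not from the paper.

def evaluate_mapping_system(predictions, gold_labels):
    """Score predicted dataset mappings against an expert-labeled benchmark.

    predictions: dict of item_id -> predicted emission-factor ID (or None
                 if the system abstained from mapping the item)
    gold_labels: dict of item_id -> expert-assigned emission-factor ID
    Returns benchmark accuracy (over mapped items) and coverage
    (fraction of benchmark items the system attempted to map).
    """
    # Keep only the items the system actually mapped.
    mapped = {k: v for k, v in predictions.items() if v is not None}
    correct = sum(1 for k, v in mapped.items() if gold_labels.get(k) == v)
    coverage = len(mapped) / len(gold_labels)
    accuracy = correct / len(mapped) if mapped else 0.0
    return {"accuracy": accuracy, "coverage": coverage}

# Toy benchmark: three purchased items mapped to emission-factor IDs.
gold = {"steel_sheet": "EF-STEEL-01",
        "pla_resin": "EF-PLA-02",
        "transport": "EF-TRUCK-03"}
pred = {"steel_sheet": "EF-STEEL-01",  # correct mapping
        "pla_resin": "EF-PLA-07",      # incorrect mapping
        "transport": None}             # system abstained

result = evaluate_mapping_system(pred, gold)
print(result)
```

Reporting both accuracy and coverage separately matters here: a system can inflate apparent accuracy by abstaining on hard items, so the two metrics together give a more honest system-level picture than either alone.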
Related papers
- AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems [52.65695508605237]
We introduce AI-NativeBench, the first application-centric and white-box AI-Native benchmark suite grounded in Model Context Protocol (MCP) and Agent-to-Agent (A2A) standards. By treating agentic spans as first-class citizens within distributed traces, our methodology enables granular analysis of engineering characteristics beyond simple capabilities. This work provides the first systematic evidence to guide the transition from measuring model capability to engineering reliable AI-Native systems.
arXiv Detail & Related papers (2026-01-14T11:32:07Z) - Self-Certification of High-Risk AI Systems: The Example of AI-based Facial Emotion Recognition [0.0]
The European Union's Artificial Intelligence Act establishes comprehensive requirements for high-risk AI systems. We investigate the practical application of the Fraunhofer AI assessment catalogue as a certification framework.
arXiv Detail & Related papers (2026-01-13T07:34:28Z) - Empowering Real-World: A Survey on the Technology, Practice, and Evaluation of LLM-driven Industry Agents [63.03252293761656]
This paper systematically reviews the technologies, applications, and evaluation methods of industry agents based on large language models (LLMs). We examine the three key technological pillars that support the advancement of agent capabilities: Memory, Planning, and Tool Use. We provide an overview of the application of industry agents in real-world domains such as digital engineering, scientific discovery, embodied intelligence, collaborative business execution, and complex system simulation.
arXiv Detail & Related papers (2025-10-20T12:46:55Z) - Safe and Certifiable AI Systems: Concepts, Challenges, and Lessons Learned [45.44933002008943]
This white paper presents the TÜV AUSTRIA Trusted AI framework. It is an end-to-end audit catalog and methodology for assessing and certifying machine learning systems. Building on three pillars - Secure Software Development, Functional Requirements, and Ethics & Data Privacy - it translates the high-level obligations of the EU AI Act into specific, testable criteria.
arXiv Detail & Related papers (2025-09-08T17:52:08Z) - SpecEval: Evaluating Model Adherence to Behavior Specifications [63.13000010340958]
We introduce an automated framework that audits models against their providers' specifications. Our central focus is on three-way consistency between a provider's specification, its models' outputs, and its own models acting as judges. We apply our framework to 16 models from six developers across more than 100 behavioral statements, finding systematic inconsistencies including compliance gaps of up to 20 percent across providers.
arXiv Detail & Related papers (2025-09-02T16:18:40Z) - Insights Informed Generative AI for Design: Incorporating Real-world Data for Text-to-Image Output [51.88841610098437]
We propose a novel pipeline that integrates DALL-E 3 with a materials dataset to enrich AI-generated designs with sustainability metrics and material usage insights. We evaluate the system through three user tests: (1) no mention of sustainability to the user prior to the prompting process with generative AI, (2) sustainability goals communicated to the user before prompting, and (3) sustainability goals communicated along with quantitative CO2e data included in the generative AI outputs.
arXiv Detail & Related papers (2025-06-17T22:33:11Z) - AutoMind: Adaptive Knowledgeable Agent for Automated Data Science [70.33796196103499]
Large Language Model (LLM) agents have shown great potential in addressing real-world data science problems. Existing frameworks depend on rigid, pre-defined and inflexible coding strategies. We introduce AutoMind, an adaptive, knowledgeable LLM-agent framework.
arXiv Detail & Related papers (2025-06-12T17:59:32Z) - General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems. We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure. Our fully-automated methodology builds on 18 newly-crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z) - Toward an Evaluation Science for Generative AI Systems [22.733049816407114]
We advocate for maturing an evaluation science for generative AI systems. In particular, we present three key lessons: Evaluation metrics must be applicable to real-world performance, metrics must be iteratively refined, and evaluation institutions and norms must be established.
arXiv Detail & Related papers (2025-03-07T11:23:48Z) - Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment.
To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z) - The Foundations of Computational Management: A Systematic Approach to Task Automation for the Integration of Artificial Intelligence into Existing Workflows [55.2480439325792]
This article introduces Computational Management, a systematic approach to task automation.
The article offers three easy step-by-step procedures to begin the process of implementing AI within a workflow.
arXiv Detail & Related papers (2024-02-07T01:45:14Z) - Data-Centric Governance [6.85316573653194]
Current AI governance approaches consist mainly of manual review and documentation processes.
Modern AI systems are data-centric: they act on data, produce data, and are built through data engineering.
This work explores the systematization of governance requirements via datasets and algorithmic evaluations.
arXiv Detail & Related papers (2023-02-14T07:22:32Z) - Multisource AI Scorecard Table for System Evaluation [3.74397577716445]
The paper describes a Multisource AI Scorecard Table (MAST) that provides the developer and user of an artificial intelligence (AI)/machine learning (ML) system with a standard checklist.
The paper explores how the analytic tradecraft standards outlined in Intelligence Community Directive (ICD) 203 can provide a framework for assessing the performance of an AI system.
arXiv Detail & Related papers (2021-02-08T03:37:40Z)