Towards Measurement Theory for Artificial Intelligence
- URL: http://arxiv.org/abs/2507.05587v1
- Date: Tue, 08 Jul 2025 01:52:37 GMT
- Title: Towards Measurement Theory for Artificial Intelligence
- Authors: Elija Perrier
- Abstract summary: We argue that formalising measurement for AI will allow researchers, practitioners, and regulators to: (i) make comparisons between systems and the evaluation methods applied to them; (ii) connect frontier AI evaluations with established quantitative risk analysis techniques drawn from engineering and safety science; and (iii) foreground how what counts as AI capability is contingent upon the measurement operations and scales we elect to use. We sketch a layered measurement stack, distinguish direct from indirect observables, and signpost how these ingredients provide a pathway toward a unified, calibratable taxonomy of AI phenomena.
- Score: 0.6526824510982799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We motivate and outline a programme for a formal theory of measurement of artificial intelligence. We argue that formalising measurement for AI will allow researchers, practitioners, and regulators to: (i) make comparisons between systems and the evaluation methods applied to them; (ii) connect frontier AI evaluations with established quantitative risk analysis techniques drawn from engineering and safety science; and (iii) foreground how what counts as AI capability is contingent upon the measurement operations and scales we elect to use. We sketch a layered measurement stack, distinguish direct from indirect observables, and signpost how these ingredients provide a pathway toward a unified, calibratable taxonomy of AI phenomena.
Related papers
- General Scales Unlock AI Evaluation with Explanatory and Predictive Power [57.7995945974989]
Benchmarking has guided progress in AI, but it has offered limited explanatory and predictive power for general-purpose AI systems. We introduce general scales for AI evaluation that can explain what common AI benchmarks really measure. Our fully-automated methodology builds on 18 newly-crafted rubrics that place instance demands on general scales that do not saturate.
arXiv Detail & Related papers (2025-03-09T01:13:56Z)
- Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge [78.35388859345056]
We argue that the ML community would benefit from learning from and drawing on the social sciences when developing measurement instruments for evaluating GenAI systems. We present a four-level framework, grounded in measurement theory from the social sciences, for measuring concepts related to the capabilities, behaviors, and impacts of GenAI systems.
arXiv Detail & Related papers (2025-02-01T21:09:51Z)
- Evaluating Generative AI Systems is a Social Science Measurement Challenge [78.35388859345056]
We present a framework for measuring concepts related to the capabilities, impacts, opportunities, and risks of GenAI systems.
The framework distinguishes between four levels: the background concept, the systematized concept, the measurement instrument(s), and the instance-level measurements themselves.
arXiv Detail & Related papers (2024-11-17T02:35:30Z)
- An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references.
Experimental results show which of these metrics produce highly correlated results, indicating potential redundancy.
arXiv Detail & Related papers (2023-05-25T08:07:07Z)
- Interpretable Uncertainty Quantification in AI for HEP [2.922388615593672]
Estimating uncertainty is at the core of performing scientific measurements in HEP.
The goal of uncertainty quantification (UQ) is inextricably linked to the question, "how do we physically and statistically interpret these uncertainties?"
For artificial intelligence (AI) applications in HEP, there are several areas where interpretable methods for UQ are essential.
arXiv Detail & Related papers (2022-08-05T17:20:27Z)
- An Objective Metric for Explainable AI: How and Why to Estimate the Degree of Explainability [3.04585143845864]
We present a new model-agnostic metric to measure the Degree of eXplainability of correct information in an objective way.
We designed a few experiments and a user-study on two realistic AI-based systems for healthcare and finance.
arXiv Detail & Related papers (2021-09-11T17:44:13Z)
- Measuring Ethics in AI with AI: A Methodology and Dataset Construction [1.6861004263551447]
We propose to use such newfound capabilities of AI technologies to augment our AI measuring capabilities.
We do so by training a model to classify publications related to ethical issues and concerns.
We highlight the implications of AI metrics, in particular their contribution towards developing trustful and fair AI-based tools and technologies.
arXiv Detail & Related papers (2021-07-26T00:26:12Z)
- A Comparative Approach to Explainable Artificial Intelligence Methods in Application to High-Dimensional Electronic Health Records: Examining the Usability of XAI [0.0]
XAI aims to produce a demonstrative factor of trust, which for human subjects is achieved through communicative means.
The ideology behind trusting a machine to tend towards the livelihood of a human poses an ethical conundrum.
XAI methods produce visualizations of the feature contributions towards a given model's output on both a local and global level.
arXiv Detail & Related papers (2021-03-08T18:15:52Z)
- Neuro-symbolic Architectures for Context Understanding [59.899606495602406]
We propose the use of hybrid AI methodology as a framework for combining the strengths of data-driven and knowledge-driven approaches.
Specifically, we inherit the concept of neuro-symbolism as a way of using knowledge-bases to guide the learning progress of deep neural networks.
arXiv Detail & Related papers (2020-03-09T15:04:07Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.