Related papers: Towards a Benchmark for Scientific Understanding in Humans and Machines

Towards a Benchmark for Scientific Understanding in Humans and Machines

URL: http://arxiv.org/abs/2304.10327v2
Date: Fri, 21 Apr 2023 08:57:06 GMT
Title: Towards a Benchmark for Scientific Understanding in Humans and Machines
Authors: Kristian Gonzalez Barman, Sascha Caron, Tom Claassen, Henk de Regt
Abstract summary: We propose a framework to create a benchmark for scientific understanding, utilizing tools from philosophy of science. We adopt a behavioral notion according to which genuine understanding should be recognized as an ability to perform certain tasks.
Score: 2.714583452862024
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Scientific understanding is a fundamental goal of science, allowing us to explain the world. There is currently no good way to measure the scientific understanding of agents, whether these be humans or Artificial Intelligence systems. Without a clear benchmark, it is challenging to evaluate and compare different levels of and approaches to scientific understanding. In this Roadmap, we propose a framework to create a benchmark for scientific understanding, utilizing tools from philosophy of science. We adopt a behavioral notion according to which genuine understanding should be recognized as an ability to perform certain tasks. We extend this notion by considering a set of questions that can gauge different levels of scientific understanding, covering information retrieval, the capability to arrange information to produce an explanation, and the ability to infer how things would be different under different circumstances. The Scientific Understanding Benchmark (SUB), which is formed by a set of these tests, allows for the evaluation and comparison of different approaches. Benchmarking plays a crucial role in establishing trust, ensuring quality control, and providing a basis for performance evaluation. By aligning machine and human scientific understanding we can improve their utility, ultimately advancing scientific understanding and helping to discover new insights within machines.

Related papers

Interpretable Machine Learning in Physics: A Review [10.77934040629518]
We aim to establish interpretable machine learning as a core research focus in science. We categorize different aspects of interpretability, discuss machine learning models in terms of both interpretability and performance. We highlight recent advances in interpretable machine learning across many subfields of physics.
arXiv Detail & Related papers (2025-03-30T22:44:40Z)
Scaling Laws in Scientific Discovery with AI and Robot Scientists [72.3420699173245]
An autonomous generalist scientist (AGS) concept combines agentic AI and embodied robotics to automate the entire research lifecycle. AGS aims to significantly reduce the time and resources needed for scientific discovery. As these autonomous systems become increasingly integrated into the research process, we hypothesize that scientific discovery might adhere to new scaling laws.
arXiv Detail & Related papers (2025-03-28T14:00:27Z)
Open Problems in Mechanistic Interpretability [61.44773053835185]
Mechanistic interpretability aims to understand the computational mechanisms underlying neural networks' capabilities. Despite recent progress toward these goals, there are many open problems in the field that require solutions.
arXiv Detail & Related papers (2025-01-27T20:57:18Z)
Explain the Black Box for the Sake of Science: the Scientific Method in the Era of Generative Artificial Intelligence [0.9065034043031668]
The scientific method is the cornerstone of human progress across all branches of the natural and applied sciences. We argue that human complex reasoning for scientific discovery remains of vital importance, at least before the advent of artificial general intelligence. Knowing what data AI systems deemed important to make decisions can be a point of contact with domain experts and scientists.
arXiv Detail & Related papers (2024-06-15T08:34:42Z)
Diverse Explanations From Data-Driven and Domain-Driven Perspectives in the Physical Sciences [4.442043151145212]
This Perspective explores the sources and implications of diverse explanations in machine learning applications for physical sciences. We examine how different models, explanation methods, levels of feature attribution, and stakeholder needs can result in varying interpretations of ML outputs. Our analysis underscores the importance of considering multiple perspectives when interpreting ML models in scientific contexts.
arXiv Detail & Related papers (2024-02-01T05:28:28Z)
Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology. We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table. It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z)
Language Cognition and Language Computation -- Human and Machine Language Understanding [51.56546543716759]
Language understanding is a key scientific issue in the fields of cognitive and computer science. Can a combination of the disciplines offer new insights for building intelligent language models?
arXiv Detail & Related papers (2023-01-12T02:37:00Z)
Towards Human Cognition Level-based Experiment Design for Counterfactual Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have turned to a more pragmatic explanation approach for better understanding. An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback. We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z)
Towards Benchmarking Explainable Artificial Intelligence Methods [0.0]
We use philosophy of science theories as an analytical lens with the goal of revealing, what can be expected, and more importantly, not expected, from methods that aim to explain decisions promoted by a neural network. By conducting a case study we investigate a selection of explainability method's performance over two mundane domains, animals and headgear. We lay bare that the usefulness of these methods relies on human domain knowledge and our ability to understand, generalise and reason.
arXiv Detail & Related papers (2022-08-25T14:28:30Z)
Satellite Image and Machine Learning based Knowledge Extraction in the Poverty and Welfare Domain [0.0]
We review the literature focusing on three core elements relevant in this context: transparency, interpretability, and explainability. We argue that explainability is essential to support wider dissemination and acceptance of this research.
arXiv Detail & Related papers (2022-03-02T12:38:20Z)
Active Inference in Robotics and Artificial Agents: Survey and Challenges [51.29077770446286]
We review the state-of-the-art theory and implementations of active inference for state-estimation, control, planning and learning. We showcase relevant experiments that illustrate its potential in terms of adaptation, generalization and robustness.
arXiv Detail & Related papers (2021-12-03T12:10:26Z)
AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
Understanding understanding: a renormalization group inspired model of (artificial) intelligence [0.0]
This paper is about the meaning of understanding in scientific and in artificial intelligent systems. We give a mathematical definition of the understanding, where, contrary to the common wisdom, we define the probability space on the input set. We show, how scientific understanding fits into this framework, and demonstrate, what is the difference between a scientific task and pattern recognition.
arXiv Detail & Related papers (2020-10-26T11:11:46Z)
A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented. This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.