Towards a Benchmark for Scientific Understanding in Humans and Machines
- URL: http://arxiv.org/abs/2304.10327v2
- Date: Fri, 21 Apr 2023 08:57:06 GMT
- Title: Towards a Benchmark for Scientific Understanding in Humans and Machines
- Authors: Kristian Gonzalez Barman, Sascha Caron, Tom Claassen, Henk de Regt
- Abstract summary: We propose a framework to create a benchmark for scientific understanding, utilizing tools from philosophy of science.
We adopt a behavioral notion according to which genuine understanding should be recognized as an ability to perform certain tasks.
- Score: 2.714583452862024
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Scientific understanding is a fundamental goal of science, allowing us to
explain the world. There is currently no good way to measure the scientific
understanding of agents, whether these be humans or Artificial Intelligence
systems. Without a clear benchmark, it is challenging to evaluate and compare
different levels of and approaches to scientific understanding. In this
Roadmap, we propose a framework to create a benchmark for scientific
understanding, utilizing tools from philosophy of science. We adopt a
behavioral notion according to which genuine understanding should be recognized
as an ability to perform certain tasks. We extend this notion by considering a
set of questions that can gauge different levels of scientific understanding,
covering information retrieval, the capability to arrange information to
produce an explanation, and the ability to infer how things would be different
under different circumstances. The Scientific Understanding Benchmark (SUB),
which is formed by a set of these tests, allows for the evaluation and
comparison of different approaches. Benchmarking plays a crucial role in
establishing trust, ensuring quality control, and providing a basis for
performance evaluation. By aligning machine and human scientific understanding,
we can improve their utility and ultimately advance scientific understanding,
helping to uncover new insights held within machines.
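The three question levels described in the abstract (information retrieval, arranging information into an explanation, counterfactual inference) lend themselves to a machine-readable test format. Below is a minimal sketch, in Python, of how a SUB-style test item and a naive scorer might look; the `SUBItem` schema, field names, and keyword-overlap scoring are illustrative assumptions, not the paper's actual specification, which would require expert-graded rubrics.

```python
# Minimal sketch of a Scientific Understanding Benchmark (SUB) test item.
# The schema and scoring below are illustrative assumptions, not the
# specification from the paper.
from dataclasses import dataclass
from typing import Callable

# The three levels of understanding the Roadmap distinguishes.
LEVELS = ("information_retrieval", "explanation", "counterfactual")

@dataclass
class SUBItem:
    level: str        # one of LEVELS
    question: str     # the prompt posed to the agent (human or AI)
    rubric: set[str]  # key concepts a sound answer should mention

def score_answer(item: SUBItem, answer: str) -> float:
    """Naive keyword-overlap score in [0, 1]; a real benchmark would
    need expert grading or a far more robust automatic metric."""
    mentioned = {c for c in item.rubric if c.lower() in answer.lower()}
    return len(mentioned) / len(item.rubric)

def evaluate(items: list[SUBItem], agent: Callable[[str], str]) -> dict[str, float]:
    """Average score per level, so retrieval, explanation, and
    counterfactual ability can be compared separately."""
    totals: dict[str, list[float]] = {lvl: [] for lvl in LEVELS}
    for item in items:
        totals[item.level].append(score_answer(item, agent(item.question)))
    return {lvl: sum(s) / len(s) for lvl, s in totals.items() if s}

# One toy item per level, for a familiar topic (the ideal gas law):
items = [
    SUBItem("information_retrieval", "State the ideal gas law.",
            {"pressure", "volume", "temperature"}),
    SUBItem("explanation",
            "Explain why heating a sealed rigid container raises its pressure.",
            {"kinetic", "collisions", "temperature"}),
    SUBItem("counterfactual",
            "How would the pressure change if the volume were doubled at constant temperature?",
            {"halve", "boyle"}),
]
```

The point of the sketch is the separation of levels: an agent that scores well on retrieval but poorly on the counterfactual items would, on the paper's behavioral notion, not count as genuinely understanding the topic.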
Related papers
- Explain the Black Box for the Sake of Science: the Scientific Method in the Era of Generative Artificial Intelligence [0.9065034043031668]
The scientific method is the cornerstone of human progress across all branches of the natural and applied sciences.
We argue that human complex reasoning for scientific discovery remains of vital importance, at least before the advent of artificial general intelligence.
Knowing what data AI systems deem important when making decisions can serve as a point of contact with domain experts and scientists.
arXiv Detail & Related papers (2024-06-15T08:34:42Z)
- Diverse Explanations From Data-Driven and Domain-Driven Perspectives in the Physical Sciences [4.442043151145212]
This Perspective explores the sources and implications of diverse explanations in machine learning applications for physical sciences.
We examine how different models, explanation methods, levels of feature attribution, and stakeholder needs can result in varying interpretations of ML outputs.
Our analysis underscores the importance of considering multiple perspectives when interpreting ML models in scientific contexts.
arXiv Detail & Related papers (2024-02-01T05:28:28Z)
- Machine Psychology [54.287802134327485]
We argue that a fruitful direction for research is engaging large language models in behavioral experiments inspired by psychology.
We highlight theoretical perspectives, experimental paradigms, and computational analysis techniques that this approach brings to the table.
It paves the way for a "machine psychology" for generative artificial intelligence (AI) that goes beyond performance benchmarks.
arXiv Detail & Related papers (2023-03-24T13:24:41Z)
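To make the "behavioral experiment" idea in the machine-psychology entry above concrete, here is a minimal sketch of a psychology-style experiment run against a language model. `query_model` is a hypothetical stand-in for whatever model API one uses; the Linda-problem vignette and choice tallying are classic conjunction-fallacy methodology, not anything prescribed by that paper.

```python
# Minimal sketch of a "machine psychology" behavioral experiment:
# probing a language model for the conjunction fallacy (Tversky &
# Kahneman's Linda problem). query_model is a hypothetical stand-in
# for a real model API; wire it to an actual client before running.
from collections import Counter

VIGNETTE = (
    "Linda is 31, single, outspoken, and very bright. She majored in "
    "philosophy and was deeply concerned with discrimination and "
    "social justice.\n"
    "Which is more probable?\n"
    "(a) Linda is a bank teller.\n"
    "(b) Linda is a bank teller and is active in the feminist movement.\n"
    "Answer with a single letter."
)

def query_model(prompt: str) -> str:
    """Hypothetical model call; replace with a real LLM client."""
    raise NotImplementedError

def run_experiment(n_trials: int = 50) -> Counter:
    """Repeat the vignette and tally answers, exactly as one would
    tally forced-choice responses from human participants."""
    counts: Counter = Counter()
    for _ in range(n_trials):
        reply = query_model(VIGNETTE).strip().lower()
        counts["b" if reply.startswith("b")
               else "a" if reply.startswith("a")
               else "other"] += 1
    return counts

# A high proportion of "b" answers would mirror the human conjunction
# fallacy: a behavioral finding about the model that no accuracy-style
# performance benchmark would surface.
```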
- Language Cognition and Language Computation -- Human and Machine Language Understanding [51.56546543716759]
Language understanding is a key scientific issue in the fields of cognitive and computer science.
Can a combination of the disciplines offer new insights for building intelligent language models?
arXiv Detail & Related papers (2023-01-12T02:37:00Z)
- Towards Human Cognition Level-based Experiment Design for Counterfactual Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have shifted to a more pragmatic explanation approach in pursuit of better understanding.
An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback.
We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z)
- Towards Benchmarking Explainable Artificial Intelligence Methods [0.0]
We use philosophy of science theories as an analytical lens, with the goal of revealing what can, and more importantly what cannot, be expected from methods that aim to explain decisions made by a neural network.
By conducting a case study we investigate the performance of a selection of explainability methods across two mundane domains, animals and headgear.
We lay bare that the usefulness of these methods relies on human domain knowledge and our ability to understand, generalise and reason.
arXiv Detail & Related papers (2022-08-25T14:28:30Z)
- Satellite Image and Machine Learning based Knowledge Extraction in the Poverty and Welfare Domain [0.0]
We review the literature focusing on three core elements relevant in this context: transparency, interpretability, and explainability.
We argue that explainability is essential to support wider dissemination and acceptance of this research.
arXiv Detail & Related papers (2022-03-02T12:38:20Z)
- Active Inference in Robotics and Artificial Agents: Survey and Challenges [51.29077770446286]
We review the state-of-the-art theory and implementations of active inference for state-estimation, control, planning and learning.
We showcase relevant experiments that illustrate its potential in terms of adaptation, generalization and robustness.
arXiv Detail & Related papers (2021-12-03T12:10:26Z)
- AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions.
Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning.
We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
- Understanding understanding: a renormalization group inspired model of (artificial) intelligence [0.0]
This paper is about the meaning of understanding in scientific and in artificial intelligence systems.
We give a mathematical definition of understanding in which, contrary to common wisdom, the probability space is defined on the input set.
We show how scientific understanding fits into this framework and demonstrate the difference between a scientific task and pattern recognition.
arXiv Detail & Related papers (2020-10-26T11:11:46Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental model" of any AI system, so that interaction with the user can provide information on demand and come closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.