The Reasons that Agents Act: Intention and Instrumental Goals
- URL: http://arxiv.org/abs/2402.07221v2
- Date: Thu, 15 Feb 2024 11:45:37 GMT
- Title: The Reasons that Agents Act: Intention and Instrumental Goals
- Authors: Francis Rhys Ward and Matt MacDermott and Francesco Belardinelli and
Francesca Toni and Tom Everitt
- Abstract summary: There is no universally accepted theory of intention applicable to AI agents.
We operationalise the intention with which an agent acts, relating it to the reasons for which it chooses its decision.
Our definition captures the intuitive notion of intent and satisfies desiderata set out by past work.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Intention is an important and challenging concept in AI. It is important
because it underlies many other concepts we care about, such as agency,
manipulation, legal responsibility, and blame. However, ascribing intent to AI
systems is contentious, and there is no universally accepted theory of
intention applicable to AI agents. We operationalise the intention with which
an agent acts, relating it to the reasons for which it chooses its decision. We introduce a
formal definition of intention in structural causal influence models, grounded
in the philosophy literature on intent and applicable to real-world machine
learning systems. Through a number of examples and results, we show that our
definition captures the intuitive notion of intent and satisfies desiderata
set out by past work. In addition, we show how our definition relates to past
concepts, including actual causality, and the notion of instrumental goals,
which is a core idea in the literature on safe AI agents. Finally, we
demonstrate how our definition can be used to infer the intentions of
reinforcement learning agents and language models from their behaviour.
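The abstract stops short of stating the formal definition itself. As a rough, non-authoritative sketch of the setting (the notation below is our illustration, not the paper's), a structural causal influence model extends a structural causal model by partitioning its variables into chance variables $\boldsymbol{X}$, decision variables $\boldsymbol{D}$, and utility variables $\boldsymbol{U}$:

$$ \mathcal{M} \;=\; \big\langle\, \boldsymbol{X} \cup \boldsymbol{D} \cup \boldsymbol{U},\; \{f_V\}_{V \notin \boldsymbol{D}},\; P(\boldsymbol{\varepsilon}) \,\big\rangle $$

where each non-decision variable $V$ is set by a structural function $f_V$ of its parents and exogenous noise $\boldsymbol{\varepsilon}$, and the agent chooses a policy $\pi$ to set its decisions. On one intuitive gloss of the paper's idea (again a sketch, not the authors' statement), $\pi$ intends an outcome $O = o$ when bringing about $o$ is among the reasons $\pi$ was chosen: for instance, if an intervention guaranteeing $O = o$ would let the agent deviate from $\pi$ at no cost in expected utility, then $o$ plausibly figured in the choice of $\pi$.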
Related papers
- Using AI Alignment Theory to understand the potential pitfalls of regulatory frameworks
This paper critically examines the European Union's Artificial Intelligence Act (EU AI Act).
It uses insights from Alignment Theory (AT) research, which focuses on the potential pitfalls of technical alignment in Artificial Intelligence.
As we apply these concepts to the EU AI Act, we uncover potential vulnerabilities and areas for improvement in the regulation.
arXiv Detail & Related papers (2024-10-10T17:38:38Z)
- Position Paper: Agent AI Towards a Holistic Intelligence
We emphasize developing Agent AI -- an embodied system that integrates large foundation models into agent actions.
In this paper, we propose a novel large action model to achieve embodied intelligent behavior, the Agent Foundation Model.
arXiv Detail & Related papers (2024-02-28T16:09:56Z)
- Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL).
This paper presents a general framework for integrating and learning structured reasoning in AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z)
- Honesty Is the Best Policy: Defining and Mitigating AI Deception
We focus on the problem that agents might deceive in order to achieve their goals.
We introduce a formal definition of deception in structural causal games.
We show, experimentally, that these results can be used to mitigate deception in reinforcement learning agents and language models.
arXiv Detail & Related papers (2023-12-03T11:11:57Z)
- Sensible AI: Re-imagining Interpretability and Explainability using Sensemaking Theory
We propose an alternate framework for interpretability grounded in Weick's sensemaking theory.
We use an application of sensemaking in organizations as a template for discussing design guidelines for Sensible AI.
arXiv Detail & Related papers (2022-05-10T17:20:44Z)
- Cybertrust: From Explainable to Actionable and Interpretable AI (AI2)
Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations.
It will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making.
arXiv Detail & Related papers (2022-01-26T18:53:09Z)
- Automated Machine Learning, Bounded Rationality, and Rational Metareasoning
We will look at automated machine learning (AutoML) and related problems from the perspective of bounded rationality.
Taking actions under bounded resources requires an agent to reflect on how to use these resources in an optimal way.
arXiv Detail & Related papers (2021-09-10T09:10:20Z)
- Intensional Artificial Intelligence: From Symbol Emergence to Explainable and Empathetic AI
We argue that an explainable artificial intelligence must possess a rationale for its decisions, be able to infer the purpose of observed behaviour, and be able to explain its decisions in the context of what its audience understands and intends.
To communicate that rationale requires natural language, a means of encoding and decoding perceptual states.
We propose a theory of meaning in which, to acquire language, an agent should model the world a language describes rather than the language itself.
arXiv Detail & Related papers (2021-04-23T13:13:46Z)
- Argumentation-based Agents that Explain their Decisions
We focus on how an extended model of BDI (Beliefs-Desires-Intentions) agents can generate explanations of their reasoning.
Our proposal is based on argumentation theory: we use arguments to represent the reasons that lead an agent to make a decision.
We propose two types of explanation: partial and complete.
arXiv Detail & Related papers (2020-09-13T02:08:10Z)
- A general framework for scientifically inspired explanations in AI
We instantiate the concept of the structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental model" of any AI system, so that interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
- The Pragmatic Turn in Explainable Artificial Intelligence (XAI)
I argue that the search for explainable models and interpretable decisions in AI must be reformulated in terms of the broader project of offering a pragmatic and naturalistic account of understanding in AI.
I conclude that interpretative or approximation models not only provide the best way to achieve the objectual understanding of a machine learning model, but are also a necessary condition to achieve post-hoc interpretability.
arXiv Detail & Related papers (2020-02-22T01:40:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.