Intention-aware policy graphs: answering what, how, and why in opaque agents
- URL: http://arxiv.org/abs/2409.19038v1
- Date: Fri, 27 Sep 2024 09:31:45 GMT
- Title: Intention-aware policy graphs: answering what, how, and why in opaque agents
- Authors: Victor Gimenez-Abalos, Sergio Alvarez-Napagao, Adrian Tormos, Ulises Cortés, Javier Vázquez-Salceda
- Abstract summary: Agents are a special kind of AI-based software in that they interact in complex environments and have increased potential for emergent behaviour.
We propose a Probabilistic Graphical Model along with a pipeline for designing such a model.
We contribute measurements that evaluate the interpretability and reliability of the explanations provided.
This model can be constructed by taking partial observations of the agent's actions and world states.
- Score: 0.1398098625978622
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Agents are a special kind of AI-based software in that they interact in complex environments and have increased potential for emergent behaviour. Explaining such emergent behaviour is key to deploying trustworthy AI, but the increasing complexity and opaque nature of many agent implementations make this hard. In this work, we propose a Probabilistic Graphical Model along with a pipeline for designing such a model -- by which the behaviour of an agent can be deliberated about -- and for computing a robust numerical value for the intentions the agent has at any moment. We contribute measurements that evaluate the interpretability and reliability of the explanations provided, and that enable explainability questions such as 'what do you want to do now?' (e.g. deliver soup), 'how do you plan to do it?' (e.g. returning a plan that considers its skills and the world), and 'why would you take this action at this state?' (e.g. explaining how that furthers or hinders its own goals). This model can be constructed by taking partial observations of the agent's actions and world states, and we provide an iterative workflow for increasing the proposed measurements through better design and/or by pointing out irrational agent behaviour.
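As a loose illustration of the kind of model the abstract describes (a sketch under stated assumptions, not the authors' pipeline), the snippet below estimates a discrete policy from partial (state, action) observations and scores hypothetical intentions against it. The goal set, the `furthers` desirability table, the toy observation log, and the scoring rule are all assumptions made for the example.

```python
# Hypothetical sketch of a discrete intention model: estimate P(action | state)
# from partial (state, action) logs, attach a hand-specified table of how
# actions further goals, and score intentions. All names (the goal set, the
# desirability values, the toy log) are illustrative assumptions, not the
# authors' implementation.
from collections import Counter, defaultdict

# Partial observations of the opaque agent: (state, action) pairs.
log = [("at_stove", "cook"), ("at_stove", "cook"), ("has_soup", "move"),
       ("at_table", "serve"), ("has_soup", "move"), ("at_table", "serve")]

# Empirical policy graph: P(action | state) from counts.
counts = defaultdict(Counter)
for state, action in log:
    counts[state][action] += 1
policy = {s: {a: n / sum(c.values()) for a, n in c.items()}
          for s, c in counts.items()}

# Assumed intention model: how much each (state, action) furthers a goal.
# In the paper this comes from the PGM; here it is a toy lookup table.
furthers = {"deliver_soup": {("at_stove", "cook"): 0.9,
                             ("has_soup", "move"): 0.8,
                             ("at_table", "serve"): 1.0},
            "clean_kitchen": {("at_stove", "cook"): 0.1}}

def intention_scores(state, action):
    """'Why this action here?': score each goal by how the action furthers it,
    weighted by how likely the agent was to take it (a naive proxy)."""
    p_a = policy.get(state, {}).get(action, 0.0)
    return {g: p_a * tbl.get((state, action), 0.0)
            for g, tbl in furthers.items()}

print(intention_scores("at_table", "serve"))
# e.g. {'deliver_soup': 1.0, 'clean_kitchen': 0.0} -> "to deliver soup"
```

Loosely, `policy` addresses 'how', the highest-scoring goal addresses 'what', and the per-goal breakdown addresses 'why'.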
Related papers
- Aligning Agents like Large Language Models [8.873319874424167]
Training agents to behave as desired in complex 3D environments from high-dimensional sensory information is challenging.
We draw an analogy between the undesirable behaviors of imitation learning agents and the unhelpful responses of unaligned large language models (LLMs)
We demonstrate that we can align our agent to consistently perform the desired mode, while providing insights and advice for successfully applying this approach to training agents.
arXiv Detail & Related papers (2024-06-06T16:05:45Z)
- Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL).
This paper presents a general framework for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Differential Assessment of Black-Box AI Agents [29.98710357871698]
We propose a novel approach to differentially assess black-box AI agents that have drifted from their previously known models.
We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy.
Empirical evaluation shows that our approach is much more efficient than re-learning the agent model from scratch.
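A minimal sketch of the idea as summarized above, with hypothetical states, policies, and a stubbed query function standing in for the black-box agent; the actual approach builds an active querying policy, whereas this naive version only checks for disagreement.

```python
# Hedged sketch of differential assessment: given a known initial model and
# sparse observations of a drifted agent, query only where they disagree.
# The models, states, and the drifted_agent() stub are illustrative assumptions.
initial_model = {"s1": "a1", "s2": "a2", "s3": "a1"}   # known old policy
observations  = {"s1": "a1", "s3": "a2"}               # sparse new behavior

def drifted_agent(state):
    """Stand-in for querying the live black-box agent."""
    return {"s1": "a1", "s2": "a3", "s3": "a2"}[state]

# 1. Flag states whose observed action already contradicts the old model.
suspect = [s for s, a in observations.items() if initial_model[s] != a]

# 2. Actively query unobserved states instead of re-learning everything.
unobserved = [s for s in initial_model if s not in observations]
revised = dict(observations)
for s in suspect + unobserved:
    revised[s] = drifted_agent(s)   # targeted queries only

print(revised)  # {'s1': 'a1', 's3': 'a2', 's2': 'a3'}
```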
arXiv Detail & Related papers (2022-03-24T17:48:58Z)
- CausalCity: Complex Simulations with Agency for Causal Discovery and Reasoning [68.74447489372037]
We present a high-fidelity simulation environment that is designed for developing algorithms for causal discovery and counterfactual reasoning.
A core component of our work is to introduce *agency*, such that it is simple to define and create complex scenarios.
We perform experiments with three state-of-the-art methods to create baselines and highlight the affordances of this environment.
arXiv Detail & Related papers (2021-06-25T00:21:41Z)
- Learning What To Do by Simulating the Past [76.86449554580291]
We show that by combining a learned feature encoder with learned inverse models, we can enable agents to simulate human actions backwards in time to infer what they must have done.
The resulting algorithm is able to reproduce a specific skill in MuJoCo environments given a single state sampled from the optimal policy for that skill.
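A toy sketch of the backward-simulation idea, assuming a deterministic one-dimensional chain and a hand-written stand-in for the learned inverse model; it only illustrates the control flow of inferring past (state, action) pairs from a single sampled state.

```python
# Minimal sketch of backward simulation with a learned inverse model:
# from a single observed state, repeatedly ask "what (previous state,
# action) could have led here?" to infer what the demonstrator did.
# The deterministic chain and the inverse table are toy assumptions.
GOAL = 5

def inverse_model(state):
    """Stand-in for the learned inverse dynamics: returns the most likely
    (previous_state, action) pair that produced `state`."""
    return (state - 1, "right")  # in this toy chain, agents move right

def simulate_past(state, horizon=3):
    """Roll the inverse model backwards to recover a plausible trajectory."""
    trajectory = []
    for _ in range(horizon):
        prev, action = inverse_model(state)
        trajectory.append((prev, action, state))
        state = prev
    return list(reversed(trajectory))

# A single state sampled from the optimal policy implies earlier choices:
print(simulate_past(GOAL))
# [(2, 'right', 3), (3, 'right', 4), (4, 'right', 5)]
```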
arXiv Detail & Related papers (2021-04-08T17:43:29Z)
- AGI Agent Safety by Iteratively Improving the Utility Function [0.0]
We present an AGI safety layer that creates a special dedicated input terminal to support the iterative improvement of an AGI agent's utility function.
We show ongoing work on mapping it to a Causal Influence Diagram (CID).
We then present the design of a learning agent, a design that wraps the safety layer around either a known machine learning system, or a potential future AGI-level learning system.
arXiv Detail & Related papers (2020-07-10T14:30:56Z)
- What can I do here? A Theory of Affordances in Reinforcement Learning [65.70524105802156]
We develop a theory of affordances for agents who learn and plan in Markov Decision Processes.
Affordances play a dual role in this setting, for example by reducing the number of actions available in any given situation.
We propose an approach to learn affordances and use it to estimate transition models that are simpler and generalize better.
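A hedged sketch of how affordances can prune an action set during planning; the `affordances` table, state encoding, and greedy planner are illustrative assumptions rather than the paper's formalism.

```python
# Sketch of affordance-restricted planning in an MDP: affordances prune the
# action set considered at each state, so there is less to learn and fewer
# candidates to evaluate when planning. All tables are toy assumptions.
ACTIONS = ["open", "push", "pull", "pick_up"]

# Affordances: which actions are even worth considering per state feature.
affordances = {"door": {"open", "push", "pull"},
               "cup":  {"pick_up"}}

def available_actions(state_feature):
    """Reduce the action set to what the state affords."""
    return [a for a in ACTIONS if a in affordances.get(state_feature, set())]

def greedy_plan(state_feature, q_values):
    """Pick the best action, but only among afforded ones."""
    options = available_actions(state_feature)
    return max(options, key=lambda a: q_values.get(a, float("-inf")))

print(greedy_plan("door", {"open": 1.0, "push": 0.2, "pick_up": 9.9}))
# 'open' -- 'pick_up' scores higher but a door does not afford it
```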
arXiv Detail & Related papers (2020-06-26T16:34:53Z)
- Probing Emergent Semantics in Predictive Agents via Question Answering [29.123837711842995]
Recent work has shown how predictive modeling can endow agents with rich knowledge of their surroundings, improving their ability to act in complex environments.
We propose question-answering as a general paradigm to decode and understand the representations that such agents develop.
We probe their internal state representations with synthetic (English) questions, without backpropagating gradients from the question-answering decoder into the agent.
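A small PyTorch sketch of the probing protocol as described: the agent's internal state is detached so the question-answering decoder trains without backpropagating into the agent. The GRU stand-in, the dimensions, and the random question encoding are assumptions for illustration.

```python
# Probing sketch: the agent's internal state is treated as a frozen feature
# vector; only the QA decoder is trained, so no gradients flow back into
# the agent. The toy dimensions, the random "agent", and the question
# encoding are illustrative assumptions.
import torch
import torch.nn as nn

agent_core = nn.GRU(input_size=8, hidden_size=32, batch_first=True)  # stand-in
qa_decoder = nn.Linear(32 + 16, 2)   # answers a yes/no synthetic question

obs = torch.randn(1, 10, 8)          # a short observation sequence
question = torch.randn(1, 16)        # encoded synthetic English question
answer = torch.tensor([1])           # ground-truth answer label

# Run the agent, then DETACH: probing must not shape the representations.
_, h = agent_core(obs)
features = h[-1].detach()            # gradients stop here

logits = qa_decoder(torch.cat([features, question], dim=-1))
loss = nn.functional.cross_entropy(logits, answer)
loss.backward()                      # updates qa_decoder only

assert agent_core.weight_ih_l0.grad is None   # agent untouched
assert qa_decoder.weight.grad is not None     # decoder learns
```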
arXiv Detail & Related papers (2020-06-01T15:27:36Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
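One way to read the 'exact error' result, using a standard information-theoretic identity (a hedged reconstruction, not necessarily the paper's exact statement): for a belief b over states and a prediction q,

```latex
% Expected (log) prediction reward versus negative belief entropy:
\mathbb{E}_{s \sim b}\big[\log q(s)\big]
  = \sum_{s} b(s)\,\log q(s)
  = -H(b) \;-\; D_{\mathrm{KL}}\big(b \,\|\, q\big)
```

Under this reading, the gap between the expected log-prediction reward and the negative entropy -H(b) is exactly the KL divergence, which vanishes when the predictor reports the belief itself (q = b).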
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.