Modulation of viability signals for self-regulatory control
- URL: http://arxiv.org/abs/2007.09297v2
- Date: Tue, 13 Oct 2020 11:57:40 GMT
- Title: Modulation of viability signals for self-regulatory control
- Authors: Alvaro Ovalle and Simon M. Lucas
- Abstract summary: We revisit the role of instrumental value as a driver of adaptive behavior.
For reinforcement learning tasks, the distribution of preferences replaces the notion of reward.
- Score: 1.370633147306388
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We revisit the role of instrumental value as a driver of adaptive behavior.
In active inference, instrumental or extrinsic value is quantified by the
information-theoretic surprisal of a set of observations measuring the extent
to which those observations conform to prior beliefs or preferences. That is,
an agent is expected to seek the type of evidence that is consistent with its
own model of the world. For reinforcement learning tasks, the distribution of
preferences replaces the notion of reward. We explore a scenario in which the
agent learns this distribution in a self-supervised manner. In particular, we
highlight the distinction between observations induced by the environment and
those pertaining more directly to the continuity of an agent in time. We
evaluate our methodology in a dynamic environment with discrete time and
actions, first with a surprisal-minimizing model-free agent (in the RL sense),
and then expanding to the model-based case to minimize the expected free
energy.
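As a loose illustration of the core idea, the sketch below (hypothetical names and a diagonal-Gaussian modeling choice; not the authors' implementation) fits a preference distribution to observations collected while the agent remains viable, then treats the negative surprisal of new observations as a pseudo-reward:

```python
# Minimal sketch, assuming preferences are modeled as a diagonal Gaussian
# over observation features; all names here are illustrative.
import numpy as np

class PreferenceModel:
    """Self-supervised model of the observations an agent prefers to see."""

    def __init__(self, obs_dim: int):
        self.mu = np.zeros(obs_dim)
        self.var = np.ones(obs_dim)

    def fit(self, preferred_obs: np.ndarray) -> None:
        # Fit to observations gathered while the agent stayed "viable"
        # (e.g. alive, or within homeostatic bounds).
        self.mu = preferred_obs.mean(axis=0)
        self.var = preferred_obs.var(axis=0) + 1e-6

    def surprisal(self, obs: np.ndarray) -> float:
        # Negative log-density of obs under the preference distribution.
        return float(0.5 * np.sum(
            np.log(2 * np.pi * self.var) + (obs - self.mu) ** 2 / self.var
        ))

def pseudo_reward(model: PreferenceModel, obs: np.ndarray) -> float:
    # A surprisal-minimizing agent treats low surprisal as high reward,
    # so the preference distribution plays the role of the reward function.
    return -model.surprisal(obs)
```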
Related papers
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z) - Hierarchical Imitation Learning for Stochastic Environments [31.64016324441371]
Existing methods that improve distributional realism typically rely on hierarchical policies.
We propose Robust Type Conditioning (RTC), which eliminates the distribution shift with adversarial training under environmental stochasticity.
Experiments on two domains, including the large-scale Open Motion dataset, show improved distributional realism while maintaining or improving task performance compared to state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-25T10:10:34Z) - A Neural Active Inference Model of Perceptual-Motor Learning [62.39667564455059]
The active inference framework (AIF) is a promising new computational framework grounded in contemporary neuroscience.
In this study, we test the ability of the AIF to capture the role of anticipation in the visual guidance of action in humans.
We present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy.
arXiv Detail & Related papers (2022-11-16T20:00:38Z) - Control-Aware Prediction Objectives for Autonomous Driving [78.19515972466063]
We present control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner to be differentiable.
We propose two types of importance weights that weight the predictive likelihood: one using an attention model between agents, and another based on control variation when exchanging predicted trajectories for ground truth trajectories.
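The second weighting scheme can be pictured roughly as follows; `run_planner` is a hypothetical stand-in for the downstream controller, and this is a sketch of the idea rather than the paper's implementation:

```python
import numpy as np

def control_variation_weight(run_planner, predicted_traj, ground_truth_traj) -> float:
    # Weight a prediction by how much the planner's control output changes
    # when the predicted trajectory is swapped for the ground truth:
    # prediction errors that would not alter the control get low weight.
    u_pred = run_planner(predicted_traj)     # control vector given prediction
    u_true = run_planner(ground_truth_traj)  # control vector given ground truth
    return float(np.linalg.norm(u_pred - u_true))

def control_aware_loss(log_likelihoods, weights) -> float:
    # Importance-weighted predictive likelihood over a batch of trajectories.
    w = np.asarray(weights, dtype=float)
    w = w / (w.sum() + 1e-8)
    return float(-np.sum(w * np.asarray(log_likelihoods)))
```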
arXiv Detail & Related papers (2022-04-28T07:37:21Z) - Differential Assessment of Black-Box AI Agents [29.98710357871698]
We propose a novel approach to differentially assess black-box AI agents that have drifted from their previously known models.
We leverage sparse observations of the drifted agent's current behavior and knowledge of its initial model to generate an active querying policy.
Empirical evaluation shows that our approach is much more efficient than re-learning the agent model from scratch.
arXiv Detail & Related papers (2022-03-24T17:48:58Z) - Information is Power: Intrinsic Control via Information Capture [110.3143711650806]
We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent to both gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states.
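A toy version of that objective for a discretized latent space might look like the following count-based sketch (illustrative only; the paper uses a learned latent state-space model):

```python
import numpy as np
from collections import Counter

class VisitationEntropyReward:
    """Count-based estimate of the visitation distribution over discrete
    latent states. Since E_z[log p(z)] = -H(p), maximizing the intrinsic
    reward log p(z) minimizes the visitation entropy."""

    def __init__(self, n_codes: int):
        self.counts = Counter()
        self.total = 0
        self.n_codes = n_codes  # assumed size of the discrete latent space

    def update(self, z: int) -> None:
        self.counts[z] += 1
        self.total += 1

    def intrinsic_reward(self, z: int) -> float:
        # Laplace-smoothed visitation probability of latent state z.
        p = (self.counts[z] + 1) / (self.total + self.n_codes)
        return float(np.log(p))
```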
arXiv Detail & Related papers (2021-12-07T18:50:42Z) - Deceptive Decision-Making Under Uncertainty [25.197098169762356]
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks.
By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals.
We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies.
arXiv Detail & Related papers (2021-09-14T14:56:23Z) - Continuous Homeostatic Reinforcement Learning for Self-Regulated Autonomous Agents [0.0]
We propose an extension of the homeostatic reinforcement learning theory to a continuous environment in space and time.
Inspired by the self-regulating mechanisms abundantly present in biology, we also introduce a model for the dynamics of the agent's internal state.
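In discrete-time homeostatic RL, reward is typically defined as the reduction of a drive measuring the distance between the internal state and its setpoint; a minimal sketch, assuming a quadratic drive:

```python
import numpy as np

def drive(internal_state: np.ndarray, setpoint: np.ndarray) -> float:
    # Drive = distance of the homeostatic internal state from its setpoint;
    # the quadratic form is one common illustrative choice.
    return float(np.sum((internal_state - setpoint) ** 2))

def homeostatic_reward(h_t: np.ndarray, h_next: np.ndarray,
                       setpoint: np.ndarray) -> float:
    # Reward is the drive reduction: actions that move the internal state
    # toward the setpoint are rewarded, those that move it away are punished.
    return drive(h_t, setpoint) - drive(h_next, setpoint)
```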
arXiv Detail & Related papers (2021-09-14T11:03:58Z) - Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
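The separation between marginals and dependence structure can be illustrated with a Gaussian copula (a sketch under assumed distributions, not the paper's model):

```python
import numpy as np
from scipy.stats import norm

def sample_joint_actions(corr, marginal_ppfs, seed=None):
    """Gaussian-copula sketch: `corr` captures only the inter-agent
    dependence, while `marginal_ppfs` (one inverse CDF per agent)
    capture each agent's individual action distribution."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(marginal_ppfs)), corr)
    u = norm.cdf(z)  # correlated uniforms produced by the copula
    return [ppf(u_i) for ppf, u_i in zip(marginal_ppfs, u)]

# Usage: two agents with different marginal behaviors but coordinated actions.
actions = sample_joint_actions(
    corr=np.array([[1.0, 0.8], [0.8, 1.0]]),
    marginal_ppfs=[norm(loc=0.0, scale=1.0).ppf, norm(loc=2.0, scale=0.5).ppf],
)
```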
arXiv Detail & Related papers (2021-07-10T03:49:41Z) - Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
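That error term is the familiar cross-entropy decomposition: the expected prediction reward under belief b equals -H(b) minus the KL divergence from b to the predicted distribution. A small numerical check (illustrative, not the paper's derivation):

```python
import numpy as np

def entropy(p):
    return -np.sum(p * np.log(p))

def kl(p, q):
    return np.sum(p * np.log(p / q))

b = np.array([0.7, 0.2, 0.1])      # agent's belief over hidden states
b_hat = np.array([0.6, 0.3, 0.1])  # predicted distribution earning the reward

# E_{s~b}[log b_hat(s)] = -H(b) - KL(b || b_hat): the gap between negative
# entropy and the expected prediction reward is exactly the KL divergence.
expected_prediction_reward = np.sum(b * np.log(b_hat))
assert np.isclose(expected_prediction_reward, -entropy(b) - kl(b, b_hat))
```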
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.