Goal-Directed Planning by Reinforcement Learning and Active Inference
- URL: http://arxiv.org/abs/2106.09938v2
- Date: Tue, 22 Jun 2021 10:14:01 GMT
- Title: Goal-Directed Planning by Reinforcement Learning and Active Inference
- Authors: Dongqi Han, Kenji Doya and Jun Tani
- Abstract summary: We propose a novel computational framework of decision making with Bayesian inference.
Goal-directed behavior is determined from the posterior distribution of $z$ by planning.
We demonstrate the effectiveness of the proposed framework by experiments in a sensorimotor navigation task with camera observations and continuous motor actions.
- Score: 16.694117274961016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: What is the difference between goal-directed and habitual behavior? We
propose a novel computational framework of decision making with Bayesian
inference, in which everything is integrated into a single neural network
model.
The model learns to predict environmental state transitions through
self-exploration, generating motor actions by sampling stochastic internal
states $z$. Habitual behavior, obtained from the prior distribution of $z$,
is acquired by reinforcement learning. Goal-directed behavior is determined
from the posterior distribution of $z$ by planning: active inference
optimizes the past, current, and future $z$ by minimizing the variational
free energy for the desired future observation, constrained by the observed
sensory sequence. We demonstrate the effectiveness of the proposed framework through
experiments in a sensorimotor navigation task with camera observations and
continuous motor actions.
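As a concrete illustration of the planning step, here is a minimal sketch in PyTorch, assuming a hypothetical learned world model with a prior network, a recurrent transition, and an observation decoder (`model.prior`, `model.step`, `model.decode`, `model.z_dim` are placeholders, not the paper's API). For brevity only the future $z$ are optimized, whereas the paper also refines the past and current $z$ against the observed sensory sequence.

```python
import torch

def plan(model, h0, goal_obs, horizon=10, iters=100, lr=0.1, beta=1.0):
    # Future latent states are free parameters, tuned by gradient descent
    # on the variational free energy.
    z = torch.zeros(horizon, model.z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(iters):
        h = h0
        free_energy = torch.zeros(())
        for t in range(horizon):
            mu, logvar = model.prior(h)  # habitual prior over z_t
            # Complexity term: keep the planned z close to the learned
            # prior (a Gaussian negative log-likelihood stands in for KL).
            free_energy = free_energy + beta * 0.5 * (
                ((z[t] - mu) ** 2) / logvar.exp() + logvar).sum()
            h = model.step(h, z[t])  # roll the world model forward
        # Accuracy term: expected surprise of the desired future observation.
        pred = model.decode(h)
        free_energy = free_energy + ((pred - goal_obs) ** 2).sum()
        opt.zero_grad()
        free_energy.backward()
        opt.step()
    return z.detach()  # motor actions follow from the planned z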
Related papers
- A Neural Active Inference Model of Perceptual-Motor Learning [62.39667564455059]
The active inference framework (AIF) is a promising new computational approach grounded in contemporary neuroscience.
In this study, we test the ability of the AIF to capture the role of anticipation in the visual guidance of action in humans.
We present a novel formulation of the prior function that maps a multi-dimensional world-state to a uni-dimensional distribution of free-energy.
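A minimal sketch of the general idea of such a prior, assuming (not taken from the paper) that the world-state is first collapsed to one task-relevant scalar and the free energy is its negative log-probability under a Gaussian preference:

```python
import numpy as np

def prior_free_energy(state, preferred=0.0, sigma=0.1):
    # Collapse the multi-dimensional world-state to one task-relevant
    # scalar, here the distance between actor position state[:2] and
    # target position state[2:4] (illustrative assumption).
    scalar = np.linalg.norm(state[:2] - state[2:4])
    # Free energy as the negative log-probability of that scalar under a
    # Gaussian prior centered on the preferred value.
    return 0.5 * ((scalar - preferred) / sigma) ** 2 \
           + np.log(sigma * np.sqrt(2.0 * np.pi))
```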
arXiv Detail & Related papers (2022-11-16T20:00:38Z)
- Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion [88.45326906116165]
We present a new framework to formulate the trajectory prediction task as a reverse process of motion indeterminacy diffusion (MID).
We encode the history behavior information and the social interactions as a state embedding and devise a Transformer-based diffusion model to capture the temporal dependencies of trajectories.
Experiments on the human trajectory prediction benchmarks including the Stanford Drone and ETH/UCY datasets demonstrate the superiority of our method.
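A minimal sketch of the sampling side of this idea, assuming a standard DDPM-style variance schedule and a hypothetical Transformer `denoiser` conditioned on the state embedding `context`; the authors' exact parameterization may differ:

```python
import torch

@torch.no_grad()
def sample_trajectory(denoiser, context, horizon=12, dim=2, T=100):
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    y = torch.randn(horizon, dim)  # start from pure motion indeterminacy
    for t in reversed(range(T)):
        eps = denoiser(y, context, t)  # predicted noise at diffusion step t
        mean = (y - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) \
               / torch.sqrt(alphas[t])
        noise = torch.randn_like(y) if t > 0 else torch.zeros_like(y)
        y = mean + torch.sqrt(betas[t]) * noise
    return y  # denoised future trajectory of shape (horizon, dim)
```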
arXiv Detail & Related papers (2022-03-25T16:59:08Z)
- Inference of Affordances and Active Motor Control in Simulated Agents [0.5161531917413706]
We introduce an output-probabilistic, temporally predictive, modular artificial neural network architecture.
We show that our architecture develops latent states that can be interpreted as affordance maps.
In combination with active inference, we show that flexible, goal-directed behavior can be invoked.
arXiv Detail & Related papers (2022-02-23T14:13:04Z)
- Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate possible interactions among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states at both the scene level and the instance level.
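A hedged sketch of what sequential action sampling over predicted states can look like; `predict`, `scene_cost`, and `instance_cost` are placeholders for the learned prediction module and the scene-/instance-level cost terms (none of these names are from the paper):

```python
import numpy as np

def select_action(predict, scene_cost, instance_cost, state,
                  horizon=5, n_samples=64, action_dim=2,
                  rng=np.random.default_rng()):
    best_cost, best_seq = np.inf, None
    for _ in range(n_samples):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        cost, s = 0.0, state
        for a in actions:
            s = predict(s, a)  # forecast the next scene state
            cost += scene_cost(s) + instance_cost(s)
        if cost < best_cost:
            best_cost, best_seq = cost, actions
    return best_seq[0]  # execute only the first action, then replan
```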
arXiv Detail & Related papers (2021-01-14T22:21:25Z)
- Inverse reinforcement learning for autonomous navigation via differentiable semantic mapping and planning [20.66819092398541]
This paper focuses on inverse reinforcement learning for autonomous navigation using distance and semantic category observations.
We develop a map encoder that infers semantic category probabilities from the observation sequence, and a cost encoder defined as a deep neural network over the semantic features.
We show that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of buildings, sidewalks, and road lanes.
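A minimal sketch of the map-encoder/cost-encoder split described above; the module choices (GRU, small MLP) and names are illustrative assumptions, not the paper's architecture. The point is that both parts are differentiable, so the cost can be trained end to end from demonstrations.

```python
import torch.nn as nn

class NavigationCost(nn.Module):
    def __init__(self, n_classes, hidden=64):
        super().__init__()
        # Map encoder: accumulates semantic category evidence over the
        # observation sequence.
        self.map_encoder = nn.GRU(input_size=n_classes,
                                  hidden_size=hidden, batch_first=True)
        # Cost encoder: a deep network over the semantic features.
        self.cost_encoder = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, observations):
        # observations: (batch, time, n_classes) semantic measurements
        features, _ = self.map_encoder(observations)
        return self.cost_encoder(features[:, -1])  # cost at current state
```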
arXiv Detail & Related papers (2021-01-01T07:41:08Z)
- Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
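For orientation, the $\gamma$-model's bootstrapped training target, as we read it from the paper (check the original for the exact parameterization), mixes the single-step dynamics with the model's own prediction at the next state:

```latex
\tilde{p}(s_e \mid s_t, a_t)
  = (1-\gamma)\, p(s_{t+1} = s_e \mid s_t, a_t)
  + \gamma\, \mathbb{E}_{s_{t+1},\, a_{t+1}}
      \!\left[ \mu_\theta(s_e \mid s_{t+1}, a_{t+1}) \right]
```

Taking $\gamma \to 0$ recovers a single-step model, while larger $\gamma$ shifts compounding error from testing time (long model rollouts) into training time (bootstrapped targets), which is the tradeoff the abstract refers to.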
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
- Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations [81.05412704590707]
We propose a novel end-to-end learnable network that performs joint perception, prediction and motion planning for self-driving vehicles.
Our network is learned end-to-end from human demonstrations.
arXiv Detail & Related papers (2020-08-13T14:40:46Z)
- Tracking Emotions: Intrinsic Motivation Grounded on Multi-Level Prediction Error Dynamics [68.8204255655161]
We discuss how emotions arise when differences between expected and actual rates of progress towards a goal are experienced.
We present an intrinsic motivation architecture that generates behaviors towards self-generated and dynamic goals.
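A toy sketch of the underlying signal, assuming progress is measured as the per-step reduction in distance to the goal (an illustrative choice, not the paper's multi-level formulation):

```python
def intrinsic_reward(expected_rate, prev_distance, curr_distance, dt=1.0):
    actual_rate = (prev_distance - curr_distance) / dt  # progress per step
    # Positive when progress beats expectation, negative when it lags,
    # in the spirit of the emotion-like signal described above.
    return actual_rate - expected_rate
```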
arXiv Detail & Related papers (2020-07-29T06:53:13Z)
- Learning Navigation Costs from Demonstration with Semantic Observations [24.457042947946025]
This paper focuses on inverse reinforcement learning (IRL) for autonomous robot navigation using semantic observations.
We develop a map encoder, which infers semantic class probabilities from the observation sequence, and a cost encoder, defined as a deep neural network over the semantic features.
We show that our approach learns to follow traffic rules in the autonomous driving CARLA simulator by relying on semantic observations of cars, sidewalks, and road lanes.
arXiv Detail & Related papers (2020-06-09T04:35:57Z)
- Learning Navigation Costs from Demonstration in Partially Observable Environments [24.457042947946025]
This paper focuses on inverse reinforcement learning (IRL) to enable safe and efficient autonomous navigation in unknown partially observable environments.
We develop a cost function representation composed of two parts: a probabilistic occupancy encoder, with recurrent dependence on the observation sequence, and a cost encoder, defined over the occupancy features.
Our model exceeds the accuracy of baseline IRL algorithms in robot navigation tasks, while substantially improving the efficiency of training and test-time inference.
arXiv Detail & Related papers (2020-02-26T17:15:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.