Reinforcement Learning through Active Inference
- URL: http://arxiv.org/abs/2002.12636v1
- Date: Fri, 28 Feb 2020 10:28:21 GMT
- Title: Reinforcement Learning through Active Inference
- Authors: Alexander Tschantz, Beren Millidge, Anil K. Seth, Christopher L.
Buckley
- Abstract summary: We show how ideas from active inference can augment traditional reinforcement learning approaches.
We develop and implement a novel objective for decision making, which we term the free energy of the expected future.
We demonstrate that the resulting algorithm successfully balances exploration and exploitation, simultaneously achieving robust performance on several challenging RL benchmarks with sparse, well-shaped, and no rewards.
- Score: 62.997667081978825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The central tenet of reinforcement learning (RL) is that agents seek to
maximize the sum of cumulative rewards. In contrast, active inference, an
emerging framework within cognitive and computational neuroscience, proposes
that agents act to maximize the evidence for a biased generative model. Here,
we illustrate how ideas from active inference can augment traditional RL
approaches by (i) furnishing an inherent balance of exploration and
exploitation, and (ii) providing a more flexible conceptualization of reward.
Inspired by active inference, we develop and implement a novel objective for
decision making, which we term the free energy of the expected future. We
demonstrate that the resulting algorithm successfully balances exploration and
exploitation, simultaneously achieving robust performance on several
challenging RL benchmarks with sparse, well-shaped, and no rewards.
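For intuition, the objective can be approximated in a model-based planner: score each candidate action sequence by its expected extrinsic value (how probable predicted outcomes are under the reward-biased model, with reward standing in for log prior preference) plus an expected information gain. The sketch below is illustrative rather than the authors' implementation; it assumes an ensemble of learned dynamics models, uses ensemble disagreement as a common proxy for information gain, and all names are hypothetical.

```python
import numpy as np

def feef_score(action_seqs, ensemble, state, reward_fn, beta=1.0):
    """Score action sequences by extrinsic value + beta * epistemic value.

    ensemble  : list of callables, model(state, action) -> next_state
    reward_fn : callable, reward(state) -> float; stands in for the log
                probability of outcomes under the biased generative model,
                i.e. log p~(o) taken proportional to r(o)
    """
    scores = np.zeros(len(action_seqs))
    for i, actions in enumerate(action_seqs):
        extrinsic, epistemic = 0.0, 0.0
        states = [state.copy() for _ in ensemble]  # one rollout per model
        for a in actions:
            preds = np.array([m(s, a) for m, s in zip(ensemble, states)])
            states = list(preds)
            extrinsic += np.mean([reward_fn(s) for s in preds])
            epistemic += preds.var(axis=0).sum()  # disagreement across models
        scores[i] = extrinsic + beta * epistemic
    return scores

def plan(state, ensemble, reward_fn, horizon=10, n_candidates=500, act_dim=1):
    """Random-shooting planner: return the first action of the best sequence."""
    candidates = np.random.uniform(-1.0, 1.0,
                                   size=(n_candidates, horizon, act_dim))
    scores = feef_score(candidates, ensemble, state, reward_fn)
    return candidates[np.argmax(scores)][0]
```

Setting beta = 0 recovers plain model-based reward maximization; the epistemic term is what drives directed exploration when rewards are sparse or absent.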
Related papers
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models show increasingly strong performance on reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly by generating actions or indirectly.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning [48.59516337905877]
Learning a good representation is a crucial challenge for Reinforcement Learning (RL) agents.
Recent work has developed theoretical insights into self-predictive learning algorithms.
We take a step towards bridging the gap between theory and practice by analyzing an action-conditional self-predictive objective.
arXiv Detail & Related papers (2024-06-04T07:22:12Z)
- REACT: Revealing Evolutionary Action Consequence Trajectories for Interpretable Reinforcement Learning [7.889696505137217]
We propose Revealing Evolutionary Action Consequence Trajectories (REACT) to enhance the interpretability of Reinforcement Learning (RL).
In contrast to the prevalent practice of evaluating RL models based on the optimal behavior they learn during training, we posit that considering a range of edge-case trajectories provides a more comprehensive understanding of their inherent behavior.
Our results highlight its effectiveness in revealing nuanced aspects of RL models' behavior beyond optimal performance, thereby contributing to improved interpretability.
arXiv Detail & Related papers (2024-04-04T10:56:30Z)
- Learning Off-policy with Model-based Intrinsic Motivation For Active Online Exploration [15.463313629574111]
This paper investigates how to achieve sample-efficient exploration in continuous control tasks.
We introduce an RL algorithm that incorporates a predictive model and off-policy learning elements.
We derive an intrinsic reward without incurring parameter overhead.
arXiv Detail & Related papers (2024-03-31T11:39:11Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both tractable variational learning algorithm and effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Active Inference and Reinforcement Learning: A unified inference on continuous state and action spaces under partial observability [19.56438470022024]
Many real-world problems involve partial observations, formulated as partially observable Markov decision processes (POMDPs).
Previous studies have tackled RL in POMDPs by either incorporating the memory of past actions and observations or by inferring the true state of the environment.
We propose a unified principle that establishes a theoretical connection between active inference (AIF) and reinforcement learning (RL).
Experimental results demonstrate the superior learning capabilities of our method in solving continuous space partially observable tasks.
arXiv Detail & Related papers (2022-12-15T16:28:06Z)
- Intrinsically-Motivated Reinforcement Learning: A Brief Introduction [0.0]
Reinforcement learning (RL) is one of the three basic paradigms of machine learning.
In this paper, we investigate the problem of improving exploration in RL and introduce intrinsically-motivated RL.
arXiv Detail & Related papers (2022-03-03T12:39:58Z)
- Online reinforcement learning with sparse rewards through an active inference capsule [62.997667081978825]
This paper introduces an active inference agent that minimizes the free energy of the expected future.
Our model is capable of solving sparse-reward problems with a very high sample efficiency.
We also introduce a novel method for approximating the prior model from the reward function, which simplifies the expression of complex objectives.
arXiv Detail & Related papers (2021-06-04T10:03:36Z)
- Imitation with Neural Density Models [98.34503611309256]
We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Imitation Occupancy Entropy Reinforcement Learning (RL) using the density as a reward.
Our approach maximizes a non-adversarial model-free RL objective that provably lower bounds reverse Kullback-Leibler divergence between occupancy measures of the expert and imitator.
arXiv Detail & Related papers (2020-10-19T19:38:36Z)
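To make the last entry's density-as-reward idea concrete, here is a minimal sketch that substitutes a kernel density estimator for the paper's neural density models; the data and names are placeholders, and the occupancy-entropy term of the full objective is omitted.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Placeholder expert data: rows are concatenated (state, action) pairs.
# In practice these come from expert demonstrations.
expert_data = np.random.randn(1000, 4)

# Approximate the expert's occupancy measure with a fitted density model.
density = KernelDensity(bandwidth=0.5).fit(expert_data)

def imitation_reward(state, action):
    """Log-density of an imitator (state, action) pair under the expert model.
    Maximizing this reward (plus the omitted occupancy-entropy term) pushes
    the imitator's occupancy measure toward the expert's."""
    x = np.concatenate([state, action])[None, :]
    return float(density.score_samples(x)[0])
```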