Deconstructing deep active inference
- URL: http://arxiv.org/abs/2303.01618v2
- Date: Mon, 8 May 2023 08:20:23 GMT
- Title: Deconstructing deep active inference
- Authors: Théophile Champion and Marek Grześ and Lisa Bonheme and Howard Bowman
- Abstract summary: Active inference is a theory of perception, learning and decision making.
The goal of this line of research is to solve more complicated tasks using deep active inference.
- Score: 2.236663830879273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Active inference is a theory of perception, learning and decision making
that can be applied to neuroscience, robotics, and machine learning. Recently,
research has been carried out to scale up this framework using Monte-Carlo
tree search and deep learning, with the goal of solving more complicated tasks
with deep active inference. First, we review the existing literature; then, we
progressively build a deep active inference agent. For two agents, we
experiment with five definitions of the expected free energy and three
different action selection strategies. According to our experiments, the
models able to solve the dSprites environment are the ones that maximise
rewards. Finally, we compare the similarity of the representations learned by
the layers of various agents using centered kernel alignment. Importantly, the
agent maximising reward and the agent minimising expected free energy learn
very similar representations, except for the last layer of the critic network
(reflecting the difference in learning objective) and the variance layers of
the transition and encoder networks. We found that the reward-maximising agent
is far more certain than the agent minimising expected free energy. This is
because the agent minimising expected free energy always picks the action
"down" and therefore does not gather enough data for the other actions. In
contrast, the agent maximising reward keeps selecting the actions "left" and
"right", which enables it to solve the task successfully. The only difference
between these two agents is the epistemic value, which aims to make the
outputs of the transition and encoder networks as close as possible. Thus, the
agent minimising expected free energy picks a single action (down) and becomes
an expert at predicting the future when selecting this action, which makes the
KL divergence between the outputs of the transition and encoder networks small.
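To make this concrete, here is a minimal sketch (not the authors' code) of the
kind of expected free energy described above, assuming the encoder and
transition networks output diagonal Gaussians over the latent state. The
function names and the use_epistemic flag are hypothetical, and the paper
experiments with several alternative definitions of the expected free energy;
this only illustrates the variant where the epistemic term is the KL
divergence between the encoder and transition outputs, which the
reward-maximising agent simply drops.

```python
import torch

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, exp(logvar_q)) || N(mu_p, exp(logvar_p)) ), diagonal Gaussians."""
    var_q, var_p = logvar_q.exp(), logvar_p.exp()
    return 0.5 * (logvar_p - logvar_q
                  + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0).sum(dim=-1)

def efe_target(reward, mu_post, logvar_post, mu_prior, logvar_prior,
               use_epistemic=True):
    """Expected-free-energy score for one (observed or imagined) transition.

    reward                 -- reward obtained or predicted for the chosen action
    mu_post, logvar_post   -- encoder output for the resulting observation
    mu_prior, logvar_prior -- transition output for the chosen action
    """
    g = -reward  # extrinsic value enters with a negative sign
    if use_epistemic:
        # Epistemic term: pulls the transition and encoder outputs together.
        g = g + gaussian_kl(mu_post, logvar_post, mu_prior, logvar_prior)
    # The reward-maximising agent corresponds to use_epistemic=False.
    return g
```

Under this reading, an agent that always selects "down" can drive the KL term
very low for that single action while never gathering data for "left" and
"right", which is exactly the behaviour reported above. The representation
comparison relies on centered kernel alignment; the standard linear variant
can be computed from two activation matrices collected on the same batch of
inputs, for example matched layers of the reward-maximising and EFE-minimising
agents on a batch of dSprites images:

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_samples, n_features).

    Returns a similarity in [0, 1]; higher means more similar representations.
    """
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(y.T @ x, ord="fro") ** 2
    return hsic / (np.linalg.norm(x.T @ x, ord="fro")
                   * np.linalg.norm(y.T @ y, ord="fro"))
```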
Related papers
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement
Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Decentralized scheduling through an adaptive, trading-based multi-agent
  system [1.7403133838762448]
In multi-agent reinforcement learning systems, the actions of one agent can have a negative impact on the rewards of other agents.
This work applies a trading approach to a simulated scheduling environment, where the agents are responsible for the assignment of incoming jobs to compute cores.
The agents can trade the usage right of computational cores to process high-priority, high-reward jobs faster than low-priority, low-reward jobs.
arXiv Detail & Related papers (2022-07-05T13:50:18Z)
- On the Expressivity of Markov Reward [89.96685777114456]
This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform.
We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories.
arXiv Detail & Related papers (2021-11-01T12:12:16Z)
- Contrastive Active Inference [12.361539023886161]
We propose a contrastive objective for active inference that reduces the computational burden in learning the agent's generative model and planning future actions.
Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train.
arXiv Detail & Related papers (2021-10-19T16:20:49Z)
- Multi-Agent Embodied Visual Semantic Navigation with Scene Prior
  Knowledge [42.37872230561632]
In visual semantic navigation, the robot navigates to a target object using egocentric visual observations, given the class label of the target.
Most of the existing models are only effective for single-agent navigation, and a single agent has low efficiency and poor fault tolerance when completing more complicated tasks.
We propose the multi-agent visual semantic navigation, in which multiple agents collaborate with others to find multiple target objects.
arXiv Detail & Related papers (2021-09-20T13:31:03Z)
- What is Going on Inside Recurrent Meta Reinforcement Learning Agents? [63.58053355357644]
Recurrent meta reinforcement learning (meta-RL) agents employ a recurrent neural network (RNN) for the purpose of "learning a learning algorithm".
We shed light on the internal working mechanisms of these agents by reformulating the meta-RL problem using the Partially Observable Markov Decision Process (POMDP) framework.
arXiv Detail & Related papers (2021-04-29T20:34:39Z)
- Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
- Deep active inference agents using Monte-Carlo methods [3.8233569758620054]
We present a neural architecture for building deep active inference agents in continuous state-spaces using Monte-Carlo sampling.
Our approach enables agents to learn environmental dynamics efficiently, while maintaining task performance.
Results show that deep active inference provides a flexible framework to develop biologically-inspired intelligent agents.
arXiv Detail & Related papers (2020-06-07T15:10:42Z)
- Maximizing Information Gain in Partially Observable Environments via
  Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.