Information is Power: Intrinsic Control via Information Capture
- URL: http://arxiv.org/abs/2112.03899v1
- Date: Tue, 7 Dec 2021 18:50:42 GMT
- Title: Information is Power: Intrinsic Control via Information Capture
- Authors: Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes,
Danijar Hafner, Chelsea Finn, Sergey Levine
- Abstract summary: We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent to both gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states.
- Score: 110.3143711650806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans and animals explore their environment and acquire useful skills even
in the absence of clear goals, exhibiting intrinsic motivation. The study of
intrinsic motivation in artificial agents is concerned with the following
question: what is a good general-purpose objective for an agent? We study this
question in dynamic partially-observed environments, and argue that a compact
and general learning objective is to minimize the entropy of the agent's state
visitation estimated using a latent state-space model. This objective induces
an agent to both gather information about its environment, corresponding to
reducing uncertainty, and to gain control over its environment, corresponding
to reducing the unpredictability of future world states. We instantiate this
approach as a deep reinforcement learning agent equipped with a deep
variational Bayes filter. We find that our agent learns to discover, represent,
and exercise control of dynamic objects in a variety of partially-observed
environments sensed with visual observations without extrinsic reward.
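To make the objective concrete, the sketch below illustrates one way such an intrinsic reward could be computed (an illustrative approximation, not the authors' implementation; all names are hypothetical). A learned latent filter is assumed to output a diagonal-Gaussian belief over the latent state at each step; the state visitation distribution is approximated by a uniform mixture over recent beliefs, and the intrinsic reward is the negative of a Monte Carlo estimate of that mixture's entropy.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    """Log-density of a diagonal Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def visitation_entropy(belief_means, belief_vars, n_samples=256, rng=None):
    """Monte Carlo estimate of the entropy of the latent state visitation
    distribution, approximated as a uniform mixture of per-step beliefs."""
    rng = np.random.default_rng() if rng is None else rng
    k, d = belief_means.shape
    # Draw samples from the mixture: pick a component, then sample from it.
    idx = rng.integers(k, size=n_samples)
    samples = belief_means[idx] + rng.standard_normal((n_samples, d)) * np.sqrt(belief_vars[idx])
    # log p(x) under the mixture = logsumexp_j log N(x; mu_j, var_j) - log k
    comp_logp = np.stack(
        [gaussian_logpdf(samples, belief_means[j], belief_vars[j]) for j in range(k)], axis=1
    )
    log_mix = np.logaddexp.reduce(comp_logp, axis=1) - np.log(k)
    return -log_mix.mean()

def intrinsic_reward(belief_means, belief_vars):
    """Reward the agent for keeping the estimated visitation entropy low."""
    return -visitation_entropy(belief_means, belief_vars)
```

In a full agent, the beliefs would come from the learned latent state-space model (e.g. a variational Bayes filter) and this reward would be handed to an off-the-shelf deep RL algorithm in place of an extrinsic reward.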
Related papers
- Self-supervised network distillation: an effective approach to exploration in sparse reward environments [0.0]
Reinforcement learning can train an agent to behave in an environment according to a predesigned reward function.
In sparse reward environments, a solution is to equip the agent with an intrinsic motivation that provides informed exploration.
We present Self-supervised Network Distillation (SND), a class of intrinsic motivation algorithms based on the distillation error as a novelty indicator; a rough sketch of this idea follows this entry.
arXiv Detail & Related papers (2023-02-22T18:58:09Z)
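The distillation-error idea can be sketched roughly as follows (a minimal illustration in the spirit of RND-style distillation, not the exact SND algorithm; names and network sizes are assumptions): a fixed, randomly initialized target network is distilled by a trained predictor, and the per-observation prediction error serves as the novelty-based intrinsic reward.

```python
import torch
import torch.nn as nn

class DistillationNovelty(nn.Module):
    """Intrinsic reward from the error of distilling a fixed target network."""
    def __init__(self, obs_dim, feat_dim=64):
        super().__init__()
        self.target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        self.predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
        for p in self.target.parameters():
            p.requires_grad_(False)  # target stays fixed; only the predictor is trained

    def forward(self, obs):
        # Per-sample squared distillation error = novelty signal.
        with torch.no_grad():
            target_feat = self.target(obs)
        pred_feat = self.predictor(obs)
        return ((pred_feat - target_feat) ** 2).mean(dim=-1)

# Usage: the novelty is both the intrinsic reward and the predictor's training loss.
# novelty = model(obs_batch); loss = novelty.mean(); loss.backward()
```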
- Embodied Agents for Efficient Exploration and Smart Scene Description [47.82947878753809]
We tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment.
We propose and evaluate an approach that combines recent advances in visual robotic exploration and image captioning.
Our approach can generate smart scene descriptions that maximize semantic knowledge of the environment and avoid repetitions.
arXiv Detail & Related papers (2023-01-17T19:28:01Z)
- Self-supervised Sequential Information Bottleneck for Robust Exploration in Deep Reinforcement Learning [28.75574762244266]
In this work, we introduce the sequential information bottleneck objective for learning compressed and temporally coherent representations.
For efficient exploration in noisy environments, we further construct intrinsic rewards that capture task-relevant state novelty.
arXiv Detail & Related papers (2022-09-12T15:41:10Z)
- Active Inference for Robotic Manipulation [30.692885688744507]
Active Inference is a theory that deals with partial observability in an explicit manner.
In this work, we apply Active Inference to hard-to-explore simulated robotic manipulation tasks.
We show that the information-seeking behavior induced by Active Inference allows the agent to explore these challenging, sparse environments systematically.
arXiv Detail & Related papers (2022-06-01T12:19:38Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- Understanding the origin of information-seeking exploration in probabilistic objectives for control [62.997667081978825]
An exploration-exploitation trade-off is central to the description of adaptive behaviour.
One approach to solving this trade-off has been to equip agents with, or posit that they possess, an intrinsic 'exploratory drive'.
We show that this combination of utility-maximizing and information-seeking behaviour arises from the minimization of an entirely different class of objectives.
arXiv Detail & Related papers (2021-03-11T18:42:39Z)
- Action and Perception as Divergence Minimization [43.75550755678525]
Action Perception Divergence is an approach for categorizing the space of possible objective functions for embodied agents.
We show a spectrum that reaches from narrow to general objectives.
These agents use perception to align their beliefs with the world and use actions to align the world with their beliefs.
arXiv Detail & Related papers (2020-09-03T16:52:46Z)
- Modulation of viability signals for self-regulatory control [1.370633147306388]
We revisit the role of instrumental value as a driver of adaptive behavior.
For reinforcement learning tasks, the distribution of preferences replaces the notion of reward.
arXiv Detail & Related papers (2020-07-18T01:11:51Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states; a generic sketch of this idea follows the list.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
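As a loose illustration of such a mutual-information objective (a generic sketch, not that paper's estimator; all names and architectural choices are assumptions), the mutual information between goal-state and controllable-state variables can be lower-bounded with an InfoNCE-style contrastive estimate and maximized as an intrinsic signal.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MILowerBound(nn.Module):
    """InfoNCE-style lower bound on I(goal_state; controllable_state)."""
    def __init__(self, goal_dim, ctrl_dim, hidden=128):
        super().__init__()
        self.goal_enc = nn.Sequential(nn.Linear(goal_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.ctrl_enc = nn.Sequential(nn.Linear(ctrl_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

    def forward(self, goal_states, ctrl_states):
        # Score every (goal, controllable) pair in the batch; matched pairs are positives.
        g = F.normalize(self.goal_enc(goal_states), dim=-1)
        c = F.normalize(self.ctrl_enc(ctrl_states), dim=-1)
        logits = g @ c.t()  # (batch, batch) similarity matrix
        labels = torch.arange(g.shape[0], device=g.device)
        # InfoNCE lower bound on the MI: log(batch_size) - cross-entropy of the positives.
        return math.log(g.shape[0]) - F.cross_entropy(logits, labels)
```

The resulting scalar can be maximized jointly: by the encoders as a training objective and by the policy as an intrinsic reward that favors states where the controllable variables are informative about the goal variables.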