Complex behavior from intrinsic motivation to occupy action-state path
space
- URL: http://arxiv.org/abs/2205.10316v2
- Date: Sat, 24 Feb 2024 05:29:54 GMT
- Title: Complex behavior from intrinsic motivation to occupy action-state path
space
- Authors: Jorge Ramírez-Ruiz, Dmytro Grytskyy, Chiara Mastrogiuseppe, Yamen Habib and Rubén Moreno-Bote
- Abstract summary: We propose that the goal of behavior is maximizing future occupancy of paths of actions and states.
According to this occupancy principle, rewards are the means to occupy path space, not the goal per se.
We show that complex behaviors such as `dancing', hide-and-seek and a basic form of altruistic behavior naturally result from the intrinsic motivation to occupy path space.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most theories of behavior posit that agents tend to maximize some form of
reward or utility. However, animals very often move with curiosity and seem to
be motivated in a reward-free manner. Here we abandon the idea of reward
maximization, and propose that the goal of behavior is maximizing occupancy of
future paths of actions and states. According to this maximum occupancy
principle, rewards are the means to occupy path space, not the goal per se;
goal-directedness simply emerges as rational ways of searching for resources so
that movement, understood amply, never ends. We find that action-state path
entropy is the only measure consistent with additivity and other intuitive
properties of expected future action-state path occupancy. We provide
analytical expressions that relate the optimal policy and state-value function,
and prove convergence of our value iteration algorithm. Using discrete and
continuous state tasks, including a high-dimensional controller, we show that
complex behaviors such as `dancing', hide-and-seek and a basic form of
altruistic behavior naturally result from the intrinsic motivation to occupy
path space. All in all, we present a theory of behavior that generates both
variability and goal-directedness in the absence of reward maximization.
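To make the idea concrete, here is a minimal tabular sketch in the spirit of the abstract: the only per-step quantity accumulated is entropy over actions and next states, the value update is a soft (log-sum-exp) backup, and the optimal policy comes out as a softmax of the soft Q-values. The weights alpha and beta, the discount gamma, and the exact update rule are illustrative assumptions, not the authors' published algorithm.

```python
# Minimal sketch (not the authors' implementation): tabular value iteration
# for an entropy-only objective in the spirit of the maximum occupancy
# principle. Per-step "reward" = alpha * action entropy + beta * next-state
# entropy; there is no extrinsic reward term. alpha, beta, gamma are assumed knobs.
import numpy as np
from scipy.special import logsumexp

def occupancy_value_iteration(P, alpha=1.0, beta=1.0, gamma=0.99,
                              iters=1000, tol=1e-8):
    """P: transition tensor of shape (S, A, S); P[s, a] sums to 1 over s'."""
    S, A, _ = P.shape
    V = np.zeros(S)
    H_next = -np.sum(P * np.log(P + 1e-12), axis=2)   # entropy of p(s'|s,a)
    for _ in range(iters):
        Q = beta * H_next + gamma * (P @ V)            # (S, A) soft "Q"
        V_new = alpha * logsumexp(Q / alpha, axis=1)   # soft (log-sum-exp) backup
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    Q = beta * H_next + gamma * (P @ V)
    pi = np.exp((Q - V[:, None]) / alpha)              # Boltzmann (softmax) policy
    pi /= pi.sum(axis=1, keepdims=True)
    return V, pi

# Toy usage: 4 states, 2 actions with random stochastic transitions.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(4, 2))
V, pi = occupancy_value_iteration(P)
```

In this form the optimal policy is a softmax of the soft Q-values with temperature alpha, which is one concrete instance of the kind of analytical policy/value relation the abstract refers to.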
Related papers
- Wanting to be Understood [7.40601112616244]
This paper explores an intrinsic motivation for mutual awareness, hypothesizing that humans possess a fundamental drive to understand.
Through simulations of the perceptual crossing paradigm, we explore the effect of various internal reward functions in reinforcement learning agents.
Results indicate that while artificial curiosity alone does not lead to a preference for social interaction, rewards emphasizing reciprocal understanding successfully drive agents to prioritize interaction.
arXiv Detail & Related papers (2025-04-09T06:15:24Z)
- Deceptive Sequential Decision-Making via Regularized Policy Optimization [54.38738815697299]
We present two regularization strategies for policy synthesis problems that actively deceive an adversary about a system's underlying rewards.
We show how each form of deception can be implemented in policy optimization problems.
We show that diversionary deception can cause the adversary to believe that the most important agent is the least important, while attaining a total accumulated reward that is 98.83% of its optimal, non-deceptive value.
arXiv Detail & Related papers (2025-01-30T23:41:40Z)
- Go Beyond Imagination: Maximizing Episodic Reachability with World Models [68.91647544080097]
In this paper, we introduce a new intrinsic reward design called GoBI - Go Beyond Imagination.
We apply learned world models to generate predicted future states with random actions.
Our method greatly outperforms previous state-of-the-art methods on 12 of the most challenging Minigrid navigation tasks.
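As a hedged illustration of what such a reward could look like (the summary does not give GoBI's actual reward, memory structure or matching rule, so the function names and threshold below are ours): roll random actions through a learned world model and reward states from which many imagined successors fall outside the episode's memory.

```python
# Illustrative sketch only; not the GoBI algorithm as published. Assumes a
# learned world model with signature world_model(state, action) -> next_state
# (np.ndarray) and an episodic memory holding states reached so far.
import numpy as np

def reachability_style_reward(state, world_model, episodic_memory,
                              n_actions, n_rollouts=32, match_tol=1e-3):
    novel = 0
    for _ in range(n_rollouts):
        a = np.random.randint(n_actions)            # random imagined action
        predicted = world_model(state, a)           # one-step imagination
        # Count the prediction as novel if no remembered state is close to it.
        if all(np.linalg.norm(predicted - m) > match_tol
               for m in episodic_memory):
            novel += 1
    return novel / n_rollouts                       # fraction of novel imagined states
```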
arXiv Detail & Related papers (2023-08-25T20:30:20Z)
- Intrinsic Motivation in Dynamical Control Systems [5.635628182420597]
We investigate an information-theoretic approach to intrinsic motivation, based on maximizing an agent's empowerment.
We show that this approach generalizes previous attempts to formalize intrinsic motivation.
This opens the door for designing practical artificial, intrinsically motivated controllers.
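For readers unfamiliar with empowerment, the quantity being maximized is the channel capacity of the action-to-future-state channel. The sketch below computes the simplest one-step, tabular case with the standard Blahut-Arimoto algorithm; it illustrates the concept, not the estimator used in the cited paper.

```python
# One-step empowerment at a state s: the channel capacity max_{p(a)} I(A; S')
# of the action -> next-state channel, computed by Blahut-Arimoto iteration.
import numpy as np

def one_step_empowerment(P_s, iters=200, eps=1e-12):
    """P_s: (A, S) matrix with P_s[a, s'] = p(s' | s, a). Returns capacity in nats."""
    A, S = P_s.shape
    p_a = np.full(A, 1.0 / A)                       # action distribution to optimize
    for _ in range(iters):
        p_next = p_a @ P_s                          # marginal over next states
        # Per-action KL(p(s'|a) || p(s')) drives the Blahut-Arimoto update.
        kl = np.sum(P_s * (np.log(P_s + eps) - np.log(p_next + eps)), axis=1)
        p_a = p_a * np.exp(kl)
        p_a /= p_a.sum()
    p_next = p_a @ P_s
    kl = np.sum(P_s * (np.log(P_s + eps) - np.log(p_next + eps)), axis=1)
    return float(p_a @ kl)                          # capacity = empowerment
```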
arXiv Detail & Related papers (2022-12-29T05:20:08Z) - Contrastive Active Inference [12.361539023886161]
We propose a contrastive objective for active inference that reduces the computational burden in learning the agent's generative model and planning future actions.
Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train.
arXiv Detail & Related papers (2021-10-19T16:20:49Z) - Mutual Information State Intrinsic Control [91.38627985733068]
Intrinsically motivated RL attempts to remove the reliance on external reward signals by defining an intrinsic reward function.
Motivated by the self-consciousness concept in psychology, we make a natural assumption that the agent knows what constitutes itself.
We mathematically formalize this reward as the mutual information between the agent state and the surrounding state.
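As a toy illustration of the quantity involved (the paper itself uses a learned estimator inside RL training, which the summary does not spell out), mutual information between a 1-D agent-state variable and a 1-D surrounding-state variable can be estimated from paired samples with a simple histogram estimator:

```python
# Hedged sketch: histogram-based MI estimate between an "agent state"
# coordinate and a "surrounding state" coordinate. Illustrates the quantity
# being maximized, not the cited paper's estimator.
import numpy as np

def mi_estimate(agent_x, surround_x, bins=16, eps=1e-12):
    """agent_x, surround_x: 1-D arrays of paired samples. Returns MI in nats."""
    joint, _, _ = np.histogram2d(agent_x, surround_x, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    ratio = p_xy / (p_x @ p_y + eps)
    return float(np.sum(p_xy * np.log(ratio + eps)))

# Toy usage: correlated agent/surrounding coordinates give positive MI.
rng = np.random.default_rng(1)
agent = rng.normal(size=5000)
surround = agent + 0.5 * rng.normal(size=5000)
print(mi_estimate(agent, surround))
```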
arXiv Detail & Related papers (2021-03-15T03:03:36Z)
- Inverse Rational Control with Partially Observable Continuous Nonlinear Dynamics [6.65264113799989]
A fundamental question in neuroscience is how the brain creates an internal model of the world to guide actions using sequences of ambiguous sensory information.
This problem can be solved by control theory, which allows us to find the optimal actions for a given system dynamics and objective function.
We hypothesize that animals have their own flawed internal model of the world, and choose actions with the highest expected subjective reward according to that flawed model.
Our contribution here generalizes past work on Inverse Rational Control which solved this problem for discrete control in partially observable Markov decision processes.
arXiv Detail & Related papers (2020-09-26T11:47:48Z)
- Tracking Emotions: Intrinsic Motivation Grounded on Multi-Level Prediction Error Dynamics [68.8204255655161]
We discuss how emotions arise when differences between expected and actual rates of progress towards a goal are experienced.
We present an intrinsic motivation architecture that generates behaviors towards self-generated and dynamic goals.
arXiv Detail & Related papers (2020-07-29T06:53:13Z)
- Maximizing Information Gain in Partially Observable Environments via Prediction Reward [64.24528565312463]
This paper tackles the challenge of using belief-based rewards for a deep RL agent.
We derive the exact error between negative entropy and the expected prediction reward.
This insight provides theoretical motivation for several fields using prediction rewards.
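The summary does not state the exact result, but a standard identity with the same flavor (our notation: b is the belief, q the predictor's distribution, H the entropy) is

```latex
\mathbb{E}_{s \sim b}\left[\log q(s)\right] \;=\; -H(b) \;-\; D_{\mathrm{KL}}\!\left(b \,\|\, q\right)
```

so when the prediction reward is the log-probability assigned to the true state, its expectation falls short of the negative belief entropy by exactly a KL term, which vanishes when the predictor matches the belief.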
arXiv Detail & Related papers (2020-05-11T08:13:49Z)
- Intrinsic Motivation for Encouraging Synergistic Behavior [55.10275467562764]
We study the role of intrinsic motivation as an exploration bias for reinforcement learning in sparse-reward synergistic tasks.
Our key idea is that a good guiding principle for intrinsic motivation in synergistic tasks is to take actions which affect the world in ways that would not be achieved if the agents were acting on their own.
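One hedged way to operationalize that guiding principle (the composition rule and distance below are our assumptions, not necessarily the paper's): reward joint actions whose true outcome differs from what composing single-agent predictions would suggest.

```python
# Illustrative sketch only: reward joint actions with genuinely synergistic
# effects, i.e. outcomes that composed single-agent forward models miss.
import numpy as np

def synergy_style_reward(state, joint_action, env_step, single_agent_models):
    """env_step(state, joint_action) -> true next state (np.ndarray);
    single_agent_models[i](state, action_i) -> predicted effect of agent i alone."""
    true_next = env_step(state, joint_action)
    # Compose per-agent predictions by applying them sequentially (an assumption).
    composed = state
    for i, model in enumerate(single_agent_models):
        composed = model(composed, joint_action[i])
    return float(np.linalg.norm(true_next - composed))
```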
arXiv Detail & Related papers (2020-02-12T19:34:51Z)
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)