Evolution of Rewards for Food and Motor Action by Simulating Birth and Death
- URL: http://arxiv.org/abs/2406.15016v1
- Date: Fri, 21 Jun 2024 09:44:56 GMT
- Title: Evolution of Rewards for Food and Motor Action by Simulating Birth and Death
- Authors: Yuji Kanagawa, Kenji Doya
- Abstract summary: We try to replicate the evolution of biologically plausible reward functions and investigate how environmental conditions affect evolved rewards' shape.
Our results show that biologically reasonable positive rewards for food acquisition and negative rewards for motor action can evolve from randomly initialized ones.
The emergence of positive motor action rewards is surprising because it can make agents too active and inefficient in foraging.
- Score: 1.9928758704251783
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The reward system is one of the fundamental drivers of animal behaviors and is critical for survival and reproduction. Despite its importance, the problem of how the reward system has evolved is underexplored. In this paper, we try to replicate the evolution of biologically plausible reward functions and investigate how environmental conditions affect evolved rewards' shape. For this purpose, we developed a population-based decentralized evolutionary simulation framework, where agents maintain their energy level to live longer and produce more children. Each agent inherits its reward function from its parent subject to mutation and learns to get rewards via reinforcement learning throughout its lifetime. Our results show that biologically reasonable positive rewards for food acquisition and negative rewards for motor action can evolve from randomly initialized ones. However, we also find that the rewards for motor action diverge into two modes: largely positive and slightly negative. The emergence of positive motor action rewards is surprising because it can make agents too active and inefficient in foraging. In environments with poor and poisonous foods, the evolution of rewards for less important foods tends to be unstable, while rewards for normal foods are still stable. These results demonstrate the usefulness of our simulation environment and energy-dependent birth and death model for further studies of the origin of reward systems.
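The abstract describes agents that maintain an energy level, inherit a mutated reward function from their parent, and learn via reinforcement learning, with energy-dependent birth and death. The sketch below is a minimal illustration of that loop, not the authors' implementation: all names, parameter values, and the simple value-update rule are assumptions chosen for brevity.

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    # Evolvable reward weights: subjective reward for eating food and
    # reward (typically a cost) per motor action.
    food_reward: float
    motor_reward: float
    energy: float = 5.0
    # Simple running-average values for acting vs. resting, updated
    # from the agent's own (evolved) reward signal, not the true energy change.
    act_value: float = 0.0
    rest_value: float = 0.0

def mutate(w: float, rng: random.Random, sigma: float = 0.1) -> float:
    # Child inherits the parent's weight plus Gaussian mutation noise.
    return w + rng.gauss(0.0, sigma)

def step_agent(agent: Agent, rng: random.Random, food_prob: float = 0.5,
               move_cost: float = 0.3, food_energy: float = 1.0,
               lr: float = 0.1, eps: float = 0.1) -> None:
    # Epsilon-greedy choice between foraging (act) and resting.
    if rng.random() < eps:
        act = rng.random() < 0.5
    else:
        act = agent.act_value >= agent.rest_value
    if act:
        found = rng.random() < food_prob
        agent.energy += (food_energy if found else 0.0) - move_cost
        r = (agent.food_reward if found else 0.0) + agent.motor_reward
        agent.act_value += lr * (r - agent.act_value)
    else:
        agent.energy -= 0.1  # basal metabolism while resting
        agent.rest_value += lr * (0.0 - agent.rest_value)

def run(steps: int = 200, init_pop: int = 30,
        birth_energy: float = 8.0, seed: int = 0) -> list[Agent]:
    rng = random.Random(seed)
    # Reward weights start random; selection acts on them indirectly,
    # through how well the learned behavior sustains energy.
    pop = [Agent(food_reward=rng.uniform(-1, 1),
                 motor_reward=rng.uniform(-1, 1)) for _ in range(init_pop)]
    for _ in range(steps):
        newborns = []
        for a in pop:
            step_agent(a, rng)
            if a.energy >= birth_energy:
                a.energy -= 4.0  # reproduction cost
                newborns.append(Agent(food_reward=mutate(a.food_reward, rng),
                                      motor_reward=mutate(a.motor_reward, rng),
                                      energy=4.0))
        # Energy-dependent death: agents at or below zero energy are removed.
        pop = [a for a in pop + newborns if a.energy > 0]
        if not pop:
            break
    return pop
```

Under this scheme, lineages whose evolved `food_reward` and `motor_reward` induce energy-efficient foraging reproduce more, which is the selection pressure the paper studies at far larger scale.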
Related papers
- Evolution of Fear and Social Rewards in Prey-Predator Relationship [1.9928758704251783]
Fear is a critical brain function for detecting danger and learning to avoid specific stimuli that can lead to danger. To investigate the relationship between environmental conditions, the evolution of fear, and the evolution of other rewards, we developed a distributed evolutionary simulation. Surprisingly, our simulation revealed that a social reward for observing conspecifics is more important for prey survival.
arXiv Detail & Related papers (2025-07-14T07:27:18Z) - Avoiding Death through Fear Intrinsic Conditioning [48.07595141865156]
We introduce an intrinsic reward function inspired by early amygdala development and produce this intrinsic reward through a novel memory-augmented neural network architecture. We show how this intrinsic motivation serves to deter exploration of terminal states and results in avoidance behavior similar to fear conditioning observed in animals.
arXiv Detail & Related papers (2025-06-05T19:24:51Z) - Emergent kin selection of altruistic feeding via non-episodic neuroevolution [2.296343533657165]
We present the first demonstration of kin selection emerging naturally within a population of agents undergoing continuous neuroevolution.
Specifically, we find that zero-sum transfer of resources from parents to their infant offspring evolves through kin selection in environments where it is hard for offspring to survive alone.
In an additional experiment, we show that kin selection in our simulations relies on a combination of kin recognition and population viscosity.
arXiv Detail & Related papers (2024-11-15T19:17:51Z) - Continuously evolving rewards in an open-ended environment [0.0]
RULE (Reward Updating through Learning and Expectation) is tested in a simplified ecosystem-like setting.
The population of entities successfully abandons an initially rewarded but ultimately detrimental behaviour.
These adjustments happen through endogenous modification of the entities' underlying reward functions, during continuous learning, without external intervention.
arXiv Detail & Related papers (2024-05-02T13:07:56Z) - Go Beyond Imagination: Maximizing Episodic Reachability with World Models [68.91647544080097]
In this paper, we introduce a new intrinsic reward design called GoBI - Go Beyond Imagination.
We apply learned world models to generate predicted future states with random actions.
Our method greatly outperforms previous state-of-the-art methods on 12 of the most challenging Minigrid navigation tasks.
arXiv Detail & Related papers (2023-08-25T20:30:20Z) - Learning Goal-based Movement via Motivational-based Models in Cognitive Mobile Robots [58.720142291102135]
Humans have needs motivating their behavior according to intensity and context.
We also create preferences associated with each action's perceived pleasure, which is susceptible to changes over time.
This makes decision-making more complex, requiring learning to balance needs and preferences according to the context.
arXiv Detail & Related papers (2023-02-20T04:52:24Z) - Tiered Reward: Designing Rewards for Specification and Fast Learning of Desired Behavior [13.409265335314169]
Tiered Reward is a class of environment-independent reward functions.
We show it is guaranteed to induce policies that are optimal according to our preference relation.
arXiv Detail & Related papers (2022-12-07T15:55:00Z) - Automatic Reward Design via Learning Motivation-Consistent Intrinsic Rewards [46.068337522093096]
We introduce the concept of motivation which captures the underlying goal of maximizing certain rewards.
Our method performs better than the state-of-the-art methods in handling problems of delayed reward, exploration, and credit assignment.
arXiv Detail & Related papers (2022-07-29T14:52:02Z) - The Introspective Agent: Interdependence of Strategy, Physiology, and Sensing for Embodied Agents [51.94554095091305]
We argue for an introspective agent, which considers its own abilities in the context of its environment.
Just as in nature, we hope to reframe strategy as one tool, among many, to succeed in an environment.
arXiv Detail & Related papers (2022-01-02T20:14:01Z) - Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z) - Novelty Search makes Evolvability Inevitable [62.997667081978825]
We show that Novelty Search implicitly creates a pressure for high evolvability even in bounded behavior spaces.
We show that, throughout the search, the dynamic evaluation of novelty rewards individuals that are highly mobile in the behavior space.
arXiv Detail & Related papers (2020-05-13T09:32:07Z) - Mimicking Evolution with Reinforcement Learning [10.35437633064506]
We argue that the path to developing artificial human-like intelligence will pass through mimicking the evolutionary process in a nature-like simulation.
This work proposes Evolution via Evolutionary Reward (EvER) that allows learning to single-handedly drive the search for policies with increasingly evolutionary fitness.
arXiv Detail & Related papers (2020-03-31T18:16:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.