Regularity as Intrinsic Reward for Free Play
- URL: http://arxiv.org/abs/2312.01473v1
- Date: Sun, 3 Dec 2023 18:18:44 GMT
- Title: Regularity as Intrinsic Reward for Free Play
- Authors: Cansu Sancaktar, Justus Piater, Georg Martius
- Abstract summary: We propose regularity as a novel reward signal for intrinsically-motivated reinforcement learning.
Our generalized formulation of Regularity as Intrinsic Reward (RaIR) allows us to operationalize it within model-based reinforcement learning.
- Score: 24.29379265146469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose regularity as a novel reward signal for intrinsically-motivated
reinforcement learning. Taking inspiration from child development, we postulate
that striving for structure and order helps guide exploration towards a
subspace of tasks that are not favored by naive uncertainty-based intrinsic
rewards. Our generalized formulation of Regularity as Intrinsic Reward (RaIR)
allows us to operationalize it within model-based reinforcement learning. In a
synthetic environment, we showcase the plethora of structured patterns that can
emerge from pursuing this regularity objective. We also demonstrate the
strength of our method in a multi-object robotic manipulation environment. We
incorporate RaIR into free play and use it to complement the model's epistemic
uncertainty as an intrinsic reward. Doing so, we witness the autonomous
construction of towers and other regular structures during free play, which
leads to a substantial improvement in zero-shot downstream task performance on
assembly tasks.
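Below is a minimal sketch of how a regularity reward could be computed, assuming regularity is scored as the negative entropy of discretized pairwise object distances. This is only one possible instantiation of the generalized RaIR formulation; the choice of pairwise distances as the relational feature, the bin size, and the toy example are illustrative assumptions rather than the paper's exact recipe. The paper additionally combines this term with the model's epistemic uncertainty during free play, which is omitted here.

```python
import numpy as np
from collections import Counter

def regularity_reward(object_positions, bin_size=0.05):
    """Toy regularity score: negative entropy of the empirical distribution
    of discretized pairwise object distances.

    Arrangements with many repeated relations (e.g., equally spaced objects
    in a line or a tower) have low entropy and therefore high reward.
    The relational feature (pairwise distance) and `bin_size` are
    illustrative assumptions, not the paper's exact formulation.
    """
    n = len(object_positions)
    relations = []
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(object_positions[i] - object_positions[j])
            relations.append(round(d / bin_size))  # discretize the relation
    counts = np.array(list(Counter(relations).values()), dtype=float)
    probs = counts / counts.sum()
    entropy = -(probs * np.log(probs)).sum()
    return -entropy  # more regular arrangements score higher

# Example: three equally spaced objects on a line vs. a random scatter.
rng = np.random.default_rng(0)
line = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0]])
scatter = rng.uniform(0.0, 1.0, size=(3, 2))
print(regularity_reward(line), regularity_reward(scatter))
```

Under this toy score, the equally spaced line (repeated relations) receives a higher reward than the random scatter, which is the kind of structured arrangement the regularity objective is meant to steer free play towards.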
Related papers
- Random Latent Exploration for Deep Reinforcement Learning [71.88709402926415]
This paper introduces a new exploration technique called Random Latent Exploration (RLE)
RLE combines the strengths of bonus-based and noise-based (two popular approaches for effective exploration in deep RL) exploration strategies.
We evaluate RLE on the challenging Atari and IsaacGym benchmarks and show that it achieves higher overall scores across these tasks than other approaches.
arXiv Detail & Related papers (2024-07-18T17:55:22Z) - Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning [55.2080971216584]
We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL)
We develop an intrinsic reward toolkit to provide efficient and reliable implementations of diverse intrinsic reward approaches.
arXiv Detail & Related papers (2023-01-26T01:06:46Z) - Foundation Models for Semantic Novelty in Reinforcement Learning [32.707788771181676]
Our intrinsic reward is defined based on pre-trained CLIP embeddings without any fine-tuning or learning on the target RL task.
We demonstrate that CLIP-based intrinsic rewards can drive exploration towards semantically meaningful states and outperform state-of-the-art methods in challenging sparse-reward procedurally-generated environments.
arXiv Detail & Related papers (2022-11-09T13:34:45Z) - Reward Learning using Structural Motifs in Inverse Reinforcement Learning [3.04585143845864]
The Inverse Reinforcement Learning (IRL) problem has seen rapid evolution in the past few years, with important applications in domains like robotics, cognition, and health.
We explore the inefficacy of current IRL methods in learning an agent's reward function from expert trajectories depicting long-horizon, complex sequential tasks.
We propose a novel IRL method, SMIRL, that first learns the (approximate) structure of a task as a finite-state-automaton (FSA), then uses the structural motif to solve the IRL problem.
arXiv Detail & Related papers (2022-09-25T18:34:59Z) - Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z) - Hindsight Reward Tweaking via Conditional Deep Reinforcement Learning [37.61951923445689]
We propose a novel paradigm for deep reinforcement learning to model the influences of reward functions within a near-optimal space.
We demonstrate the feasibility of this approach and study one of its potential applications: boosting policy performance on multiple MuJoCo tasks.
arXiv Detail & Related papers (2021-09-06T10:06:48Z) - Self-Supervised Exploration via Latent Bayesian Surprise [4.088019409160893]
In this work, we propose a curiosity-based bonus as intrinsic reward for Reinforcement Learning.
We extensively evaluate our model by measuring the agent's performance in terms of environment exploration.
Our model is cheap and empirically shows state-of-the-art performance on several problems.
arXiv Detail & Related papers (2021-04-15T14:40:16Z) - Bridging the Imitation Gap by Adaptive Insubordination [88.35564081175642]
We show that when the teaching agent makes decisions with access to privileged information, this information is marginalized during imitation learning.
We propose 'Adaptive Insubordination' (ADVISOR) to address this gap.
ADVISOR dynamically weights imitation and reward-based reinforcement learning losses during training, enabling on-the-fly switching between imitation and exploration.
arXiv Detail & Related papers (2020-07-23T17:59:57Z) - Learning with AMIGo: Adversarially Motivated Intrinsic Goals [63.680207855344875]
AMIGo is a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals.
We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks.
arXiv Detail & Related papers (2020-06-22T10:22:08Z) - Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
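As a rough illustration of the last entry's objective, here is a minimal sketch that estimates the mutual information between a goal-state variable and a controllable-state variable from sampled data with a histogram estimator. The 1-D variables, the binning, and the estimator are assumptions made for illustration; the cited paper maximizes this quantity as an intrinsic objective rather than estimating it this way.

```python
import numpy as np

def mutual_information(goal_states, controllable_states, bins=8):
    """Histogram estimate of I(S_goal; S_controllable) from sampled 1-D states.
    A high value means the controllable state carries information about the
    goal state, which is the quantity used as an intrinsic objective above.
    """
    joint, _, _ = np.histogram2d(goal_states, controllable_states, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal over goal states
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal over controllable states
    mask = p_xy > 0
    return float((p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])).sum())

# Toy usage: correlated vs. independent goal/controllable states.
rng = np.random.default_rng(0)
ctrl = rng.normal(size=5000)
print(mutual_information(ctrl + 0.1 * rng.normal(size=5000), ctrl))  # high MI
print(mutual_information(rng.normal(size=5000), ctrl))               # near zero
```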