Intrinsically Motivated Self-supervised Learning in Reinforcement
Learning
- URL: http://arxiv.org/abs/2106.13970v1
- Date: Sat, 26 Jun 2021 08:43:28 GMT
- Title: Intrinsically Motivated Self-supervised Learning in Reinforcement
Learning
- Authors: Yue Zhao, Chenzhuang Du, Hang Zhao, Tiejun Li
- Abstract summary: In vision-based reinforcement learning (RL) tasks, it is prevalent to assign the auxiliary task with a surrogate self-supervised loss.
We present a simple yet effective idea to employ self-supervised loss as an intrinsic reward, called Intrinsically Motivated Self-Supervised learning in Reinforcement learning (IM-SSR).
We show that the self-supervised loss can be decomposed as exploration for novel states and robustness improvement from nuisance elimination.
- Score: 15.809835721792687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In vision-based reinforcement learning (RL) tasks, it is prevalent to assign
the auxiliary task with a surrogate self-supervised loss so as to obtain more
semantic representations and improve sample efficiency. However, abundant
information in self-supervised auxiliary tasks has been disregarded, since the
representation learning part and the decision-making part are separated. To
sufficiently utilize information in the auxiliary task, we present a simple yet
effective idea to employ self-supervised loss as an intrinsic reward, called
Intrinsically Motivated Self-Supervised learning in Reinforcement learning
(IM-SSR). We formally show that the self-supervised loss can be decomposed as
exploration for novel states and robustness improvement from nuisance
elimination. IM-SSR can be effortlessly plugged into any reinforcement learning
with self-supervised auxiliary objectives with nearly no additional cost.
Combined with IM-SSR, the previous underlying algorithms achieve salient
improvements on both sample efficiency and generalization in various
vision-based robotics tasks from the DeepMind Control Suite, especially when
the reward signal is sparse.
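Concretely, IM-SSR reuses the per-transition self-supervised auxiliary loss as a reward bonus on top of the environment reward. The minimal Python sketch below illustrates only that shaping step; the scale `beta`, the batch shapes, and the commented helper names (`contrastive_loss`, `encoder`, `augment`) are illustrative assumptions rather than the authors' code.

```python
import torch

def intrinsic_reward(ssl_loss: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    """Turn a per-transition self-supervised loss into an intrinsic reward.

    A large auxiliary loss flags an observation the encoder has not yet
    fitted well, so it doubles as a novelty signal for exploration.
    """
    # Detach so the bonus shapes the policy/critic update without
    # back-propagating through the encoder a second time.
    return beta * ssl_loss.detach()

# Illustrative use inside a generic actor-critic step (names hypothetical):
#   ssl_loss = contrastive_loss(encoder(obs), encoder(augment(obs)))  # shape [B]
#   shaped_reward = env_reward + intrinsic_reward(ssl_loss)
#   ...then train the critic on shaped_reward as usual.

if __name__ == "__main__":
    batch_ssl_loss = torch.rand(32)  # stand-in per-transition SSL losses
    sparse_reward = torch.zeros(32)  # sparse extrinsic reward, the paper's hardest setting
    print((sparse_reward + intrinsic_reward(batch_ssl_loss)).mean())
```

Because the bonus is just the loss the agent already computes for its auxiliary task, the shaping adds essentially no extra computation, matching the paper's "nearly no additional cost" claim.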
Related papers
- Auxiliary Reward Generation with Transition Distance Representation Learning [20.150691753213817]
Reinforcement learning (RL) has shown its strength in challenging sequential decision-making problems.
The reward function in RL is crucial to the learning performance, as it serves as a measure of the task completion degree.
We propose a novel representation learning approach that can measure the "transition distance" between states.
arXiv Detail & Related papers (2024-02-12T05:13:44Z)
- Augmenting Unsupervised Reinforcement Learning with Self-Reference [63.68018737038331]
Humans possess the ability to draw on past experiences explicitly when learning new tasks.
We propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information.
Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark.
arXiv Detail & Related papers (2023-11-16T09:07:34Z)
- Sequential Action-Induced Invariant Representation for Reinforcement Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z)
- Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z)
- ReIL: A Framework for Reinforced Intervention-based Imitation Learning [3.0846824529023387]
We introduce Reinforced Intervention-based Learning (ReIL), a framework consisting of a general intervention-based learning algorithm and a multi-task imitation learning model.
Experimental results from real world mobile robot navigation challenges indicate that ReIL learns rapidly from sparse supervisor corrections without suffering deterioration in performance.
arXiv Detail & Related papers (2022-03-29T09:30:26Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z)
- Evaluating the Robustness of Self-Supervised Learning in Medical Imaging [57.20012795524752]
Self-supervision has been demonstrated to be an effective learning strategy when training target tasks on small annotated datasets.
We show that networks trained via self-supervised learning have superior robustness and generalizability compared to fully-supervised learning in the context of medical imaging.
arXiv Detail & Related papers (2021-05-14T17:49:52Z)
- Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and the DeepMind Control Suite.
arXiv Detail & Related papers (2021-02-22T13:04:18Z)
- Bridging the Imitation Gap by Adaptive Insubordination [88.35564081175642]
We show that when the teaching agent makes decisions with access to privileged information, this information is marginalized during imitation learning.
We propose 'Adaptive Insubordination' (ADVISOR) to address this gap.
ADVISOR dynamically weights imitation and reward-based reinforcement learning losses during training, enabling on-the-fly switching between imitation and exploration (a minimal sketch of this weighting appears after this list).
arXiv Detail & Related papers (2020-07-23T17:59:57Z)
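As a concrete reading of the ADVISOR entry above, the sketch below blends a per-state imitation loss with a reward-based RL loss using a per-state weight. Estimating that weight (from how well the expert's privileged advice can be imitated in each state) is the paper's actual contribution; here the weight is a placeholder input, and all names are illustrative.

```python
import torch

def advisor_style_loss(
    imitation_loss: torch.Tensor,  # per-state imitation (distillation) loss
    rl_loss: torch.Tensor,         # per-state reward-based RL loss
    weight: torch.Tensor,          # per-state blend factor in [0, 1]
) -> torch.Tensor:
    """Dynamically weighted mixture of imitation and RL objectives.

    NOTE: ADVISOR estimates the weight per state; a random weight here
    is purely a stand-in for that estimator.
    """
    return (weight * imitation_loss + (1.0 - weight) * rl_loss).mean()

if __name__ == "__main__":
    il, rl = torch.rand(8), torch.rand(8)
    w = torch.sigmoid(torch.randn(8))  # stand-in per-state weights
    print(advisor_style_loss(il, rl, w))
```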
This list is automatically generated from the titles and abstracts of the papers on this site.