Learning Task-relevant Representations for Generalization via
Characteristic Functions of Reward Sequence Distributions
- URL: http://arxiv.org/abs/2205.10218v1
- Date: Fri, 20 May 2022 14:52:03 GMT
- Title: Learning Task-relevant Representations for Generalization via
Characteristic Functions of Reward Sequence Distributions
- Authors: Rui Yang, Jie Wang, Zijie Geng, Mingxuan Ye, Shuiwang Ji, Bin Li, Feng
Wu
- Abstract summary: Generalization across different environments with the same tasks is critical for successful applications of visual reinforcement learning.
We propose a novel approach, Characteristic Reward Sequence Prediction (CRESP), to extract task-relevant information.
Experiments demonstrate that CRESP significantly improves generalization performance on unseen environments.
- Score: 63.773813221460614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization across different environments with the same tasks is
critical for successful applications of visual reinforcement learning (RL) in
real scenarios. However, visual distractions -- which are common in real
scenes -- in high-dimensional observations can harm the learned
representations in visual RL, thus degrading generalization performance. To
tackle this problem, we propose a novel approach, namely Characteristic Reward
Sequence Prediction (CRESP), to extract task-relevant information by learning
reward sequence distributions (RSDs), as reward signals are task-relevant in
RL and invariant to visual distractions. Specifically, to effectively capture
the task-relevant information via RSDs, CRESP introduces an auxiliary task --
predicting the characteristic functions of RSDs -- to learn task-relevant
representations, because high-dimensional distributions can be well
approximated by leveraging their characteristic functions. Experiments
demonstrate that CRESP significantly improves generalization performance on
unseen environments, outperforming several state-of-the-art methods on
DeepMind Control tasks with different visual distractions.
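To make the characteristic-function idea concrete, below is a minimal sketch of such an auxiliary objective. This is not the authors' released code: the names (`encoder`, `CharFnPredictor`, `cresp_aux_loss`) and all hyperparameters are illustrative assumptions. The key point is that regressing onto the empirical value exp(i<omega, R>) for sampled reward sequences R estimates E[exp(i<omega, R>)], i.e. the characteristic function of the RSD, without ever modeling the high-dimensional distribution directly.

```python
import torch
import torch.nn as nn

H = 8         # assumed reward-sequence length
N_OMEGA = 16  # assumed number of sampled frequency vectors

class CharFnPredictor(nn.Module):
    """Predicts the real and imaginary parts of the characteristic
    function of the reward sequence distribution, given a state
    representation z and a frequency vector omega."""
    def __init__(self, repr_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(repr_dim + H, 256), nn.ReLU(),
            nn.Linear(256, 2),  # (Re, Im)
        )

    def forward(self, z, omega):
        return self.net(torch.cat([z, omega], dim=-1))

def cresp_aux_loss(encoder, predictor, obs, reward_seq):
    """obs: batch of observations; reward_seq: (B, H) float rewards
    sampled under the behavior policy. The per-sample target
    exp(i<omega, R>) = (cos<omega, R>, sin<omega, R>) has the
    characteristic function as its conditional expectation."""
    z = encoder(obs)                                     # (B, repr_dim)
    omega = torch.randn(N_OMEGA, H, device=z.device)     # sampled frequencies
    dot = reward_seq @ omega.t()                         # <omega, R>: (B, N_OMEGA)
    target = torch.stack([torch.cos(dot), torch.sin(dot)], dim=-1)
    z_rep = z.unsqueeze(1).expand(-1, N_OMEGA, -1)       # (B, N_OMEGA, repr_dim)
    om_rep = omega.unsqueeze(0).expand(z.size(0), -1, -1)
    pred = predictor(z_rep, om_rep)                      # (B, N_OMEGA, 2)
    return ((pred - target) ** 2).mean()
```

Sampling many random frequency vectors omega gives a tractable surrogate for matching the full distribution of reward sequences, which is what makes this auxiliary objective practical for high-dimensional RSDs.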
Related papers
- Intrinsic Dynamics-Driven Generalizable Scene Representations for Vision-Oriented Decision-Making Applications [0.21051221444478305]
How to improve scene representation is a key issue in vision-oriented decision-making applications.
We propose an intrinsic dynamics-driven representation learning method with sequence models in visual reinforcement learning.
arXiv Detail & Related papers (2024-05-30T06:31:03Z)
- Improving Reinforcement Learning Efficiency with Auxiliary Tasks in Non-Visual Environments: A Comparison [0.0]
This study compares common auxiliary tasks on top of what is, to the best of our knowledge, the only decoupled representation learning method for low-dimensional, non-visual observations.
Our findings show that representation learning with auxiliary tasks only provides performance gains in sufficiently complex environments.
arXiv Detail & Related papers (2023-10-06T13:22:26Z)
- Sequential Action-Induced Invariant Representation for Reinforcement Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z)
- Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution [98.67737684075587]
Generalization in partially observable Markov decision processes (POMDPs) is critical for successful applications of visual reinforcement learning (VRL).
We propose the reward sequence distribution conditioned on the starting observation and the predefined subsequent action sequence (RSD-OA).
Experiments demonstrate that our representation learning approach based on RSD-OA significantly improves the generalization performance on unseen environments.
arXiv Detail & Related papers (2023-02-19T15:47:24Z)
- Task Formulation Matters When Learning Continually: A Case Study in Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z)
- Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation [87.1188556802942]
We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting.
We propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions.
Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain.
arXiv Detail & Related papers (2021-05-17T13:42:09Z)
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
- Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and the DeepMind Control suite; a rough sketch of this idea follows below.
arXiv Detail & Related papers (2021-02-22T13:04:18Z)
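The return-discrimination objective summarized in the entry above can be sketched as a supervised-contrastive loss. This is an illustration, not the paper's algorithm: `encode_sa`, the quantile binning of returns, and the temperature are all assumptions.

```python
import torch
import torch.nn.functional as F

def return_contrastive_loss(encode_sa, states, actions, returns,
                            n_bins: int = 10, temp: float = 0.1):
    """Pulls together representations of state-action pairs whose
    (discretized) returns match and pushes apart the rest.
    returns: (B,) float tensor of observed returns."""
    z = F.normalize(encode_sa(states, actions), dim=-1)    # (B, D)
    sim = z @ z.t() / temp                                 # pairwise similarities
    # Discretize returns into quantile bins; same bin => positive pair.
    edges = torch.quantile(returns, torch.linspace(0, 1, n_bins + 1)[1:-1])
    bins = torch.bucketize(returns, edges)
    pos = (bins.unsqueeze(0) == bins.unsqueeze(1)).float()
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = pos.masked_fill(self_mask, 0.0)                  # drop self-pairs
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(self_mask, float('-inf')), dim=1, keepdim=True)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()
```

Binning by return quantiles is one simple way to turn a continuous return signal into positive/negative pair labels; the paper's actual construction of discriminated pairs may differ.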
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.