Learning Generalizable Representations for Reinforcement Learning via
Adaptive Meta-learner of Behavioral Similarities
- URL: http://arxiv.org/abs/2212.13088v1
- Date: Mon, 26 Dec 2022 11:11:23 GMT
- Title: Learning Generalizable Representations for Reinforcement Learning via
Adaptive Meta-learner of Behavioral Similarities
- Authors: Jianda Chen, Sinno Jialin Pan
- Abstract summary: We propose a novel meta-learner-based framework for representation learning regarding behavioral similarities for reinforcement learning.
We empirically demonstrate that our proposed framework outperforms state-of-the-art baselines on several benchmarks.
- Score: 43.327357653393015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How to learn an effective reinforcement learning-based model for control
tasks from high-level visual observations is a practical and challenging
problem. A key to solving this problem is to learn low-dimensional state
representations from observations, from which an effective policy can be
learned. To boost the learning of state encodings, recent works focus on
capturing behavioral similarities between state representations or on applying
data augmentation to visual observations. In this paper, we propose a
novel meta-learner-based framework for representation learning regarding
behavioral similarities for reinforcement learning. Specifically, our framework
encodes the high-dimensional observations into two decomposed embeddings
regarding reward and dynamics in a Markov Decision Process (MDP). A pair of
meta-learners are developed, one of which quantifies the reward similarity and
the other quantifies dynamics similarity over the correspondingly decomposed
embeddings. The meta-learners are trained in a self-supervised manner to update
the state embeddings by approximating two disjoint terms of the on-policy
bisimulation metric. To
incorporate the reward and dynamics terms, we further develop a strategy to
adaptively balance their impacts based on different tasks or environments. We
empirically demonstrate that our proposed framework outperforms
state-of-the-art baselines on several benchmarks, including the conventional DM
Control Suite, the Distracting DM Control Suite, and a self-driving task in CARLA.
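For context, the on-policy bisimulation metric the abstract refers to has, in the standard recursive form used in the pi-bisimulation literature (e.g., Castro, 2020), exactly the two disjoint terms the meta-learners approximate, with W_1 the 1-Wasserstein distance:

```latex
d^{\pi}(s_i, s_j) \;=\;
    \underbrace{\bigl|\, r^{\pi}_{s_i} - r^{\pi}_{s_j} \,\bigr|}_{\text{reward term}}
  \;+\; \gamma\,
    \underbrace{W_1\!\bigl(d^{\pi}\bigr)\!\left(\mathcal{P}^{\pi}_{s_i},\, \mathcal{P}^{\pi}_{s_j}\right)}_{\text{dynamics term}}
```

A minimal PyTorch-style sketch of how such a framework could be wired: an encoder that decomposes each observation into a reward embedding and a dynamics embedding, plus two learned similarity heads (the "meta-learners") regressed onto the two terms above. All module names, layer sizes, the dynamics target, and the balancing weight `beta` are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class DecomposedEncoder(nn.Module):
    """Encode an observation into two embeddings: reward-related and dynamics-related."""
    def __init__(self, obs_dim: int = 128, emb_dim: int = 32):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        self.reward_head = nn.Linear(256, emb_dim)
        self.dynamics_head = nn.Linear(256, emb_dim)

    def forward(self, obs: torch.Tensor):
        h = self.trunk(obs)
        return self.reward_head(h), self.dynamics_head(h)

class SimilarityMetaLearner(nn.Module):
    """A learned similarity over embedding pairs (one instance per bisimulation term)."""
    def __init__(self, emb_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * emb_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, z1: torch.Tensor, z2: torch.Tensor):
        return self.net(torch.cat([z1, z2], dim=-1)).squeeze(-1)

def bisim_loss(enc, sim_r, sim_d, obs_i, obs_j, r_i, r_j, dyn_target, beta: float):
    """Regress each meta-learner onto its term of the bisimulation metric.

    `dyn_target` stands in for a Wasserstein-style distance between next-state
    distributions; `beta` is a fixed stand-in for the paper's adaptive
    reward/dynamics balancing strategy.
    """
    zr_i, zd_i = enc(obs_i)
    zr_j, zd_j = enc(obs_j)
    loss_r = (sim_r(zr_i, zr_j) - (r_i - r_j).abs()).pow(2).mean()  # reward term
    loss_d = (sim_d(zd_i, zd_j) - dyn_target).pow(2).mean()          # dynamics term
    return beta * loss_r + (1.0 - beta) * loss_d

# Usage on a random batch, purely to show the shapes involved.
enc = DecomposedEncoder()
sim_r, sim_d = SimilarityMetaLearner(), SimilarityMetaLearner()
obs_i, obs_j = torch.randn(8, 128), torch.randn(8, 128)
r_i, r_j, dyn_target = torch.randn(8), torch.randn(8), torch.rand(8)
loss = bisim_loss(enc, sim_r, sim_d, obs_i, obs_j, r_i, r_j, dyn_target, beta=0.5)
loss.backward()
```

In the paper the reward/dynamics balance is adapted per task or environment; here `beta` is simply passed in as a scalar to keep the sketch self-contained.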
Related papers
- MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning [8.61492882526007]
In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges to sample efficiency.
We introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking.
Our evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency.
arXiv Detail & Related papers (2024-09-02T18:57:53Z)
- Self-Supervised Representation Learning with Meta Comprehensive Regularization [11.387994024747842]
We introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks.
We update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features.
We provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives.
arXiv Detail & Related papers (2024-03-03T15:53:48Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning [21.59254848913971]
Offline meta-reinforcement learning is a reinforcement learning paradigm that learns from offline data to adapt to new tasks.
We propose a contrastive learning framework for task representations that are robust to the distribution of behavior policies at training and test time.
Experiments on a variety of offline meta-reinforcement learning benchmarks demonstrate the advantages of our method over prior methods.
arXiv Detail & Related papers (2022-06-21T14:46:47Z)
- Adaptive Hierarchical Similarity Metric Learning with Noisy Labels [138.41576366096137]
We propose an Adaptive Hierarchical Similarity Metric Learning method.
It considers two types of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency.
Our method achieves state-of-the-art performance compared with current deep metric learning approaches.
arXiv Detail & Related papers (2021-10-29T02:12:18Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment to learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
- Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
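The last entry above describes a concrete reduction: summarize each task's dataset into a feature vector, treat the task's target model as the label, and fit an ordinary supervised learner over tasks. A minimal sketch under assumed choices (mean/std featurization, ridge-regression weights as the "label"); none of these choices come from the paper:

```python
import numpy as np
from sklearn.linear_model import Ridge

def task_features(X, y):
    """Summarize a task's dataset (X, y) as a fixed-length feature vector."""
    return np.concatenate([X.mean(axis=0), X.std(axis=0), [y.mean(), y.std()]])

def target_model_params(X, y):
    """The 'label': parameters of the per-task target model (here, ridge weights)."""
    return Ridge(alpha=1.0).fit(X, y).coef_

# Build a supervised dataset over tasks: (dataset summary) -> (model parameters).
rng = np.random.default_rng(0)
tasks = []
for _ in range(50):
    w = rng.normal(size=5)                  # each task is a random linear problem
    X = rng.normal(size=(20, 5))
    y = X @ w + 0.1 * rng.normal(size=20)
    tasks.append((task_features(X, y), target_model_params(X, y)))

F = np.stack([f for f, _ in tasks])         # task-level "features"
W = np.stack([w for _, w in tasks])         # task-level "labels"
meta_model = Ridge(alpha=1.0).fit(F, W)     # meta-learning as supervised learning
```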