Related papers: Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning

Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning

URL: http://arxiv.org/abs/2201.07016v1
Date: Tue, 18 Jan 2022 14:28:30 GMT
Title: Accelerating Representation Learning with View-Consistent Dynamics in Data-Efficient Reinforcement Learning
Authors: Tao Huang, Jiachen Wang, Xiao Chen
Abstract summary: We propose to accelerate state representation learning by enforcing view-consistency on the dynamics. We introduce a formalism of Multi-view Markov Decision Process (MMDP) that incorporates multiple views of the state. Following the structure of MMDP, our method, View-Consistent Dynamics (VCD), learns state representations by training a view-consistent dynamics model in the latent space.
Score: 12.485293708638292
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning informative representations from image-based observations is of fundamental concern in deep Reinforcement Learning (RL). However, data-inefficiency remains a significant barrier to this objective. To overcome this obstacle, we propose to accelerate state representation learning by enforcing view-consistency on the dynamics. Firstly, we introduce a formalism of Multi-view Markov Decision Process (MMDP) that incorporates multiple views of the state. Following the structure of MMDP, our method, View-Consistent Dynamics (VCD), learns state representations by training a view-consistent dynamics model in the latent space, where views are generated by applying data augmentation to states. Empirical evaluation on DeepMind Control Suite and Atari-100k demonstrates VCD to be the SoTA data-efficient algorithm on visual control tasks.

Related papers

MInCo: Mitigating Information Conflicts in Distracted Visual Model-based Reinforcement Learning [29.087810262499634]
Existing visual model-based reinforcement learning (MBRL) algorithms with observation reconstruction often suffer from information conflicts. We present MInCo, which mitigates information conflicts by leveraging negative-free contrastive learning. We evaluate our method on several robotic control tasks with dynamic background distractions.
arXiv Detail & Related papers (2025-04-05T12:57:31Z)
Mitigating Visual Knowledge Forgetting in MLLM Instruction-tuning via Modality-decoupled Gradient Descent [72.1517476116743]
Recent MLLMs have shown emerging visual understanding and reasoning abilities after being pre-trained on large-scale multimodal datasets. Existing approaches, such as direct fine-tuning and continual learning methods, fail to explicitly address this issue. We introduce a novel perspective leveraging effective rank to quantify the degradation of visual representation forgetting. We propose a modality-decoupled gradient descent (MDGD) method that regulates gradient updates to maintain the effective rank of visual representations.
arXiv Detail & Related papers (2025-02-17T12:26:34Z)
Learning Fused State Representations for Control from Multi-View Observations [19.862313754887648]
Multi-view Reinforcement Learning (MVRL) seeks to provide agents with multi-view observations, enabling them to perceive environment with greater effectiveness and precision. Recent advancements in MVRL focus on extracting latent representations from multiview observations and leveraging them in control tasks. We propose Multi-view Fusion State for Control (MFSC), firstly incorporating bisimulation metric learning into MVRL to learn task-relevant representations.
arXiv Detail & Related papers (2025-02-03T12:46:02Z)
Dynamical-VAE-based Hindsight to Learn the Causal Dynamics of Factored-POMDPs [9.662551514840388]
We introduce a Dynamical Variational Auto-Encoder (DVAE) designed to learn causal Markovian dynamics from offline trajectories. Our method employs an extended hindsight framework that integrates past, current, and multi-step future information. Empirical results reveal that this approach uncovers the causal graph governing hidden state transitions more effectively than history-based and typical hindsight-based models.
arXiv Detail & Related papers (2024-11-12T14:27:45Z)
MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning [8.61492882526007]
In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges on sample efficiency. We introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking. Our evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency.
arXiv Detail & Related papers (2024-09-02T18:57:53Z)
Intrinsic Dynamics-Driven Generalizable Scene Representations for Vision-Oriented Decision-Making Applications [0.21051221444478305]
How to improve the ability of scene representation is a key issue in vision-oriented decision-making applications. We propose an intrinsic dynamics-driven representation learning method with sequence models in visual reinforcement learning.
arXiv Detail & Related papers (2024-05-30T06:31:03Z)
Learning Interpretable Policies in Hindsight-Observable POMDPs through Partially Supervised Reinforcement Learning [57.67629402360924]
We introduce the Partially Supervised Reinforcement Learning (PSRL) framework. At the heart of PSRL is the fusion of both supervised and unsupervised learning. We show that PSRL offers a potent balance, enhancing model interpretability while preserving, and often significantly outperforming, the performance benchmarks set by traditional methods.
arXiv Detail & Related papers (2024-02-14T16:23:23Z)
Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance. This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z)
VIBR: Learning View-Invariant Value Functions for Robust Visual Control [3.2307366446033945]
VIBR (View-Invariant Bellman Residuals) is a method that combines multi-view training and invariant prediction to reduce out-of-distribution gap for RL based visuomotor control. We show that VIBR outperforms existing methods on complex visuo-motor control environment with high visual perturbation.
arXiv Detail & Related papers (2023-06-14T14:37:34Z)
Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset. We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
Learning Generalizable Representations for Reinforcement Learning via Adaptive Meta-learner of Behavioral Similarities [43.327357653393015]
We propose a novel meta-learner-based framework for representation learning regarding behavioral similarities for reinforcement learning. We empirically demonstrate that our proposed framework outperforms state-of-the-art baselines on several benchmarks.
arXiv Detail & Related papers (2022-12-26T11:11:23Z)
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations [58.758928936316785]
offline reinforcement learning from visual observations with continuous action spaces remains under-explored. We show that modifications to two popular vision-based online reinforcement learning algorithms suffice to outperform existing offline RL methods.
arXiv Detail & Related papers (2022-06-09T22:08:47Z)
Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data. This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy. Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
Self-supervised Video Object Segmentation [76.83567326586162]
The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking) We make the following contributions: (i) we propose to improve the existing self-supervised approach, with a simple, yet more effective memory mechanism for long-term correspondence matching; (ii) by augmenting the self-supervised approach with an online adaptation module, our method successfully alleviates tracker drifts caused by spatial-temporal discontinuity; (iv) we demonstrate state-of-the-art results among the self-supervised approaches on DAVIS-2017 and YouTube
arXiv Detail & Related papers (2020-06-22T17:55:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.