A Reliable Representation with Bidirectional Transition Model for Visual
Reinforcement Learning Generalization
- URL: http://arxiv.org/abs/2312.01915v1
- Date: Mon, 4 Dec 2023 14:19:36 GMT
- Authors: Xiaobo Hu, Youfang Lin, Yue Liu, Jinwen Wang, Shuo Wang, Hehe Fan and
Kai Lv
- Abstract summary: We introduce a Bidirectional Transition (BiT) model, which predicts environmental transitions in both the forward and backward directions to extract reliable representations.
Our model demonstrates competitive generalization performance and sample efficiency on two settings of the DeepMind Control suite.
- Score: 39.6041403130768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual reinforcement learning has proven effective in solving control tasks
with high-dimensional observations. However, extracting reliable and
generalizable representations from vision-based observations remains a central
challenge. Inspired by the human thought process, we posit that a representation
extracted from an observation is reliable, and accurate in comprehending the
environment, when it can both predict the future and trace history. Based on
this concept, we introduce a Bidirectional Transition (BiT) model, which
predicts environmental transitions in both the forward and backward directions
to extract reliable representations. Our model
demonstrates competitive generalization performance and sample efficiency on
two settings of the DeepMind Control suite. Additionally, we utilize robotic
manipulation and CARLA simulators to demonstrate the wide applicability of our
method.
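As a minimal sketch of the bidirectional idea (not the authors' implementation; the linear encoder, transition heads, dimensions, and names below are all illustrative assumptions), a representation can be trained so that the latent of the current observation predicts the next latent given the action (forward), while the next latent predicts the current one (backward):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper).
obs_dim, latent_dim, act_dim = 16, 8, 4

# Stand-ins for a learned encoder and two transition heads (linear for brevity).
W_enc = 0.1 * rng.normal(size=(latent_dim, obs_dim))
W_fwd = 0.1 * rng.normal(size=(latent_dim, latent_dim + act_dim))
W_bwd = 0.1 * rng.normal(size=(latent_dim, latent_dim + act_dim))

def encode(obs):
    return W_enc @ obs

def bidirectional_loss(obs_t, action, obs_next):
    """Sum of forward and backward one-step prediction errors in latent space."""
    z_t, z_next = encode(obs_t), encode(obs_next)
    # Forward head: predict the next latent from the current latent and action.
    z_next_hat = W_fwd @ np.concatenate([z_t, action])
    # Backward head: predict the previous latent from the next latent and action.
    z_t_hat = W_bwd @ np.concatenate([z_next, action])
    return np.mean((z_next_hat - z_next) ** 2) + np.mean((z_t_hat - z_t) ** 2)

obs_t = rng.normal(size=obs_dim)
obs_next = rng.normal(size=obs_dim)
action = rng.normal(size=act_dim)
loss = float(bidirectional_loss(obs_t, action, obs_next))
```

In practice both prediction errors would be minimized jointly with the encoder, so the representation is shaped by both directions of the transition.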
Related papers
- MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning [8.61492882526007]
In visual Reinforcement Learning (RL), learning from pixel-based observations poses significant challenges to sample efficiency.
We introduce MOOSS, a novel framework that leverages a temporal contrastive objective with the help of graph-based spatial-temporal masking.
Our evaluation on multiple continuous and discrete control benchmarks shows that MOOSS outperforms previous state-of-the-art visual RL methods in terms of sample efficiency.
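As a minimal sketch of a temporal contrastive objective (a generic InfoNCE loss, not MOOSS itself; the graph-based spatial-temporal masking is omitted, and all names and dimensions are illustrative), adjacent frame embeddings are pulled together while other pairs in the batch act as negatives:

```python
import numpy as np

rng = np.random.default_rng(0)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE: each anchor should match its own temporally adjacent positive."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives lie on the diagonal

# Toy "frame embeddings": z_{t+1} drawn close to z_t (adjacent frames, small drift).
z_t = rng.normal(size=(32, 16))
z_next = z_t + 0.05 * rng.normal(size=(32, 16))
loss = float(info_nce(z_t, z_next))
```

Minimizing this loss encourages embeddings of temporally adjacent observations to evolve smoothly relative to the rest of the batch.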
arXiv Detail & Related papers (2024-09-02T18:57:53Z)
- AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction [56.72301849123049]
We present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ dataset challenge at CVPR 2024.
Our innovative approach involves a dual-stage framework that enhances 3D occupancy and flow predictions by incorporating adaptive forward view transformation and flow modeling.
Our method combines regression with classification to address scale variations in different scenes, and leverages predicted flow to warp current voxel features to future frames, guided by future frame ground truth.
arXiv Detail & Related papers (2024-07-01T16:32:15Z)
- Learning Interpretable Policies in Hindsight-Observable POMDPs through Partially Supervised Reinforcement Learning [57.67629402360924]
We introduce the Partially Supervised Reinforcement Learning (PSRL) framework.
At the heart of PSRL is the fusion of both supervised and unsupervised learning.
We show that PSRL offers a potent balance, enhancing model interpretability while preserving, and often significantly outperforming, the performance benchmarks set by traditional methods.
arXiv Detail & Related papers (2024-02-14T16:23:23Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on Cityscapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Visual Forecasting as a Mid-level Representation for Avoidance [8.712750753534532]
The challenge of navigation in environments with dynamic objects continues to be a central issue in the study of autonomous agents.
While predictive methods hold promise, their reliance on precise state information makes them less practical for real-world implementation.
This study presents visual forecasting as an innovative alternative.
arXiv Detail & Related papers (2023-09-17T13:32:03Z)
- Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection [33.210575826086654]
We present a weakly-supervised RGB-D salient object detection model.
We focus on effective multimodal representation learning via inter-modal mutual information regularization.
arXiv Detail & Related papers (2023-06-06T12:36:57Z)
- Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation [66.86987509942607]
We evaluate how representation pretraining should be done in imitation learning.
We consider a setting where the pretraining corpus consists of multitask demonstrations.
We argue that inverse dynamics modeling is well-suited to this setting.
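As a toy illustration of inverse dynamics modeling (the linear dynamics, dimensions, and variable names are assumptions for the sketch, not from the paper), the model regresses the action that connects two consecutive states:

```python
import numpy as np

rng = np.random.default_rng(1)
state_dim, act_dim, n = 6, 2, 500

# Synthetic demonstrations with linear toy dynamics: s_{t+1} = s_t + B @ a_t.
B = rng.normal(size=(state_dim, act_dim))
S = rng.normal(size=(n, state_dim))
A = rng.normal(size=(n, act_dim))
S_next = S + A @ B.T

# Inverse dynamics: regress the action from the (s_t, s_{t+1}) pair.
X = np.concatenate([S, S_next], axis=1)       # (n, 2 * state_dim)
W, *_ = np.linalg.lstsq(X, A, rcond=None)     # linear inverse model
A_hat = X @ W
mse = float(np.mean((A_hat - A) ** 2))
print(f"inverse-dynamics MSE: {mse:.6f}")
```

Because the toy dynamics are linear and invertible with respect to the action, the inverse model recovers the actions almost exactly; with image observations, the same objective instead shapes a learned encoder.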
arXiv Detail & Related papers (2023-05-26T14:40:46Z)
- Anticipating the Unseen Discrepancy for Vision and Language Navigation [63.399180481818405]
Vision-Language Navigation requires the agent to follow natural language instructions to reach a specific target.
The large discrepancy between seen and unseen environments makes it challenging for the agent to generalize well.
We propose Unseen Discrepancy Anticipating Vision and Language Navigation (DAVIS) that learns to generalize to unseen environments via encouraging test-time visual consistency.
arXiv Detail & Related papers (2022-09-10T19:04:40Z)
- Self-supervised Multi-view Stereo via Effective Co-Segmentation and Data-Augmentation [39.95831985522991]
We propose a framework integrated with more reliable supervision guided by semantic co-segmentation and data-augmentation.
Our proposed methods achieve the state-of-the-art performance among unsupervised methods, and even compete on par with supervised methods.
arXiv Detail & Related papers (2021-04-12T11:48:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.