DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with
Prototypical Representations
- URL: http://arxiv.org/abs/2110.14565v1
- Date: Wed, 27 Oct 2021 16:35:00 GMT
- Title: DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with
Prototypical Representations
- Authors: Fei Deng, Ingook Jang, Sungjin Ahn
- Abstract summary: Top-performing Model-Based Reinforcement Learning (MBRL) agents, such as Dreamer, learn the world model by reconstructing the image observations.
We propose to learn the prototypes from the recurrent states of the world model, thereby distilling temporal structures from past observations and actions into prototypes.
The resulting model, DreamerPro, successfully combines Dreamer with prototypes, achieving large performance gains on the DeepMind Control suite.
- Score: 18.770113681323906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Top-performing Model-Based Reinforcement Learning (MBRL) agents, such as
Dreamer, learn the world model by reconstructing the image observations. Hence,
they often fail to discard task-irrelevant details and struggle to handle
visual distractions. To address this issue, previous work has proposed to
contrastively learn the world model, but the performance tends to be inferior
in the absence of distractions. In this paper, we seek to enhance robustness to
distractions for MBRL agents. Specifically, we consider incorporating
prototypical representations, which have yielded more accurate and robust
results than contrastive approaches in computer vision. However, it remains
unclear how prototypical representations can benefit temporal dynamics learning
in MBRL, since they treat each image independently without capturing temporal
structures. To this end, we propose to learn the prototypes from the recurrent
states of the world model, thereby distilling temporal structures from past
observations and actions into the prototypes. The resulting model, DreamerPro,
successfully combines Dreamer with prototypes, achieving large performance gains
on the DeepMind Control suite both in the standard setting and when there are
complex background distractions. Code available at
https://github.com/fdeng18/dreamer-pro .
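To make the mechanism concrete, below is a minimal, SwAV-style sketch of the kind of swapped-prediction objective the abstract describes: the observation embedding and the world model's recurrent state are both scored against a shared set of learnable prototypes, and each view predicts the other's cluster assignment. All names, shapes, and hyperparameters here are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores, n_iters=3, eps=0.05):
    # Sinkhorn-Knopp normalization: soft cluster assignments that are
    # approximately balanced across prototypes (as in SwAV).
    q = torch.exp(scores / eps).t()              # (K, B)
    q /= q.sum()
    n_protos, n_samples = q.shape
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True) * n_protos    # normalize over samples
        q /= q.sum(dim=0, keepdim=True) * n_samples   # normalize over prototypes
    return (q * n_samples).t()                   # (B, K), rows sum to 1

def proto_loss(obs_emb, recurrent_state, prototypes, temp=0.1):
    # Swapped prediction: the assignment computed from one view serves as
    # the target for the softmax prediction of the other view.
    # `prototypes` is a learnable (K, D) matrix; both inputs are (B, D).
    protos = F.normalize(prototypes, dim=1)
    z_obs = F.normalize(obs_emb, dim=1) @ protos.t()            # (B, K)
    z_state = F.normalize(recurrent_state, dim=1) @ protos.t()  # (B, K)
    q_obs, q_state = sinkhorn(z_obs), sinkhorn(z_state)         # no-grad targets
    loss_obs = -(q_state * F.log_softmax(z_obs / temp, dim=1)).sum(1).mean()
    loss_state = -(q_obs * F.log_softmax(z_state / temp, dim=1)).sum(1).mean()
    return 0.5 * (loss_obs + loss_state)
```

Because the recurrent state summarizes past observations and actions, prototypes fit this way inherit temporal context that per-image prototypes lack, which is the point of the paper's design.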
Related papers
- MuDreamer: Learning Predictive World Models without Reconstruction [58.0159270859475]
We present MuDreamer, a robust reinforcement learning agent that builds upon the DreamerV3 algorithm by learning a predictive world model without the need to reconstruct input signals.
Our method achieves comparable performance on the Atari100k benchmark while benefiting from faster training.
arXiv Detail & Related papers (2024-05-23T22:09:01Z)
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning [67.62925151837675]
In this work, we frame the hallucination problem as an alignment issue and tackle it with preference tuning.
Specifically, we propose POVID to generate feedback data with AI models.
We use ground-truth instructions as the preferred response and a two-stage approach to generate dispreferred data.
In experiments across a broad set of benchmarks, we show that we not only reduce hallucinations but also improve performance on standard benchmarks, outperforming prior approaches.
arXiv Detail & Related papers (2024-02-18T00:56:16Z)
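To illustrate what "preference tuning" means operationally in the entry above, here is a minimal direct-preference-optimization (DPO) style loss over one preferred and one dispreferred response. This is a generic sketch under the assumption of a DPO-like objective; POVID's actual loss and data pipeline may differ, and all argument names are hypothetical.

```python
import torch.nn.functional as F

def dpo_loss(logp_pref, logp_dispref, ref_logp_pref, ref_logp_dispref, beta=0.1):
    # Sequence-level log-probabilities (summed token log-probs) of the
    # preferred/dispreferred responses under the policy (`logp_*`) and
    # under a frozen reference model (`ref_logp_*`).
    margin = (logp_pref - ref_logp_pref) - (logp_dispref - ref_logp_dispref)
    # Push the policy to prefer one response over the other.
    return -F.logsigmoid(beta * margin).mean()
```

In the setup the summary describes, the preferred responses would come from ground-truth instructions and the dispreferred ones from the two-stage generation process.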
- HarmonyDream: Task Harmonization Inside World Models [93.07314830304193]
Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning.
We propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization.
arXiv Detail & Related papers (2023-09-30T11:38:13Z)
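A small sketch of what "automatically adjusting loss coefficients" in the HarmonyDream entry can look like, written as learnable uncertainty-style weighting: each world-model loss term is scaled by a learned 1/sigma, with a regularizer that keeps any weight from collapsing to zero. The exact form of HarmonyDream's harmonizers may differ; the class and variable names below are assumptions.

```python
import torch
import torch.nn as nn

class LossHarmonizer(nn.Module):
    # Learnable per-term weights: total = sum_i L_i / sigma_i + log(1 + sigma_i).
    # The regularizer penalizes letting any sigma_i grow without bound,
    # so no single loss term can be silently ignored.
    def __init__(self, n_terms):
        super().__init__()
        self.log_sigma = nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses):  # losses: 1-D tensor of per-term losses
        sigma = self.log_sigma.exp()
        return (losses / sigma + torch.log1p(sigma)).sum()

# Hypothetical usage with three world-model losses:
# total = LossHarmonizer(3)(torch.stack([obs_loss, reward_loss, dyn_loss]))
```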
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a pre-trained model (PTM), tuning a target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Sense, Imagine, Act: Multimodal Perception Improves Model-Based Reinforcement Learning for Head-to-Head Autonomous Racing [10.309579267966361]
Model-based reinforcement learning (MBRL) techniques have recently yielded promising results for real-world autonomous racing.
This paper proposes a self-supervised sensor fusion technique that combines egocentric LiDAR and RGB camera observations collected from the F1TENTH Gym.
The resulting Dreamer agent safely avoided collisions and won more races than the other tested baselines in zero-shot head-to-head autonomous racing.
arXiv Detail & Related papers (2023-05-08T14:49:02Z)
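As a rough illustration of the two-modality setup in the racing entry above, here is a hypothetical fusion encoder with separate LiDAR and RGB branches whose features are concatenated into one embedding for the world model. The architecture, the 1080-beam scan size, and every layer choice are assumptions made for the sketch, not details from the paper.

```python
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    # Hypothetical two-branch encoder: a LiDAR scan (B, 1080) and an RGB
    # image (B, 3, H, W) are embedded separately and fused by a linear layer.
    def __init__(self, lidar_dim=1080, embed_dim=256):
        super().__init__()
        self.lidar_net = nn.Sequential(
            nn.Linear(lidar_dim, 256), nn.ELU(),
            nn.Linear(256, embed_dim))
        self.rgb_net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim))
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, lidar, rgb):
        feats = torch.cat([self.lidar_net(lidar), self.rgb_net(rgb)], dim=-1)
        return self.fuse(feats)  # joint embedding fed to the world model
```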
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z)
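A common way to realize a mutual-information term like the one in the entry above is a contrastive (InfoNCE) lower bound. The sketch below is that generic bound, not the paper's specific variational empowerment objective; the pairing of queries with keys is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def info_nce_bound(query, keys, temp=0.1):
    # InfoNCE lower bound on mutual information between paired variables:
    # row i of `query` (e.g. an encoding of state and action) should score
    # highest against row i of `keys` (e.g. the next latent state) among
    # all keys in the batch. Returns a quantity to *maximize*.
    q = F.normalize(query, dim=1)
    k = F.normalize(keys, dim=1)
    logits = q @ k.t() / temp                          # (B, B) similarity
    labels = torch.arange(q.size(0), device=q.device)  # positives on diagonal
    return -F.cross_entropy(logits, labels)
```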
- Recur, Attend or Convolve? Frame Dependency Modeling Matters for Cross-Domain Robustness in Action Recognition [0.5448283690603357]
Previous results have shown that 2D Convolutional Neural Networks (CNNs) tend to be biased toward texture rather than shape for various computer vision tasks.
This raises the suspicion that large video models learn spurious correlations rather than tracking relevant shapes over time.
We study the cross-domain robustness of recurrent, attention-based, and convolutional video models to investigate whether robustness is influenced by frame dependency modeling.
arXiv Detail & Related papers (2021-12-22T19:11:53Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration, and how some models that score lower on standard benchmarks perform as well as the best-performing models when trained on the same data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.