DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with
Prototypical Representations
- URL: http://arxiv.org/abs/2110.14565v1
- Date: Wed, 27 Oct 2021 16:35:00 GMT
- Title: DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with
Prototypical Representations
- Authors: Fei Deng, Ingook Jang, Sungjin Ahn
- Abstract summary: Top-performing Model-Based Reinforcement Learning (MBRL) agents, such as Dreamer, learn the world model by reconstructing the image observations.
We propose to learn the prototypes from the recurrent states of the world model, thereby distilling temporal structures from past observations and actions into prototypes.
The resulting model, DreamerPro, successfully combines Dreamer with prototypes, achieving large performance gains on the DeepMind Control suite.
- Score: 18.770113681323906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Top-performing Model-Based Reinforcement Learning (MBRL) agents, such as
Dreamer, learn the world model by reconstructing the image observations. Hence,
they often fail to discard task-irrelevant details and struggle to handle
visual distractions. To address this issue, previous work has proposed to
contrastively learn the world model, but the performance tends to be inferior
in the absence of distractions. In this paper, we seek to enhance robustness to
distractions for MBRL agents. Specifically, we consider incorporating
prototypical representations, which have yielded more accurate and robust
results than contrastive approaches in computer vision. However, it remains
unclear how prototypical representations can benefit temporal dynamics learning
in MBRL, since they treat each image independently without capturing temporal
structures. To this end, we propose to learn the prototypes from the recurrent
states of the world model, thereby distilling temporal structures from past
observations and actions into the prototypes. The resulting model, DreamerPro,
successfully combines Dreamer with prototypes, achieving large performance gains
on the DeepMind Control suite both in the standard setting and when there are
complex background distractions. Code available at
https://github.com/fdeng18/dreamer-pro .
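To make the mechanism concrete, below is a minimal, SwAV-style sketch of the kind of swapped-prediction objective the abstract describes: the observation embedding and the world model's recurrent state are both scored against a shared set of learnable prototypes, and each view predicts the other's cluster assignment. All names, shapes, and hyperparameters here are illustrative assumptions, not the authors' exact implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores, n_iters=3, eps=0.05):
    # Sinkhorn-Knopp normalization: soft cluster assignments that are
    # approximately balanced across prototypes (as in SwAV).
    q = torch.exp(scores / eps).t()              # (K, B)
    q /= q.sum()
    n_protos, n_samples = q.shape
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True) * n_protos    # normalize over samples
        q /= q.sum(dim=0, keepdim=True) * n_samples   # normalize over prototypes
    return (q * n_samples).t()                   # (B, K), rows sum to 1

def proto_loss(obs_emb, recurrent_state, prototypes, temp=0.1):
    # Swapped prediction: the assignment computed from one view serves as
    # the target for the softmax prediction of the other view.
    # `prototypes` is a learnable (K, D) matrix; both inputs are (B, D).
    protos = F.normalize(prototypes, dim=1)
    z_obs = F.normalize(obs_emb, dim=1) @ protos.t()            # (B, K)
    z_state = F.normalize(recurrent_state, dim=1) @ protos.t()  # (B, K)
    q_obs, q_state = sinkhorn(z_obs), sinkhorn(z_state)         # no-grad targets
    loss_obs = -(q_state * F.log_softmax(z_obs / temp, dim=1)).sum(1).mean()
    loss_state = -(q_obs * F.log_softmax(z_state / temp, dim=1)).sum(1).mean()
    return 0.5 * (loss_obs + loss_state)
```

Because the recurrent state summarizes past observations and actions, prototypes fit this way inherit temporal context that per-image prototypes lack, which is the point of the paper's design.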
Related papers
- MuDreamer: Learning Predictive World Models without Reconstruction [58.0159270859475]
We present MuDreamer, a robust reinforcement learning agent that builds upon the DreamerV3 algorithm by learning a predictive world model without the need to reconstruct input signals.
Our method achieves comparable performance on the Atari100k benchmark while benefiting from faster training.
arXiv Detail & Related papers (2024-05-23T22:09:01Z)
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning [67.62925151837675]
In this work, we frame the hallucination problem as an alignment issue and tackle it with preference tuning.
Specifically, we propose POVID to generate feedback data with AI models.
We use ground-truth instructions as the preferred response and a two-stage approach to generate dispreferred data.
In experiments across a broad set of benchmarks, we show that we not only reduce hallucinations but also improve performance on standard benchmarks, outperforming prior approaches.
arXiv Detail & Related papers (2024-02-18T00:56:16Z)
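To illustrate what "preference tuning" means operationally in the entry above, here is a minimal direct-preference-optimization (DPO) style loss over one preferred and one dispreferred response. This is a generic sketch under the assumption of a DPO-like objective; POVID's actual loss and data pipeline may differ, and all argument names are hypothetical.

```python
import torch.nn.functional as F

def dpo_loss(logp_pref, logp_dispref, ref_logp_pref, ref_logp_dispref, beta=0.1):
    # Sequence-level log-probabilities (summed token log-probs) of the
    # preferred/dispreferred responses under the policy (`logp_*`) and
    # under a frozen reference model (`ref_logp_*`).
    margin = (logp_pref - ref_logp_pref) - (logp_dispref - ref_logp_dispref)
    # Push the policy to prefer one response over the other.
    return -F.logsigmoid(beta * margin).mean()
```

In the setup the summary describes, the preferred responses would come from ground-truth instructions and the dispreferred ones from the two-stage generation process.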
- HarmonyDream: Task Harmonization Inside World Models [93.07314830304193]
Model-based reinforcement learning (MBRL) holds the promise of sample-efficient learning.
We propose a simple yet effective approach, HarmonyDream, which automatically adjusts loss coefficients to maintain task harmonization.
arXiv Detail & Related papers (2023-09-30T11:38:13Z)
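A small sketch of what "automatically adjusting loss coefficients" in the HarmonyDream entry can look like, written as learnable uncertainty-style weighting: each world-model loss term is scaled by a learned 1/sigma, with a regularizer that keeps any weight from collapsing to zero. The exact form of HarmonyDream's harmonizers may differ; the class and variable names below are assumptions.

```python
import torch
import torch.nn as nn

class LossHarmonizer(nn.Module):
    # Learnable per-term weights: total = sum_i L_i / sigma_i + log(1 + sigma_i).
    # The regularizer penalizes letting any sigma_i grow without bound,
    # so no single loss term can be silently ignored.
    def __init__(self, n_terms):
        super().__init__()
        self.log_sigma = nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses):  # losses: 1-D tensor of per-term losses
        sigma = self.log_sigma.exp()
        return (losses / sigma + torch.log1p(sigma)).sum()

# Hypothetical usage with three world-model losses:
# total = LossHarmonizer(3)(torch.stack([obs_loss, reward_loss, dyn_loss]))
```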
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse [59.500060790983994]
This paper introduces ZhiJian, a comprehensive and user-friendly toolbox for model reuse, utilizing the PyTorch backend.
ZhiJian presents a novel paradigm that unifies diverse perspectives on model reuse, encompassing target architecture construction with a pre-trained model (PTM), tuning a target model with a PTM, and PTM-based inference.
arXiv Detail & Related papers (2023-08-17T19:12:13Z)
- Sense, Imagine, Act: Multimodal Perception Improves Model-Based Reinforcement Learning for Head-to-Head Autonomous Racing [10.309579267966361]
Model-based reinforcement learning (MBRL) techniques have recently yielded promising results for real-world autonomous racing.
This paper proposes a self-supervised sensor fusion technique that combines egocentric LiDAR and RGB camera observations collected from the F1TENTH Gym.
The resulting Dreamer agent safely avoided collisions and won more races than the other tested baselines in zero-shot head-to-head autonomous racing.
arXiv Detail & Related papers (2023-05-08T14:49:02Z)
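As a rough illustration of the two-modality setup in the racing entry above, here is a hypothetical fusion encoder with separate LiDAR and RGB branches whose features are concatenated into one embedding for the world model. The architecture, the 1080-beam scan size, and every layer choice are assumptions made for the sketch, not details from the paper.

```python
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    # Hypothetical two-branch encoder: a LiDAR scan (B, 1080) and an RGB
    # image (B, 3, H, W) are embedded separately and fused by a linear layer.
    def __init__(self, lidar_dim=1080, embed_dim=256):
        super().__init__()
        self.lidar_net = nn.Sequential(
            nn.Linear(lidar_dim, 256), nn.ELU(),
            nn.Linear(256, embed_dim))
        self.rgb_net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ELU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim))
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, lidar, rgb):
        feats = torch.cat([self.lidar_net(lidar), self.rgb_net(rgb)], dim=-1)
        return self.fuse(feats)  # joint embedding fed to the world model
```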
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z)
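A common way to realize a mutual-information term like the one in the entry above is a contrastive (InfoNCE) lower bound. The sketch below is that generic bound, not the paper's specific variational empowerment objective; the pairing of queries with keys is an assumption for illustration.

```python
import torch
import torch.nn.functional as F

def info_nce_bound(query, keys, temp=0.1):
    # InfoNCE lower bound on mutual information between paired variables:
    # row i of `query` (e.g. an encoding of state and action) should score
    # highest against row i of `keys` (e.g. the next latent state) among
    # all keys in the batch. Returns a quantity to *maximize*.
    q = F.normalize(query, dim=1)
    k = F.normalize(keys, dim=1)
    logits = q @ k.t() / temp                          # (B, B) similarity
    labels = torch.arange(q.size(0), device=q.device)  # positives on diagonal
    return -F.cross_entropy(logits, labels)
```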
- Recur, Attend or Convolve? Frame Dependency Modeling Matters for Cross-Domain Robustness in Action Recognition [0.5448283690603357]
Previous results have shown that 2D Convolutional Neural Networks (CNNs) tend to be biased toward texture rather than shape for various computer vision tasks.
This raises the suspicion that large video models learn spurious correlations rather than tracking relevant shapes over time.
We study the cross-domain robustness of recurrent, attention-based, and convolutional video models to investigate whether robustness is influenced by frame dependency modeling.
arXiv Detail & Related papers (2021-12-22T19:11:53Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration, and how some models that score lower on standard benchmarks perform as well as the best-performing models when trained on the same data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.