MuDreamer: Learning Predictive World Models without Reconstruction
- URL: http://arxiv.org/abs/2405.15083v1
- Date: Thu, 23 May 2024 22:09:01 GMT
- Title: MuDreamer: Learning Predictive World Models without Reconstruction
- Authors: Maxime Burchi, Radu Timofte
- Abstract summary: We present MuDreamer, a robust reinforcement learning agent that builds upon the DreamerV3 algorithm by learning a predictive world model without the need for reconstructing input signals.
Our method achieves comparable performance on the Atari100k benchmark while benefiting from faster training.
- Score: 58.0159270859475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The DreamerV3 agent recently demonstrated state-of-the-art performance in diverse domains, learning powerful world models in latent space using a pixel reconstruction loss. However, while the reconstruction loss is essential to Dreamer's performance, it also necessitates modeling unnecessary information. Consequently, Dreamer sometimes fails to perceive crucial elements which are necessary for task-solving when visual distractions are present in the observation, significantly limiting its potential. In this paper, we present MuDreamer, a robust reinforcement learning agent that builds upon the DreamerV3 algorithm by learning a predictive world model without the need for reconstructing input signals. Rather than relying on pixel reconstruction, hidden representations are instead learned by predicting the environment value function and previously selected actions. Similar to predictive self-supervised methods for images, we find that the use of batch normalization is crucial to prevent learning collapse. We also study the effect of KL balancing between model posterior and prior losses on convergence speed and learning stability. We evaluate MuDreamer on the commonly used DeepMind Visual Control Suite and demonstrate stronger robustness to visual distractions compared to DreamerV3 and other reconstruction-free approaches when the environment background is replaced with task-irrelevant real-world videos. Our method also achieves comparable performance on the Atari100k benchmark while benefiting from faster training.
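To make the abstract's training signal concrete, here is a minimal sketch of a reconstruction-free world-model loss in the spirit of MuDreamer: latent features are normalized with batch normalization, then asked to predict a value target and the previously selected action, with a KL-balanced posterior/prior term as in DreamerV2/V3. All module names, shapes, and the balance factor are hypothetical illustrations, not the authors' code.

```python
# Sketch (assumed names/shapes): reconstruction-free representation losses
# plus KL balancing, in the spirit of MuDreamer.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Normal, kl_divergence

class ReconstructionFreeHeads(nn.Module):
    def __init__(self, latent_dim: int, num_actions: int):
        super().__init__()
        # Batch normalization on latent features; the paper reports it is
        # crucial to prevent representational collapse.
        self.norm = nn.BatchNorm1d(latent_dim)
        self.value_head = nn.Linear(latent_dim, 1)              # predicts the value function
        self.action_head = nn.Linear(latent_dim, num_actions)   # predicts the previous action

    def forward(self, latent):
        h = self.norm(latent)
        return self.value_head(h).squeeze(-1), self.action_head(h)

def world_model_loss(heads, latent, value_target, prev_action,
                     post_mean, post_std, prior_mean, prior_std, alpha=0.8):
    value_pred, action_logits = heads(latent)
    # Representations are shaped by predicting value targets and previously
    # selected actions instead of reconstructing pixels.
    value_loss = F.mse_loss(value_pred, value_target)
    action_loss = F.cross_entropy(action_logits, prev_action)
    # KL balancing: pull the prior toward the (detached) posterior more
    # strongly than the reverse; alpha is a hyperparameter.
    post_sg = Normal(post_mean.detach(), post_std.detach())
    prior_sg = Normal(prior_mean.detach(), prior_std.detach())
    kl_prior = kl_divergence(post_sg, Normal(prior_mean, prior_std)).mean()
    kl_post = kl_divergence(Normal(post_mean, post_std), prior_sg).mean()
    kl_loss = alpha * kl_prior + (1 - alpha) * kl_post
    return value_loss + action_loss + kl_loss
```

Dropping the decoder removes the pressure to model task-irrelevant pixels (e.g., a distracting video background), which is the property the robustness experiments probe.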
Related papers
- CURLing the Dream: Contrastive Representations for World Modeling in Reinforcement Learning [0.22615818641180724]
Curled-Dreamer is a novel reinforcement learning algorithm that integrates contrastive learning into the DreamerV3 framework.
Our experiments demonstrate that Curled-Dreamer consistently outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2024-08-11T14:13:22Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
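The mixture-of-Gaussians dynamics prior mentioned above can be sketched as a small prior head that outputs mixture weights plus per-component means and scales; a minimal, hypothetical illustration (not the paper's implementation) follows.

```python
# Sketch (assumed names/shapes): a mixture-of-Gaussians latent dynamics prior.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.distributions import Categorical, Independent, MixtureSameFamily, Normal

class MixturePrior(nn.Module):
    def __init__(self, hidden_dim: int, latent_dim: int, num_components: int = 4):
        super().__init__()
        self.k, self.d = num_components, latent_dim
        self.logits = nn.Linear(hidden_dim, num_components)                 # mixture weights
        self.stats = nn.Linear(hidden_dim, 2 * num_components * latent_dim)  # means and scales

    def forward(self, h):
        stats = self.stats(h).view(-1, self.k, 2, self.d)
        mean = stats[:, :, 0]
        std = F.softplus(stats[:, :, 1]) + 1e-3  # keep scales positive
        mix = Categorical(logits=self.logits(h))
        comp = Independent(Normal(mean, std), 1)  # diagonal Gaussian per component
        return MixtureSameFamily(mix, comp)

# Usage: sample latent priors for a batch of deterministic states.
prior = MixturePrior(hidden_dim=64, latent_dim=16)
z = prior(torch.randn(8, 64)).sample()  # shape (8, 16)
```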
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- DreamingV2: Reinforcement Learning with Discrete World Models without Reconstruction [14.950054143767824]
The present paper proposes a novel reinforcement learning method with world models, DreamingV2.
DreamingV2 is a collaborative extension of DreamerV2 and Dreaming.
We believe DreamingV2 will be a reliable solution for robot learning since its discrete representation is suitable to describe discontinuous environments.
arXiv Detail & Related papers (2022-03-01T14:44:15Z)
- Robust Robotic Control from Pixels using Contrastive Recurrent State-Space Models [8.22669535053079]
We study how to learn world models in unconstrained environments over high-dimensional observation spaces such as images.
One source of difficulty is the presence of irrelevant but hard-to-model background distractions.
We learn a recurrent latent dynamics model which contrastively predicts the next observation.
This simple model leads to surprisingly robust robotic control even with simultaneous camera, background, and color distractions.
arXiv Detail & Related papers (2021-12-02T12:15:25Z)
- DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations [18.770113681323906]
Top-performing Model-Based Reinforcement Learning (MBRL) agents, such as Dreamer, learn the world model by reconstructing the image observations.
We propose to learn the prototypes from the recurrent states of the world model, thereby distilling temporal structures from past observations and actions into prototypes.
The resulting model, DreamerPro, successfully combines Dreamer with prototypes, making large performance gains on the DeepMind Control suite.
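Prototype-based learning over recurrent states can be sketched as a SwAV-style swapped-prediction loss: two views of the same timestep are scored against shared learnable prototypes and each view predicts the other's assignment. Names and the simplified targets below are hypothetical, not DreamerPro's exact code.

```python
# Sketch (assumed names/shapes): swapped prototype prediction over two views
# of a recurrent state (e.g., posterior vs. dynamics-predicted state).
import torch
import torch.nn.functional as F

def swapped_prediction_loss(z1, z2, prototypes, temperature=0.1):
    # z1, z2: (batch, dim) state views; prototypes: (K, dim) learnable vectors.
    p = F.normalize(prototypes, dim=-1)
    s1 = F.normalize(z1, dim=-1) @ p.t() / temperature  # cosine scores to prototypes
    s2 = F.normalize(z2, dim=-1) @ p.t() / temperature
    # Each view predicts the other's (detached) soft assignment; a full
    # implementation would compute assignments with Sinkhorn-Knopp to
    # avoid collapse onto a single prototype.
    q1 = F.softmax(s1.detach(), dim=-1)
    q2 = F.softmax(s2.detach(), dim=-1)
    return (-(q2 * F.log_softmax(s1, dim=-1)).sum(-1).mean()
            - (q1 * F.log_softmax(s2, dim=-1)).sum(-1).mean())
```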
arXiv Detail & Related papers (2021-10-27T16:35:00Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
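A generic FGSM-style sketch of the kind of imperceptible additive perturbation described above; `stereo_net` and the single-step attack are hypothetical stand-ins, not the paper's optimization.

```python
# Sketch (assumed model interface): one-step adversarial perturbation of the
# left image of a stereo pair to shift the predicted disparity map.
import torch
import torch.nn.functional as F

def fgsm_perturb(stereo_net, left, right, target_disparity, epsilon=2 / 255):
    left = left.clone().requires_grad_(True)
    loss = F.l1_loss(stereo_net(left, right), target_disparity)
    loss.backward()
    # A small step in the gradient's sign direction stays visually
    # imperceptible yet can significantly alter the disparity output.
    return (left + epsilon * left.grad.sign()).clamp(0, 1).detach()
```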
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present deep neural network methodology to reconstruct the 3d pose and shape of people, given an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3d human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
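A learned-optimizer refinement loop of this flavor can be sketched as a recurrent network that proposes parameter updates directly, so training never differentiates through an inner gradient step; the modules below are hypothetical, not the HUND architecture.

```python
# Sketch (assumed names/shapes): a recurrent update network refining pose/shape
# parameters from image features, avoiding second-order gradients.
import torch
import torch.nn as nn

class UpdateNet(nn.Module):
    def __init__(self, param_dim: int, feat_dim: int, hidden: int = 256):
        super().__init__()
        self.hidden = hidden
        self.rnn = nn.GRUCell(param_dim + feat_dim, hidden)
        self.delta = nn.Linear(hidden, param_dim)

    def forward(self, params, feats, h):
        h = self.rnn(torch.cat([params, feats], dim=-1), h)
        return params + self.delta(h), h  # propose refined parameters

def refine(update_net, image_feats, init_params, steps=5):
    params = init_params
    h = torch.zeros(init_params.shape[0], update_net.hidden)
    for _ in range(steps):  # iterative refinement instead of gradient descent
        params, h = update_net(params, image_feats, h)
    return params
```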
arXiv Detail & Related papers (2020-08-16T13:38:41Z)
- Dreaming: Model-based Reinforcement Learning by Latent Imagination without Reconstruction [14.950054143767824]
We propose a decoder-free extension of Dreamer, a leading model-based reinforcement learning (MBRL) method from pixels.
We derive a likelihood-free and InfoMax objective of contrastive learning from the evidence lower bound of Dreamer.
In comparison to Dreamer and other recent model-free reinforcement learning methods, our newly devised Dreamer with InfoMax and without generative decoder (Dreaming) achieves the best scores on 5 difficult simulated robotics tasks.
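The likelihood-free contrastive objective mentioned above is typically an InfoNCE-style loss: each latent must identify its matching observation embedding among in-batch negatives. A minimal sketch with hypothetical shapes (not the authors' derivation):

```python
# Sketch (assumed names/shapes): InfoNCE between latent states and
# observation embeddings, replacing a generative decoder.
import torch
import torch.nn.functional as F

def info_nce(latents, obs_embeddings, temperature=0.1):
    # latents, obs_embeddings: (batch, dim); row i of each forms a positive pair.
    z = F.normalize(latents, dim=-1)
    e = F.normalize(obs_embeddings, dim=-1)
    logits = z @ e.t() / temperature
    # Classify the true observation among all others in the batch.
    labels = torch.arange(z.shape[0], device=z.device)
    return F.cross_entropy(logits, labels)
```

Maximizing this objective lower-bounds the mutual information between latents and observations, which is how the decoder-free "InfoMax" view connects to Dreamer's evidence lower bound.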
arXiv Detail & Related papers (2020-07-29T00:14:40Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
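The rectification pre-processing step can be sketched as warping each frame by the homography that undoes its estimated rotation; `K` and `R` below are assumed camera intrinsics and relative rotation, not the paper's pipeline.

```python
# Sketch (assumed inputs): remove a pure camera rotation from an image via
# the homography H = K @ R^-1 @ K^-1, so consecutive training frames differ
# mostly by translation.
import cv2
import numpy as np

def remove_rotation(image, K, R):
    H = K @ np.linalg.inv(R) @ np.linalg.inv(K)
    return cv2.warpPerspective(image, H, (image.shape[1], image.shape[0]))
```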
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
- Mutual Information Maximization for Robust Plannable Representations [82.83676853746742]
We present MIRO, an information theoretic representational learning algorithm for model-based reinforcement learning.
We show that our approach is more robust than reconstruction objectives in the presence of distractors and cluttered scenes.
arXiv Detail & Related papers (2020-05-16T21:58:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.