Predictive Experience Replay for Continual Visual Control and
Forecasting
- URL: http://arxiv.org/abs/2303.06572v1
- Date: Sun, 12 Mar 2023 05:08:03 GMT
- Title: Predictive Experience Replay for Continual Visual Control and
Forecasting
- Authors: Wendong Zhang, Geng Chen, Xiangming Zhu, Siyu Gao, Yunbo Wang,
Xiaokang Yang
- Abstract summary: We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model remarkably outperforms the naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning physical dynamics in a series of non-stationary environments is a
challenging but essential task for model-based reinforcement learning (MBRL)
with visual inputs. It requires the agent to consistently adapt to novel tasks
without forgetting previous knowledge. In this paper, we present a new
continual learning approach for visual dynamics modeling and explore its
efficacy in visual control and forecasting. The key assumption is that an ideal
world model can provide a non-forgetting environment simulator, which enables
the agent to optimize the policy in a multi-task learning manner based on the
imagined trajectories from the world model. To this end, we first propose the
mixture world model that learns task-specific dynamics priors with a mixture of
Gaussians, and then introduce a new training strategy to overcome catastrophic
forgetting, which we call predictive experience replay. Finally, we extend
these methods to continual RL and further address the value estimation problems
with the exploratory-conservative behavior learning approach. Our model
remarkably outperforms the naive combinations of existing continual learning
and visual RL algorithms on DeepMind Control and Meta-World benchmarks with
continual visual control tasks. It is also shown to effectively alleviate the
forgetting of spatiotemporal dynamics in video prediction datasets with
evolving domains.
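
To make the two ideas named in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of (a) a mixture-of-Gaussians prior with one learned component per task and (b) model-generated rehearsal in the spirit of predictive experience replay. All class and function names, shapes, and the stand-in dynamics are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class MixturePrior(nn.Module):
    """One learned Gaussian prior over latent states per task
    (a mixture of Gaussians across tasks)."""
    def __init__(self, num_tasks: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_tasks, latent_dim))
        self.log_std = nn.Parameter(torch.zeros(num_tasks, latent_dim))

    def forward(self, task_id: torch.Tensor) -> torch.distributions.Normal:
        # task_id: (batch,) long tensor indexing the task-specific component.
        return torch.distributions.Normal(self.mu[task_id],
                                          self.log_std[task_id].exp())

def replay_batch(dynamics, prior, past_task_ids, horizon: int = 10):
    """Rehearsal step: roll the learned latent dynamics forward from
    task-specific priors to produce imagined data for earlier tasks,
    to be mixed with real data from the current task."""
    z = prior(past_task_ids).rsample()      # initial latents, one per row
    imagined = []
    for _ in range(horizon):
        z = dynamics(z)                     # one-step latent transition
        imagined.append(z)
    return torch.stack(imagined)            # (horizon, batch, latent_dim)

# Illustrative use: rehearse tasks 0 and 1 while training on task 2.
prior = MixturePrior(num_tasks=3, latent_dim=16)
dynamics = nn.Linear(16, 16)                # stand-in one-step dynamics
imagined = replay_batch(dynamics, prior, torch.tensor([0, 1]))
```

Rehearsing from the model rather than a raw data buffer is what lets the agent revisit old tasks without storing their full visual experience.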
Related papers
- ReCoRe: Regularized Contrastive Representation Learning of World Model [21.29132219042405]
We present a world model that learns invariant features using contrastive unsupervised learning and an intervention-invariant regularizer.
Our method outperforms current state-of-the-art model-based and model-free RL methods and significantly improves on out-of-distribution point navigation tasks evaluated on the iGibson benchmark.
arXiv Detail & Related papers (2023-12-14T15:53:07Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Dream to Explore: Adaptive Simulations for Autonomous Systems [3.0664963196464448]
We tackle the problem of learning to control dynamical systems by applying Bayesian nonparametric methods.
By employing Gaussian processes to discover latent world dynamics, we mitigate common data efficiency issues observed in reinforcement learning.
Our algorithm jointly learns a world model and policy by optimizing a variational lower bound on the log-likelihood.
arXiv Detail & Related papers (2021-10-27T04:27:28Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solved prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models [40.08137765886609]
We show that our model, called a graph structured surrogate model (GSSM), outperforms state-of-the-art methods in predicting environment dynamics.
Our approach is able to obtain high returns, while allowing fast execution during deployment by avoiding test time policy gradient optimization.
arXiv Detail & Related papers (2021-02-16T17:21:55Z)
- Planning from Pixels using Inverse Dynamics Models [44.16528631970381]
We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion.
We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
arXiv Detail & Related papers (2020-12-04T06:07:36Z)
- Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning [137.39196753245105]
We present a new model-based reinforcement learning algorithm that learns a multi-headed dynamics model for dynamics generalization.
We incorporate context learning, which encodes dynamics-specific information from past experiences into the context latent vector.
Our method exhibits superior zero-shot generalization performance across a variety of control tasks, compared to state-of-the-art RL methods.
arXiv Detail & Related papers (2020-10-26T03:20:42Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task-relevant information, enabling the model to be aware of the current task and encouraging it to model only the relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning [124.9856253431878]
We decompose the task of learning a global dynamics model into two stages: (a) learning a context latent vector that captures the local dynamics, then (b) predicting the next state conditioned on it.
In order to encode dynamics-specific information into the context latent vector, we introduce a novel loss function that encourages the context latent vector to be useful for predicting both forward and backward dynamics.
The proposed method achieves superior generalization ability across various simulated robotics and control tasks, compared to existing RL schemes.
arXiv Detail & Related papers (2020-05-14T08:10:54Z)
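
The two-stage recipe in the last entry above (encode a context latent from recent transitions, then predict conditioned on it, with a forward-and-backward prediction loss) can be sketched compactly. The module layout, dimensions, and names below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ContextAwareDynamics(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, ctx_dim: int = 32):
        super().__init__()
        step = state_dim + action_dim
        # Stage (a): summarize the K most recent transitions into a context.
        self.context_enc = nn.GRU(step, ctx_dim, batch_first=True)
        # Stage (b): predict next / previous state conditioned on the context.
        self.forward_head = nn.Linear(step + ctx_dim, state_dim)
        self.backward_head = nn.Linear(step + ctx_dim, state_dim)

    def forward(self, past, s_t, a_t):
        # past: (batch, K, state_dim + action_dim) recent transitions.
        _, h = self.context_enc(past)
        c = h[-1]                               # (batch, ctx_dim) context
        x = torch.cat([s_t, a_t, c], dim=-1)
        return self.forward_head(x), self.backward_head(x)

def loss_fn(model, past, s_prev, s_t, a_t, s_next):
    # The backward term pushes the context latent to carry
    # dynamics-specific information usable in both temporal directions.
    s_next_hat, s_prev_hat = model(past, s_t, a_t)
    mse = nn.functional.mse_loss
    return mse(s_next_hat, s_next) + mse(s_prev_hat, s_prev)
```

Separating the slowly varying context from the per-step prediction is what lets one model generalize across environments with different local dynamics.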