Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes
- URL: http://arxiv.org/abs/2006.11441v3
- Date: Mon, 30 Nov 2020 17:06:14 GMT
- Title: Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes
- Authors: Mengdi Xu, Wenhao Ding, Jiacheng Zhu, Zuxin Liu, Baiming Chen, Ding Zhao
- Abstract summary: This paper proposes a continual online model-based reinforcement learning approach.
It does not require pre-training to solve task-agnostic problems with unknown task boundaries.
In experiments, our approach outperforms alternative methods in non-stationary tasks.
- Score: 25.513074215377696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuously learning to solve unseen tasks with limited experience has been
extensively pursued in meta-learning and continual learning, but with
restricted assumptions such as accessible task distributions, independently and
identically distributed tasks, and clear task delineations. However, real-world
physical tasks frequently violate these assumptions, resulting in performance
degradation. This paper proposes a continual online model-based reinforcement
learning approach that does not require pre-training to solve task-agnostic
problems with unknown task boundaries. We maintain a mixture of experts to
handle nonstationarity, and represent each different type of dynamics with a
Gaussian Process to efficiently leverage collected data and expressively model
uncertainty. We propose a transition prior to account for the temporal
dependencies in streaming data and update the mixture online via sequential
variational inference. Our approach reliably handles the task distribution
shift by generating new models for never-before-seen dynamics and reusing old
models for previously seen dynamics. In experiments, our approach outperforms
alternative methods in non-stationary tasks, including classic control with
changing dynamics and decision making in different driving scenarios.
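To make the mechanism concrete, here is a minimal sketch of an online infinite mixture of Gaussian Process experts. It is an illustration under simplifying assumptions, not the authors' implementation: hard MAP assignments replace the paper's sequential variational inference, a sticky CRP-style prior stands in for the transition prior, targets are assumed scalar, and all hyperparameters (alpha, stickiness, base-measure variance) are placeholders.

```python
# Sketch of an online infinite mixture of GP dynamics experts.
# NOT the authors' implementation: hard MAP assignments replace
# sequential variational inference, and a sticky CRP-style prior
# stands in for the transition prior over experts.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

class InfiniteGPMixture:
    def __init__(self, alpha=1.0, stickiness=5.0, noise=1e-2):
        self.alpha = alpha            # CRP concentration: propensity to spawn experts
        self.stickiness = stickiness  # extra prior mass on the previous expert
        self.noise = noise            # observation-noise floor
        self.experts = []             # list of (X, y, fitted_gp) per dynamics mode
        self.prev_k = None            # expert assigned at the previous step

    def _log_pred(self, k, x, y):
        # Log predictive density of (x, y) under expert k's GP posterior.
        _, _, gp = self.experts[k]
        mu, std = gp.predict(x[None, :], return_std=True)
        var = std[0] ** 2 + self.noise
        return -0.5 * (np.log(2 * np.pi * var) + (y - mu[0]) ** 2 / var)

    def update(self, x, y):
        """x: (state, action) feature vector; y: scalar next-state target."""
        # Prior over experts: data counts, stickiness on the previous
        # expert (temporal dependence), and alpha for a brand-new expert.
        counts = np.array([e[0].shape[0] for e in self.experts] + [self.alpha],
                          dtype=float)
        if self.prev_k is not None:
            counts[self.prev_k] += self.stickiness
        log_prior = np.log(counts / counts.sum())
        # A crude broad-Gaussian base measure scores the "new expert" option.
        log_lik = np.array(
            [self._log_pred(k, x, y) for k in range(len(self.experts))]
            + [-0.5 * (np.log(2 * np.pi * 10.0) + y ** 2 / 10.0)])
        k = int(np.argmax(log_prior + log_lik))  # MAP assignment
        if k == len(self.experts):               # never-before-seen dynamics
            self.experts.append((np.empty((0, x.size)), np.empty(0), None))
        X, Y, _ = self.experts[k]
        X, Y = np.vstack([X, x]), np.append(Y, y)
        # Refitting from scratch is O(n^3); real systems update incrementally.
        gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                      alpha=1e-6).fit(X, Y)
        self.experts[k] = (X, Y, gp)
        self.prev_k = k
        return k
```

On a stream that alternates between two dynamics regimes, this sketch should spawn a second expert at the first switch and reassign points back to the first expert when the original regime recurs, mirroring the model-reuse behavior described in the abstract.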
Related papers
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Self-Supervised Reinforcement Learning that Transfers using Random Features [41.00256493388967]
We propose a self-supervised reinforcement learning method that enables the transfer of behaviors across tasks with different rewards.
Our method is self-supervised in that it can be trained on offline datasets without reward labels, but can then be quickly deployed on new tasks.
arXiv Detail & Related papers (2023-05-26T20:37:06Z)
- Predictive Experience Replay for Continual Visual Control and Forecasting [62.06183102362871]
We present a new continual learning approach for visual dynamics modeling and explore its efficacy in visual control and forecasting.
We first propose the mixture world model that learns task-specific dynamics priors with a mixture of Gaussians, and then introduce a new training strategy to overcome catastrophic forgetting.
Our model markedly outperforms naive combinations of existing continual learning and visual RL algorithms on DeepMind Control and Meta-World benchmarks with continual visual control tasks.
arXiv Detail & Related papers (2023-03-12T05:08:03Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning [11.076005074172516]
Reinforcement learning algorithms can easily encounter catastrophic forgetting or interference when faced with lifelong streaming information.
We propose a scalable lifelong RL method that dynamically expands the network capacity to accommodate new knowledge.
We show that our method successfully facilitates scalable lifelong RL and outperforms relevant existing methods.
arXiv Detail & Related papers (2022-05-22T09:48:41Z)
- Mixture of basis for interpretable continual learning with distribution shifts [1.6114012813668934]
Continual learning in environments with shifting data distributions is a challenging problem with several real-world applications.
We propose a novel approach called Mixture of Basis models (MoB) to address this problem setting.
arXiv Detail & Related papers (2022-01-05T22:53:15Z)
- Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)
- Online Constrained Model-based Reinforcement Learning [13.362455603441552]
A key requirement is the ability to handle continuous state and action spaces while remaining within a limited time and resource budget.
We propose a model-based approach that combines Gaussian Process regression and Receding Horizon Control (a minimal sketch appears after this list).
We test our approach on a cart pole swing-up environment and demonstrate the benefits of online learning on an autonomous racing task.
arXiv Detail & Related papers (2020-04-07T15:51:34Z)
- A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning [48.87397222244402]
We propose an expansion-based approach for task-free continual learning.
Our model successfully performs task-free continual learning for both discriminative and generative tasks.
arXiv Detail & Related papers (2020-01-03T02:07:31Z)
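As referenced in the "Online Constrained Model-based Reinforcement Learning" entry above, that paper combines Gaussian Process regression with Receding Horizon Control. Below is a minimal, hedged sketch of that combination using random-shooting planning over a learned GP dynamics model; the horizon, candidate count, action bounds, 1-D action, and quadratic cost are illustrative assumptions, not the paper's settings, and its constraint handling is omitted.

```python
# Sketch of receding horizon control over a GP dynamics model.
# Random shooting stands in for the paper's planner; constraints
# are not modeled, and only the GP's mean prediction is rolled out.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_dynamics(states, actions, next_states):
    """Fit a GP mapping [state, action] features to the next state."""
    X = np.hstack([states, actions])
    return GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                    alpha=1e-6).fit(X, next_states)

def plan(gp, state, horizon=10, n_candidates=128,
         a_low=-1.0, a_high=1.0, cost=lambda s: float(np.sum(s ** 2))):
    """Return the first action of the lowest-cost random action sequence.

    Assumes a 1-D action for brevity; a fuller treatment would also
    propagate the GP's predictive uncertainty through the horizon.
    """
    best_action, best_cost = 0.0, np.inf
    for _ in range(n_candidates):
        seq = np.random.uniform(a_low, a_high, size=(horizon, 1))
        s, total = np.asarray(state, dtype=float), 0.0
        for a in seq:
            x = np.concatenate([s, a])[None, :]
            s = gp.predict(x)[0]       # mean next-state prediction
            total += cost(s)
        if total < best_cost:
            best_action, best_cost = float(seq[0, 0]), total
    return best_action
```

At each environment step one would fit (or update) the GP on the transitions observed so far, execute plan(gp, state), and replan after observing the next state, which is the receding-horizon loop.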
This list is automatically generated from the titles and abstracts of the papers on this site.