Is Model Ensemble Necessary? Model-based RL via a Single Model with
Lipschitz Regularized Value Function
- URL: http://arxiv.org/abs/2302.01244v1
- Date: Thu, 2 Feb 2023 17:27:16 GMT
- Title: Is Model Ensemble Necessary? Model-based RL via a Single Model with
Lipschitz Regularized Value Function
- Authors: Ruijie Zheng, Xiyao Wang, Huazhe Xu, Furong Huang
- Abstract summary: Probabilistic dynamics model ensembles are widely used in existing model-based reinforcement learning methods.
We find that, for a value function, the stronger the Lipschitz condition is, the smaller the gap between the true dynamics- and learned dynamics-induced Bellman operators is.
- Score: 23.255250192599327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic dynamics model ensemble is widely used in existing model-based
reinforcement learning methods as it outperforms a single dynamics model in
both asymptotic performance and sample efficiency. In this paper, we provide
both practical and theoretical insights on the empirical success of the
probabilistic dynamics model ensemble through the lens of Lipschitz continuity.
We find that, for a value function, the stronger the Lipschitz condition is,
the smaller the gap between the true dynamics- and learned dynamics-induced
Bellman operators is, thus enabling the converged value function to be closer
to the optimal value function. Hence, we hypothesize that the key functionality
of the probabilistic dynamics model ensemble is to regularize the Lipschitz
condition of the value function using generated samples. To test this
hypothesis, we devise two practical robust training mechanisms through
computing the adversarial noise and regularizing the value network's spectral
norm to directly regularize the Lipschitz condition of the value functions.
Empirical results show that combined with our mechanisms, model-based RL
algorithms with a single dynamics model outperform those with an ensemble of
probabilistic dynamics models. These findings not only support the theoretical
insight, but also provide a practical solution for developing computationally
efficient model-based RL algorithms.
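A rough reading of the abstract's theoretical claim (a sketch, not the paper's exact theorem): if the value function V is L_V-Lipschitz, gamma is the discount factor, and P and \hat{P} denote the true and learned transition kernels, then by Kantorovich-Rubinstein duality

    |(\mathcal{T}V)(s,a) - (\hat{\mathcal{T}}V)(s,a)|
      = \gamma \left| \mathbb{E}_{s' \sim P(\cdot \mid s,a)}[V(s')] - \mathbb{E}_{s' \sim \hat{P}(\cdot \mid s,a)}[V(s')] \right|
      \le \gamma \, L_V \, W_1\!\left(P(\cdot \mid s,a), \hat{P}(\cdot \mid s,a)\right),

so for a fixed model error in Wasserstein-1 distance, a smaller Lipschitz constant L_V shrinks the gap between the true and learned dynamics-induced Bellman operators.
The two robust training mechanisms named in the abstract can be sketched as follows; this is an illustrative PyTorch snippet, not the authors' released code, and the network architecture, coefficients, and surrounding training loop are assumptions.

    import torch
    import torch.nn as nn

    class ValueNetwork(nn.Module):
        """Small MLP value function; the architecture is illustrative."""
        def __init__(self, state_dim, hidden_dim=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )

        def forward(self, s):
            return self.net(s)

    def spectral_norm_penalty(value_net):
        # Sum of squared spectral norms of the linear layers; penalizing this
        # shrinks an upper bound on the value network's Lipschitz constant.
        penalty = 0.0
        for m in value_net.modules():
            if isinstance(m, nn.Linear):
                penalty = penalty + torch.linalg.matrix_norm(m.weight, ord=2) ** 2
        return penalty

    def adversarial_states(value_net, states, eps=1e-2):
        # One gradient-ascent step on the value output to find a worst-case
        # perturbation inside an eps-ball around model-generated states.
        states = states.detach().requires_grad_(True)
        grad = torch.autograd.grad(value_net(states).sum(), states)[0]
        return (states + eps * grad.sign()).detach()

    # Illustrative use inside a critic update (lambda_sn, lambda_adv, bellman_loss,
    # and model_states are assumed to come from the surrounding training loop):
    #   s_adv = adversarial_states(value_net, model_states)
    #   smooth_loss = ((value_net(s_adv) - value_net(model_states)) ** 2).mean()
    #   critic_loss = bellman_loss + lambda_sn * spectral_norm_penalty(value_net) \
    #                 + lambda_adv * smooth_loss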
Related papers
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z) - Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z) - General multi-fidelity surrogate models: Framework and active learning
strategies for efficient rare event simulation [1.708673732699217]
Estimating the probability of failure for complex real-world systems is often prohibitively expensive.
This paper presents a robust multi-fidelity surrogate modeling strategy.
It is shown to be highly accurate while drastically reducing the number of high-fidelity model calls.
arXiv Detail & Related papers (2022-12-07T00:03:21Z) - Active Learning for Single Neuron Models with Lipschitz Non-Linearities [35.119032992898774]
We consider the problem of active learning for single neuron models.
We show that for a single neuron model with any Lipschitz non-linearity, strong provable approximation guarantees can be obtained.
arXiv Detail & Related papers (2022-10-24T20:55:21Z) - When to Update Your Model: Constrained Model-based Reinforcement
Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our derived bounds further reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - PSD Representations for Effective Probability Models [117.35298398434628]
We show that a recently proposed class of positive semi-definite (PSD) models for non-negative functions is particularly suited to this end.
We characterize both approximation and generalization capabilities of PSD models, showing that they enjoy strong theoretical guarantees.
Our results open the way to applications of PSD models to density estimation, decision theory and inference.
arXiv Detail & Related papers (2021-06-30T15:13:39Z) - A Variational Infinite Mixture for Probabilistic Inverse Dynamics
Learning [34.90240171916858]
We develop an efficient variational Bayes inference technique for infinite mixtures of probabilistic local models.
We highlight the model's power in combining data-driven adaptation, fast prediction and the ability to deal with discontinuous functions and heteroscedastic noise.
We use the learned models for online dynamics control of a Barrett-WAM manipulator, significantly improving the trajectory tracking performance.
arXiv Detail & Related papers (2020-11-10T16:15:13Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z) - Model Embedding Model-Based Reinforcement Learning [4.566180616886624]
Model-based reinforcement learning (MBRL) has shown its advantages in sample efficiency over model-free reinforcement learning (MFRL).
Despite the impressive results it achieves, it still faces a trade-off between the ease of data generation and model bias.
We propose a simple and elegant model-embedding model-based reinforcement learning (MEMB) algorithm in the framework of probabilistic reinforcement learning.
arXiv Detail & Related papers (2020-06-16T15:10:28Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of the stochasticity in its success is still unclear.
We show that multiplicative noise commonly arises in the parameters due to variance in the discrete stochastic updates, leading to heavy-tailed behavior.
A detailed analysis is conducted in which we describe how key factors, including step size and data, all exhibit similar effects on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.