Which Model To Trust: Assessing the Influence of Models on the
Performance of Reinforcement Learning Algorithms for Continuous Control Tasks
- URL: http://arxiv.org/abs/2110.13079v1
- Date: Mon, 25 Oct 2021 16:17:26 GMT
- Title: Which Model To Trust: Assessing the Influence of Models on the
Performance of Reinforcement Learning Algorithms for Continuous Control Tasks
- Authors: Giacomo Arcieri, David Wölfle, Eleni Chatzi
- Abstract summary: It is not clear how much of the recent progress is due to improved algorithms or due to improved models.
A set of commonly adopted models is established for the purpose of model comparison.
Results reveal that significant differences in model performance do exist.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The need for algorithms able to solve Reinforcement Learning (RL) problems
with few trials has motivated the advent of model-based RL methods. The
reported performance of model-based algorithms has dramatically increased
within recent years. However, it is not clear how much of the recent progress
is due to improved algorithms or due to improved models. While different
modeling options are available to choose from when applying a model-based
approach, the distinguishing traits and particular strengths of different
models are not clear. The main contribution of this work lies precisely in
assessing the model influence on the performance of RL algorithms. A set of
commonly adopted models is established for the purpose of model comparison.
These include Neural Networks (NNs), ensembles of NNs, two different
approximations of Bayesian NNs (BNNs), that is, the Concrete Dropout NN and the
Anchored Ensembling, and Gaussian Processes (GPs). The model comparison is
evaluated on a suite of continuous control benchmarking tasks. Our results
reveal that significant differences in model performance do exist. The Concrete
Dropout NN reports persistently superior performance. We summarize these
differences for the benefit of the modeler and suggest that the model choice be
tailored to the standards required by each specific application.
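The uncertainty signal that distinguishes these model classes can be illustrated with a minimal sketch. The following is not the paper's implementation: it replaces full neural networks with ridge regression on random Fourier features for brevity, and uses a bootstrap ensemble (rather than Concrete Dropout or Anchored Ensembling) as the simplest representative of the ensemble idea. The toy data, feature count, and ensemble size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "dynamics" data: next_state = sin(state) + observation noise.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

# Random Fourier features stand in for a learned hidden layer.
W = rng.standard_normal((1, 50))
b = rng.uniform(0, 2 * np.pi, 50)

def features(X):
    return np.cos(X @ W + b)

def fit_ridge(X, y, lam=1e-2):
    """Closed-form ridge fit of the output weights."""
    Phi = features(X)
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

# Bootstrap ensemble: each member is trained on a resampled dataset,
# one simple way ensembles obtain diverse models.
members = [
    fit_ridge(X[idx], y[idx])
    for idx in (rng.integers(0, len(X), len(X)) for _ in range(10))
]

def predict(x):
    """Ensemble mean and spread; the spread is the uncertainty estimate."""
    Phi = features(np.atleast_2d(x))
    preds = np.stack([Phi @ w for w in members])  # shape (K, N)
    return preds.mean(axis=0), preds.std(axis=0)

mean, std = predict(np.array([[0.5]]))
```

In a model-based RL loop, a planner would query `predict` for both the mean transition and its spread, penalizing or resampling trajectories that pass through high-uncertainty regions.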
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization [13.440645736306267]
This paper develops an algorithm for model-based reinforcement learning.
It unifies model shift and model bias and then formulates a fine-tuning process.
It achieves state-of-the-art performance on several challenging benchmark tasks.
arXiv Detail & Related papers (2023-09-22T07:27:32Z)
- The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes factorizing the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Reinforcement Learning based dynamic weighing of Ensemble Models for Time Series Forecasting [0.8399688944263843]
It is known that prediction accuracy improves when the models selected for data modelling are distinct (linear/non-linear, static/dynamic) and independent (minimally correlated).
Various approaches suggested in the literature to weigh the ensemble models use a static set of weights.
To address this issue, a Reinforcement Learning (RL) approach is proposed to dynamically assign and update the weights of each model at different time instants.
arXiv Detail & Related papers (2020-08-20T10:40:42Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of Control as Hybrid Inference (CHI) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.