Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing
- URL: http://arxiv.org/abs/2004.07707v1
- Date: Thu, 16 Apr 2020 15:32:52 GMT
- Title: Analyzing Reinforcement Learning Benchmarks with Random Weight Guessing
- Authors: Declan Oller, Tobias Glasmachers, Giuseppe Cuccu
- Abstract summary: A large number of policy networks are generated by randomly guessing their parameters, and then evaluated on the benchmark task.
We show that this approach isolates the environment complexity, highlights specific types of challenges, and provides a proper foundation for the statistical analysis of the task's difficulty.
We test our approach on a variety of classic control benchmarks from the OpenAI Gym, where we show that small untrained networks can provide a robust baseline for a variety of tasks.
- Score: 2.5137859989323537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel method for analyzing and visualizing the complexity of
standard reinforcement learning (RL) benchmarks based on score distributions. A
large number of policy networks are generated by randomly guessing their
parameters, and then evaluated on the benchmark task; the study of their
aggregated results provide insights into the benchmark complexity. Our method
guarantees objectivity of evaluation by sidestepping learning altogether: the
policy network parameters are generated using Random Weight Guessing (RWG),
making our method agnostic to (i) the classic RL setup, (ii) any learning
algorithm, and (iii) hyperparameter tuning. We show that this approach isolates
the environment complexity, highlights specific types of challenges, and
provides a proper foundation for the statistical analysis of the task's
difficulty. We test our approach on a variety of classic control benchmarks
from the OpenAI Gym, where we show that small untrained networks can provide a
robust baseline for a variety of tasks. The networks generated often show good
performance even without gradual learning, incidentally highlighting the
triviality of a few popular benchmarks.
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate this approach lower bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z) - Take the Bull by the Horns: Hard Sample-Reweighted Continual Training
Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z) - Reinforcement Replaces Supervision: Query focused Summarization using
Deep Reinforcement Learning [43.123290672073814]
We deal with systems that generate summaries from document(s) based on a query.
Motivated by the insight that Reinforcement Learning (RL) provides a generalization to Supervised Learning (SL) for Natural Language Generation, we use an RL-based approach for this task.
We develop multiple Policy Gradient networks, trained on various reward signals: ROUGE, BLEU, and Semantic Similarity.
arXiv Detail & Related papers (2023-11-29T10:38:16Z) - Single-Trajectory Distributionally Robust Reinforcement Learning [21.955807398493334]
We propose Distributionally Robust RL (DRRL) to enhance performance across a range of environments.
Existing DRRL algorithms are either model-based or fail to learn from a single sample trajectory.
We design a first fully model-free DRRL algorithm, called distributionally robust Q-learning with single trajectory (DRQ)
arXiv Detail & Related papers (2023-01-27T14:08:09Z) - Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z) - Why So Pessimistic? Estimating Uncertainties for Offline RL through
Ensembles, and Why Their Independence Matters [35.17151863463472]
We take a renewed look at how ensembles of $Q$-functions can be leveraged as the primary source of pessimism for offline reinforcement learning (RL)
We propose MSG, a practical offline RL algorithm that trains an ensemble of $Q$-functions with independently computed targets based on completely separate networks.
Our experiments on the popular D4RL and RL Unplugged offline RL benchmarks demonstrate that MSG with deep ensembles surpasses highly well-tuned state-of-the-art methods by a wide margin.
arXiv Detail & Related papers (2022-05-27T01:30:12Z) - Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z) - Transductive Few-Shot Learning: Clustering is All You Need? [31.21306826132773]
We investigate a general formulation for transive few-shot learning, which integrates prototype-based objectives.
We find that our method yields competitive performances, in term of accuracy and optimization, while scaling up to large problems.
Surprisingly, we find that our general model already achieve competitive performances in comparison to the state-of-the-art learning.
arXiv Detail & Related papers (2021-06-16T16:14:01Z) - Policy Information Capacity: Information-Theoretic Measure for Task
Complexity in Deep Reinforcement Learning [83.66080019570461]
We propose two environment-agnostic, algorithm-agnostic quantitative metrics for task difficulty.
We show that these metrics have higher correlations with normalized task solvability scores than a variety of alternatives.
These metrics can also be used for fast and compute-efficient optimizations of key design parameters.
arXiv Detail & Related papers (2021-03-23T17:49:50Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep
Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.