Hardness in Markov Decision Processes: Theory and Practice
- URL: http://arxiv.org/abs/2210.13075v1
- Date: Mon, 24 Oct 2022 09:51:31 GMT
- Title: Hardness in Markov Decision Processes: Theory and Practice
- Authors: Michelangelo Conserva, Paulo Rauber
- Abstract summary: First, we present a systematic survey of the theory of hardness, which also identifies promising research directions.
Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis.
Third, we present an empirical analysis that provides new insights into computable measures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meticulously analysing the empirical strengths and weaknesses of
reinforcement learning methods in hard (challenging) environments is essential
to inspire innovations and assess progress in the field. In tabular
reinforcement learning, there is no well-established standard selection of
environments to conduct such analysis, which is partially due to the lack of a
widespread understanding of the rich theory of hardness of environments. The
goal of this paper is to unlock the practical usefulness of this theory through
four main contributions. First, we present a systematic survey of the theory of
hardness, which also identifies promising research directions. Second, we
introduce Colosseum, a pioneering package that enables empirical hardness
analysis and implements a principled benchmark composed of environments that
are diverse with respect to different measures of hardness. Third, we present
an empirical analysis that provides new insights into computable measures.
Finally, we benchmark five tabular agents in our newly proposed benchmark.
While advancing the theoretical understanding of hardness in non-tabular
reinforcement learning remains essential, our contributions in the tabular
setting are intended as solid steps towards a principled non-tabular benchmark.
Accordingly, we benchmark four agents in non-tabular versions of Colosseum
environments, obtaining results that demonstrate the generality of tabular
hardness measures.
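One classic hardness measure from the theory the paper surveys is the diameter of an MDP: the worst-case minimal expected travel time between any pair of states. The following is a minimal sketch of computing it for a tabular MDP under the standard definition; it is an illustration, not Colosseum's actual API.

```python
# Minimal sketch (not Colosseum's API) of the MDP diameter: the worst-case
# minimal expected number of steps needed to travel between any two states.
import numpy as np

def diameter(P, tol=1e-8, max_iter=100_000):
    """P: transition tensor of shape (S, A, S) with P[s, a, s'] = Pr(s' | s, a)."""
    S, A, _ = P.shape
    worst = 0.0
    for target in range(S):
        # Minimal expected hitting time h(s) of `target` via value iteration
        # on h(s) = 1 + min_a sum_{s'} P[s, a, s'] h(s'), with h(target) = 0.
        h = np.zeros(S)
        for _ in range(max_iter):
            h_new = (1.0 + P @ h).min(axis=1)  # P @ h has shape (S, A)
            h_new[target] = 0.0
            if np.abs(h_new - h).max() < tol:
                h = h_new
                break
            h = h_new
        worst = max(worst, h.max())
    return worst

# Example: a 2-state chain; reaching state 1 succeeds w.p. 0.1 per step,
# so the slowest pair takes 10 expected steps and the diameter is ~10.
P = np.zeros((2, 2, 2))
P[0, 0] = [1.0, 0.0]   # action 0 in state 0: stay
P[0, 1] = [0.9, 0.1]   # action 1 in state 0: reach state 1 w.p. 0.1
P[1, 0] = [1.0, 0.0]   # action 0 in state 1: return to state 0
P[1, 1] = [0.0, 1.0]   # action 1 in state 1: stay
print(diameter(P))     # ~10.0
```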
Related papers
- On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers [25.880499561355904]
This article provides a rigorous analysis of convergence and stability of Episodic Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning and Online Decision Transformers.
arXiv Detail & Related papers (2025-02-08T19:26:22Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage.
Models may behave unreliably due to poorly explored failure modes.
Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Understanding What Affects the Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence [53.51724434972605]
This paper theoretically identifies the key factors that contribute to the generalization gap when the testing environment contains distractors.
Our theory indicates that minimizing the representation distance between training and testing environments, which aligns with human intuition, is the most critical factor in reducing the generalization gap.
arXiv Detail & Related papers (2024-02-05T03:27:52Z)
- Goodhart's Law Applies to NLP's Explanation Benchmarks [57.26445915212884]
We critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics.
We show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs.
Our results raise doubts about the ability of current metrics to guide explainability research, underscoring the need for a broader reassessment of what precisely these metrics are intended to capture.
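For concreteness, the two ERASER metrics under examination reduce to simple probability differences; the following is a hedged sketch of their definitions with placeholder inputs, not the paper's code.

```python
# Hedged sketch of the ERASER metric definitions examined here; the inputs
# are placeholder class probabilities from some model, not the paper's code.
def comprehensiveness(p_full, p_without_rationale):
    # Drop in predicted-class probability when the rationale tokens are
    # removed from the input; higher means the rationale mattered more.
    return p_full - p_without_rationale

def sufficiency(p_full, p_rationale_only):
    # Gap between the full-input prediction and the prediction from the
    # rationale alone; lower means the rationale alone nearly suffices.
    return p_full - p_rationale_only
```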
arXiv Detail & Related papers (2023-08-28T03:03:03Z)
- Theoretical Foundations of Adversarially Robust Learning [7.589246500826111]
Current machine learning systems have been shown to be brittle against adversarial examples.
In this thesis, we explore which robustness properties we can hope to guarantee against adversarial examples.
arXiv Detail & Related papers (2023-06-13T12:20:55Z)
- Learning World Models with Identifiable Factorization [39.767120163665574]
We propose IFactor to model four distinct categories of latent state variables.
Our analysis establishes block-wise identifiability of these latent variables.
We present a practical approach to learning the world model with identifiable blocks.
arXiv Detail & Related papers (2023-06-11T02:25:15Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
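A simplified sketch of the underlying idea follows; EDE proper uses distributional ensembles to isolate epistemic uncertainty, whereas this illustration uses plain Q-ensembles and a hypothetical bonus coefficient.

```python
# Simplified sketch of ensemble-uncertainty-driven exploration. EDE itself
# uses distributional ensembles to isolate epistemic uncertainty; this
# illustration uses plain Q-ensembles and a hypothetical coefficient c.
import numpy as np

def explore_action(q_ensemble, state, c=1.0):
    """q_ensemble: array of shape (n_members, n_states, n_actions)."""
    q = q_ensemble[:, state, :]                 # (n_members, n_actions)
    mean, std = q.mean(axis=0), q.std(axis=0)   # per-action value / disagreement
    return int(np.argmax(mean + c * std))       # prefer uncertain actions
```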
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Exploring the Learning Difficulty of Data: Theory and Measure [2.668651175000491]
This study conducts a pilot theoretical study of the learning difficulty of samples.
A theoretical definition of learning difficulty is proposed based on the bias-variance trade-off theory of generalization error.
Several classical weighting methods in machine learning can be explained in terms of the explored properties.
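For context, the decomposition that definition builds on is the standard one, stated schematically below; the entry's per-sample notion of difficulty is defined on top of these terms.

```latex
% Standard bias-variance decomposition of the expected squared error at x;
% the per-sample "learning difficulty" is defined on top of these terms.
\mathbb{E}_{D}\!\left[(y - \hat{f}_D(x))^2\right]
  = \underbrace{\left(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\left(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\right)^2\right]}_{\text{variance}}
  + \sigma^2
```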
arXiv Detail & Related papers (2022-05-16T02:28:12Z)
- The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks [1.6519302768772166]
We derive simple closed-form estimates for the test risk and other generalization metrics of kernel ridge regression.
We identify a sharp conservation law which limits the ability of KRR to learn any orthonormal basis of functions.
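For orientation, the predictor whose test risk is being estimated is standard kernel ridge regression; below is a sketch of the textbook closed form, not the paper's eigenlearning estimates.

```python
# Textbook kernel ridge regression closed form, f(x) = k(x, X)(K + λI)^{-1} y;
# a sketch for orientation only, not the paper's eigenlearning risk estimates.
import numpy as np

def rbf_kernel(X, Z, bandwidth=1.0):
    # Pairwise RBF kernel between rows of X (n, d) and Z (m, d).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def krr_predict(X_train, y_train, X_test, ridge=1e-3):
    K = rbf_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train) @ alpha
```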
arXiv Detail & Related papers (2021-10-08T06:32:07Z)
- Metrics and continuity in reinforcement learning [34.10996560464196]
We introduce a unified formalism for defining topologies through the lens of metrics.
We establish a hierarchy amongst these metrics and demonstrate their theoretical implications on the Markov Decision Process.
We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.
arXiv Detail & Related papers (2021-02-02T14:30:41Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and 'patchwork' solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
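A minimal sketch of the sign-agreement ("logical AND") idea on per-environment gradients follows; the paper's actual algorithm exposes an agreement threshold as a hyperparameter, and unanimity below is its strictest setting.

```python
# Minimal sketch of a sign-agreement ("logical AND") gradient mask: keep only
# parameter gradients whose sign is unanimous across training environments,
# zeroing out components the environments dispute.
import numpy as np

def and_masked_gradient(grads):
    """grads: array of shape (n_envs, n_params), one gradient per environment."""
    unanimous = np.abs(np.sign(grads).sum(axis=0)) == grads.shape[0]
    return grads.mean(axis=0) * unanimous
```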
arXiv Detail & Related papers (2020-09-01T10:17:48Z)