Hardness in Markov Decision Processes: Theory and Practice
- URL: http://arxiv.org/abs/2210.13075v1
- Date: Mon, 24 Oct 2022 09:51:31 GMT
- Title: Hardness in Markov Decision Processes: Theory and Practice
- Authors: Michelangelo Conserva, Paulo Rauber
- Abstract summary: First, we present a systematic survey of the theory of hardness, which also identifies promising research directions.
Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis.
Third, we present an empirical analysis that provides new insights into computable measures.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meticulously analysing the empirical strengths and weaknesses of
reinforcement learning methods in hard (challenging) environments is essential
to inspire innovations and assess progress in the field. In tabular
reinforcement learning, there is no well-established standard selection of
environments to conduct such analysis, which is partially due to the lack of a
widespread understanding of the rich theory of hardness of environments. The
goal of this paper is to unlock the practical usefulness of this theory through
four main contributions. First, we present a systematic survey of the theory of
hardness, which also identifies promising research directions. Second, we
introduce Colosseum, a pioneering package that enables empirical hardness
analysis and implements a principled benchmark composed of environments that
are diverse with respect to different measures of hardness. Third, we present
an empirical analysis that provides new insights into computable measures.
Finally, we benchmark five tabular agents in our newly proposed benchmark.
While advancing the theoretical understanding of hardness in non-tabular
reinforcement learning remains essential, our contributions in the tabular
setting are intended as solid steps towards a principled non-tabular benchmark.
Accordingly, we benchmark four agents in non-tabular versions of Colosseum
environments, obtaining results that demonstrate the generality of tabular
hardness measures.
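One classic hardness measure from the theory the paper surveys is the diameter of an MDP: the worst-case minimal expected travel time between any pair of states. The following is a minimal sketch of computing it for a tabular MDP under the standard definition; it is an illustration, not Colosseum's actual API.

```python
# Minimal sketch (not Colosseum's API) of the MDP diameter: the worst-case
# minimal expected number of steps needed to travel between any two states.
import numpy as np

def diameter(P, tol=1e-8, max_iter=100_000):
    """P: transition tensor of shape (S, A, S) with P[s, a, s'] = Pr(s' | s, a)."""
    S, A, _ = P.shape
    worst = 0.0
    for target in range(S):
        # Minimal expected hitting time h(s) of `target` via value iteration
        # on h(s) = 1 + min_a sum_{s'} P[s, a, s'] h(s'), with h(target) = 0.
        h = np.zeros(S)
        for _ in range(max_iter):
            h_new = (1.0 + P @ h).min(axis=1)  # P @ h has shape (S, A)
            h_new[target] = 0.0
            if np.abs(h_new - h).max() < tol:
                h = h_new
                break
            h = h_new
        worst = max(worst, h.max())
    return worst

# Example: a 2-state chain; reaching state 1 succeeds w.p. 0.1 per step,
# so the slowest pair takes 10 expected steps and the diameter is ~10.
P = np.zeros((2, 2, 2))
P[0, 0] = [1.0, 0.0]   # action 0 in state 0: stay
P[0, 1] = [0.9, 0.1]   # action 1 in state 0: reach state 1 w.p. 0.1
P[1, 0] = [1.0, 0.0]   # action 0 in state 1: return to state 0
P[1, 1] = [0.0, 1.0]   # action 1 in state 1: stay
print(diameter(P))     # ~10.0
```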
Related papers
- On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers [25.880499561355904]
This article provides a rigorous analysis of convergence and stability of Episodic Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning and Online Decision Transformers.
arXiv Detail & Related papers (2025-02-08T19:26:22Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage.
Models may behave unreliably due to poorly explored failure modes.
Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Understanding What Affects the Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence [53.51724434972605]
This paper theoretically identifies the key factors that contribute to the generalization gap when the testing environment contains distractors.
Our theory indicates that minimizing the representation distance between training and testing environments, which aligns with human intuition, is the most critical factor in reducing the generalization gap.
arXiv Detail & Related papers (2024-02-05T03:27:52Z)
- Goodhart's Law Applies to NLP's Explanation Benchmarks [57.26445915212884]
We critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics.
We show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs.
Our results raise doubts about the ability of current metrics to guide explainability research, underscoring the need for a broader reassessment of what precisely these metrics are intended to capture.
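For concreteness, the two ERASER metrics under examination reduce to simple probability differences; the following is a hedged sketch of their definitions with placeholder inputs, not the paper's code.

```python
# Hedged sketch of the ERASER metric definitions examined here; the inputs
# are placeholder class probabilities from some model, not the paper's code.
def comprehensiveness(p_full, p_without_rationale):
    # Drop in predicted-class probability when the rationale tokens are
    # removed from the input; higher means the rationale mattered more.
    return p_full - p_without_rationale

def sufficiency(p_full, p_rationale_only):
    # Gap between the full-input prediction and the prediction from the
    # rationale alone; lower means the rationale alone nearly suffices.
    return p_full - p_rationale_only
```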
arXiv Detail & Related papers (2023-08-28T03:03:03Z)
- Theoretical Foundations of Adversarially Robust Learning [7.589246500826111]
Current machine learning systems have been shown to be brittle against adversarial examples.
In this thesis, we explore which robustness properties we can hope to guarantee against adversarial examples.
arXiv Detail & Related papers (2023-06-13T12:20:55Z)
- Learning World Models with Identifiable Factorization [39.767120163665574]
We propose IFactor to model four distinct categories of latent state variables.
Our analysis establishes block-wise identifiability of these latent variables.
We present a practical approach to learning the world model with identifiable blocks.
arXiv Detail & Related papers (2023-06-11T02:25:15Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
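A simplified sketch of the underlying idea follows; EDE proper uses distributional ensembles to isolate epistemic uncertainty, whereas this illustration uses plain Q-ensembles and a hypothetical bonus coefficient.

```python
# Simplified sketch of ensemble-uncertainty-driven exploration. EDE itself
# uses distributional ensembles to isolate epistemic uncertainty; this
# illustration uses plain Q-ensembles and a hypothetical coefficient c.
import numpy as np

def explore_action(q_ensemble, state, c=1.0):
    """q_ensemble: array of shape (n_members, n_states, n_actions)."""
    q = q_ensemble[:, state, :]                 # (n_members, n_actions)
    mean, std = q.mean(axis=0), q.std(axis=0)   # per-action value / disagreement
    return int(np.argmax(mean + c * std))       # prefer uncertain actions
```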
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z)
- Exploring the Learning Difficulty of Data: Theory and Measure [2.668651175000491]
This study conducts a pilot theoretical study of the learning difficulty of samples.
A theoretical definition of learning difficulty is proposed based on the bias-variance trade-off theory of generalization error.
Several classical weighting methods in machine learning can be explained in terms of the explored properties.
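For context, the decomposition that definition builds on is the standard one, stated schematically below; the entry's per-sample notion of difficulty is defined on top of these terms.

```latex
% Standard bias-variance decomposition of the expected squared error at x;
% the per-sample "learning difficulty" is defined on top of these terms.
\mathbb{E}_{D}\!\left[(y - \hat{f}_D(x))^2\right]
  = \underbrace{\left(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\left(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\right)^2\right]}_{\text{variance}}
  + \sigma^2
```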
arXiv Detail & Related papers (2022-05-16T02:28:12Z)
- The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks [1.6519302768772166]
We derive simple closed-form estimates for the test risk and other generalization metrics of kernel ridge regression.
We identify a sharp conservation law which limits the ability of KRR to learn any orthonormal basis of functions.
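For orientation, the predictor whose test risk is being estimated is standard kernel ridge regression; below is a sketch of the textbook closed form, not the paper's eigenlearning estimates.

```python
# Textbook kernel ridge regression closed form, f(x) = k(x, X)(K + λI)^{-1} y;
# a sketch for orientation only, not the paper's eigenlearning risk estimates.
import numpy as np

def rbf_kernel(X, Z, bandwidth=1.0):
    # Pairwise RBF kernel between rows of X (n, d) and Z (m, d).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def krr_predict(X_train, y_train, X_test, ridge=1e-3):
    K = rbf_kernel(X_train, X_train)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train) @ alpha
```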
arXiv Detail & Related papers (2021-10-08T06:32:07Z)
- Metrics and continuity in reinforcement learning [34.10996560464196]
We introduce a unified formalism for defining topologies through the lens of metrics.
We establish a hierarchy amongst these metrics and demonstrate their theoretical implications on the Markov Decision Process.
We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.
arXiv Detail & Related papers (2021-02-02T14:30:41Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and 'patchwork' solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
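A minimal sketch of the sign-agreement ("logical AND") idea on per-environment gradients follows; the paper's actual algorithm exposes an agreement threshold as a hyperparameter, and unanimity below is its strictest setting.

```python
# Minimal sketch of a sign-agreement ("logical AND") gradient mask: keep only
# parameter gradients whose sign is unanimous across training environments,
# zeroing out components the environments dispute.
import numpy as np

def and_masked_gradient(grads):
    """grads: array of shape (n_envs, n_params), one gradient per environment."""
    unanimous = np.abs(np.sign(grads).sum(axis=0)) == grads.shape[0]
    return grads.mean(axis=0) * unanimous
```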
arXiv Detail & Related papers (2020-09-01T10:17:48Z)