Metrics and continuity in reinforcement learning
- URL: http://arxiv.org/abs/2102.01514v1
- Date: Tue, 2 Feb 2021 14:30:41 GMT
- Title: Metrics and continuity in reinforcement learning
- Authors: Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro
- Abstract summary: We introduce a unified formalism for defining topologies through the lens of metrics.
We establish a hierarchy amongst these metrics and demonstrate their theoretical implications on the Markov Decision Process.
We complement our theoretical results with empirical evaluations showcasing the differences between the metrics considered.
- Score: 34.10996560464196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In most practical applications of reinforcement learning, it is untenable to
maintain direct estimates for individual states; in continuous-state systems,
it is impossible. Instead, researchers often leverage state similarity (whether
explicitly or implicitly) to build models that can generalize well from a
limited set of samples. The notion of state similarity used, and the
neighbourhoods and topologies it induces, is thus of crucial importance, as it
will directly affect the performance of the algorithms. Indeed, a number of
recent works introduce algorithms assuming the existence of "well-behaved"
neighbourhoods, but leave the full specification of such topologies for future
work. In this paper we introduce a unified formalism for defining these
topologies through the lens of metrics. We establish a hierarchy amongst these
metrics and demonstrate their theoretical implications on the Markov Decision
Process specifying the reinforcement learning problem. We complement our
theoretical results with empirical evaluations showcasing the differences
between the metrics considered.
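As a concrete illustration of the kind of state metric the paper studies, the sketch below iterates a bisimulation-style fixed point on a small hypothetical MDP. The state space, rewards, transitions, and the weights c_r and c_t are assumptions chosen for illustration, not details taken from the paper, and the transitions are made deterministic so the Wasserstein term collapses to the distance between successor states.

```python
import numpy as np

# Hypothetical 3-state, 2-action deterministic MDP, used only for illustration;
# rewards, transitions, and weights below are assumptions, not taken from the paper.
R = np.array([[0.0, 1.0],   # R[s, a]: reward for action a in state s
              [0.0, 1.0],
              [1.0, 0.0]])
P = np.array([[1, 2],       # P[s, a]: (deterministic) successor state
              [1, 2],
              [0, 0]])

n_states, n_actions = R.shape
c_r, c_t = 1.0, 0.9  # reward / transition weights; c_t < 1 makes the update a contraction

# Fixed-point iteration for a bisimulation-style metric:
#   d(x, y) = max_a [ c_r * |R(x,a) - R(y,a)| + c_t * W_1(P(.|x,a), P(.|y,a); d) ]
# With deterministic transitions, the Wasserstein term W_1 reduces to the
# distance between the successor states of x and y under action a.
d = np.zeros((n_states, n_states))
for _ in range(1000):
    d_new = np.zeros_like(d)
    for x in range(n_states):
        for y in range(n_states):
            d_new[x, y] = max(
                c_r * abs(R[x, a] - R[y, a]) + c_t * d[P[x, a], P[y, a]]
                for a in range(n_actions)
            )
    if np.max(np.abs(d_new - d)) < 1e-8:
        d = d_new
        break
    d = d_new

# States 0 and 1 are behaviourally identical, so d[0, 1] converges to 0,
# while state 2 (different rewards) stays at a positive distance from both.
print(np.round(d, 3))
```

Behaviourally equivalent states end up at distance zero under such a metric, which is one instance of the "well-behaved" neighbourhood structure the abstract refers to.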
Related papers
- Bounds on the Generalization Error in Active Learning [0.0]
We establish empirical risk principles for active learning by deriving a family of upper bounds on the generalization error.
We systematically link diverse active learning scenarios, characterized by their loss functions and hypothesis classes, to their corresponding upper bounds.
Our results show that regularization techniques used to constrain the complexity of various hypothesis classes are sufficient conditions to ensure the validity of the bounds.
arXiv Detail & Related papers (2024-09-10T08:08:09Z) - Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales [54.78115855552886]
We show how to construct over-complete invariants with a Convolutional Neural Network (CNN)-like hierarchical architecture.
With the over-completeness, discriminative features w.r.t. the task can be adaptively formed in a Neural Architecture Search (NAS)-like manner.
For robust and interpretable vision tasks at larger scales, hierarchical invariant representations can be considered an effective alternative to traditional CNNs and invariants.
arXiv Detail & Related papers (2024-02-23T16:50:07Z) - Bootstrapped Representations in Reinforcement Learning [44.49675960752777]
In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces.
We provide a theoretical characterization of the state representation learnt by temporal difference learning.
We describe the efficacy of these representations for policy evaluation, and use our theoretical analysis to design new auxiliary learning rules.
arXiv Detail & Related papers (2023-06-16T20:14:07Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Enriching Disentanglement: From Logical Definitions to Quantitative Metrics [59.12308034729482]
Disentangling the explanatory factors in complex data is a promising approach for data-efficient representation learning.
We establish relationships between logical definitions and quantitative metrics to derive theoretically grounded disentanglement metrics.
We empirically demonstrate the effectiveness of the proposed metrics by isolating different aspects of disentangled representations.
arXiv Detail & Related papers (2023-05-19T08:22:23Z) - Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z) - Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z) - In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z) - A Comparison of Self-Play Algorithms Under a Generalized Framework [4.339542790745868]
The notion of self-play, albeit often cited in multiagent Reinforcement Learning, has never been grounded in a formal model.
We present a formalized framework, with clearly defined assumptions, which encapsulates the meaning of self-play.
We measure how well a subset of the captured self-play methods approximate this solution when paired with the famous PPO algorithm.
arXiv Detail & Related papers (2020-06-08T11:02:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.