On the Complexity of Adversarial Decision Making
- URL: http://arxiv.org/abs/2206.13063v1
- Date: Mon, 27 Jun 2022 06:20:37 GMT
- Title: On the Complexity of Adversarial Decision Making
- Authors: Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan
- Abstract summary: We show that the Decision-Estimation Coefficient, applied to the convex hull of the model class, is necessary and sufficient to obtain low regret for adversarial decision making.
We provide new structural results that connect the Decision-Estimation Coefficient to variants of other well-known complexity measures.
- Score: 101.14158787665252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A central problem in online learning and decision making -- from bandits to
reinforcement learning -- is to understand what modeling assumptions lead to
sample-efficient learning guarantees. We consider a general adversarial
decision making framework that encompasses (structured) bandit problems with
adversarial rewards and reinforcement learning problems with adversarial
dynamics. Our main result is to show -- via new upper and lower bounds -- that
the Decision-Estimation Coefficient, a complexity measure introduced by Foster
et al. in the stochastic counterpart to our setting, is necessary and
sufficient to obtain low regret for adversarial decision making. However,
compared to the stochastic setting, one must apply the Decision-Estimation
Coefficient to the convex hull of the class of models (or, hypotheses) under
consideration. This establishes that the price of accommodating adversarial
rewards or dynamics is governed by the behavior of the model class under
convexification, and recovers a number of existing results -- both positive and
negative. En route to obtaining these guarantees, we provide new structural
results that connect the Decision-Estimation Coefficient to variants of other
well-known complexity measures, including the Information Ratio of Russo and
Van Roy and the Exploration-by-Optimization objective of Lattimore and
Gy\"{o}rgy.
Related papers
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability [71.82666334363174]
We develop a unified framework for lower bound methods in statistical estimation and interactive decision making.
We introduce a novel measure, the decision dimension, which facilitates the derivation of new lower bounds for interactive decision making.
arXiv Detail & Related papers (2024-10-07T15:14:58Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
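As a concrete, hypothetical illustration of the likelihood-ratio principle, here is a minimal Python sketch of an anytime-valid confidence set for a Bernoulli mean, built from a uniform-mixture martingale and Ville's inequality. This is a generic textbook construction, not the specific estimator sequences chosen in the paper.

import numpy as np
from scipy.special import betaln

def lr_confidence_set(x, alpha=0.05, grid=None):
    """Grid points theta0 kept in a level-(1 - alpha) confidence sequence.

    theta0 stays in the set while the uniform-mixture likelihood over theta,
    divided by the likelihood at theta0, is below 1/alpha. That ratio is a
    nonnegative martingale with mean 1 under theta0, so Ville's inequality
    makes the sets valid at every sample size simultaneously.
    """
    if grid is None:
        grid = np.linspace(1e-4, 1.0 - 1e-4, 999)
    n, k = len(x), int(np.sum(x))
    # log of the mixture likelihood: integral of theta^k (1-theta)^(n-k) dtheta
    log_mixture = betaln(k + 1, n - k + 1)
    log_lik = k * np.log(grid) + (n - k) * np.log1p(-grid)
    return grid[log_mixture - log_lik < np.log(1.0 / alpha)]

rng = np.random.default_rng(0)
x = rng.binomial(1, 0.3, size=200)
kept = lr_confidence_set(x)
print(f"confidence set at n=200: [{kept.min():.3f}, {kept.max():.3f}]")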
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- A Meta-heuristic Approach to Estimate and Explain Classifier Uncertainty [0.4264192013842096]
This work proposes a set of class-independent meta-heuristics that characterize the complexity of an instance in terms of factors that are mutually relevant to both human and machine learning decision-making.
The proposed measures and framework hold promise for improving model development for more complex instances, as well as providing a new means of model abstention and explanation.
arXiv Detail & Related papers (2023-04-20T13:09:28Z)
- Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient [51.37720227675476]
We introduce a new variant of the Decision-Estimation Coefficient, and use it to derive new lower bounds that improve upon prior work on three fronts.
We provide upper bounds on regret that scale with the same quantity, thereby closing all but one of the gaps between upper and lower bounds in Foster et al.
Our results apply to both the regret framework and PAC framework, and make use of several new analysis and algorithm design techniques that we anticipate will find broader use.
arXiv Detail & Related papers (2023-01-19T18:24:08Z)
- Model-Free Reinforcement Learning with the Decision-Estimation Coefficient [79.30248422988409]
We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation.
Combining Estimation-to-Decisions with a form of optimistic estimation, we derive regret bounds for model-free reinforcement learning with value function approximation, and give structural results showing when optimistic estimation can and cannot help more generally.
arXiv Detail & Related papers (2022-11-25T17:29:40Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
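A minimal Python sketch of the E2D template for a finite class of Bernoulli bandit models may help fix ideas. Everything concrete here (the model class, the likelihood-weighted oracle standing in for a generic online estimation algorithm, the choice gamma = 10) is an illustrative assumption, not the paper's construction; the per-round linear program implements the min-max objective that defines the Decision-Estimation Coefficient.

import numpy as np
from scipy.optimize import linprog

def sq_hellinger(p, q):
    """Squared Hellinger distance between Bernoulli(p) and Bernoulli(q)."""
    return 1.0 - np.sqrt(p * q) - np.sqrt((1.0 - p) * (1.0 - q))

def e2d_distribution(models, m_hat, gamma):
    """Solve min_p max_M sum_a p[a] * g(M, a) as a linear program."""
    n_models, n_arms = models.shape
    # g[M, a]: suboptimality of arm a under M, minus gamma times the
    # estimation distance between M and the oracle's reference model.
    g = models.max(axis=1, keepdims=True) - models
    for m in range(n_models):
        for a in range(n_arms):
            g[m, a] -= gamma * sq_hellinger(models[m, a], m_hat[a])
    # Variables (p_1..p_K, t): minimize t subject to g @ p <= t, p in simplex.
    c = np.zeros(n_arms + 1)
    c[-1] = 1.0
    A_ub = np.hstack([g, -np.ones((n_models, 1))])
    A_eq = np.hstack([np.ones((1, n_arms)), np.zeros((1, 1))])
    bounds = [(0, None)] * n_arms + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n_models),
                  A_eq=A_eq, b_eq=np.array([1.0]), bounds=bounds)
    return np.clip(res.x[:n_arms], 0.0, None)

rng = np.random.default_rng(1)
models = np.array([[0.2, 0.8], [0.8, 0.2], [0.5, 0.6]])  # candidate mean vectors
true_means = models[0]
log_w = np.zeros(len(models))                            # log-likelihood weights
for t in range(200):
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    m_hat = w @ models                                   # estimation oracle output
    p = e2d_distribution(models, m_hat, gamma=10.0)
    arm = rng.choice(len(true_means), p=p / p.sum())
    reward = rng.binomial(1, true_means[arm])
    log_w += np.log(np.where(reward, models[:, arm], 1.0 - models[:, arm]))
print("final decision distribution:", np.round(p, 3))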
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- Risk-Sensitive Reinforcement Learning: a Martingale Approach to Reward Uncertainty [15.572157454411533]
We introduce a novel framework to account for sensitivity to reward uncertainty in sequential decision-making problems.
We present a new decomposition of the randomness contained in the cumulative reward, based on the Doob decomposition of a stochastic process.
We innovate on the reinforcement learning side by incorporating this new risk-sensitive approach into model-free algorithms, both policy-gradient and value-function based, and illustrate its relevance on grid-world and portfolio optimization problems.
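To make the decomposition concrete: for a cumulative reward R_t = sum of r_s, the Doob decomposition writes R_t = A_t + M_t, with predictable part A_t = sum of E[r_s | F_{s-1}] and martingale part M_t = sum of (r_s - E[r_s | F_{s-1}]). A minimal Python sketch on a made-up reward model (all specifics are illustrative, not the paper's setup):

import numpy as np

rng = np.random.default_rng(42)
T, n_paths = 100, 2000
# Toy dynamics: a binary state is revealed before each reward is drawn, so
# mu[state] is the conditional mean reward given the information at that point.
mu = np.array([0.1, 0.9])
state = rng.integers(0, 2, size=(n_paths, T))
cond_mean = mu[state]                               # E[r_t | F_{t-1}]
rewards = rng.normal(loc=cond_mean, scale=1.0)

A = np.cumsum(cond_mean, axis=1)                    # predictable part
M = np.cumsum(rewards - cond_mean, axis=1)          # martingale part
R = np.cumsum(rewards, axis=1)

assert np.allclose(R, A + M)                        # exact decomposition
print("mean of martingale part at T:", M[:, -1].mean().round(3))  # ~ 0
print("std of martingale part at T:", M[:, -1].std().round(3))    # ~ sqrt(T)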
arXiv Detail & Related papers (2020-06-23T01:20:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.