On the Complexity of Adversarial Decision Making
- URL: http://arxiv.org/abs/2206.13063v1
- Date: Mon, 27 Jun 2022 06:20:37 GMT
- Title: On the Complexity of Adversarial Decision Making
- Authors: Dylan J. Foster and Alexander Rakhlin and Ayush Sekhari and Karthik
Sridharan
- Abstract summary: We show that the Decision-Estimation Coefficient is necessary and sufficient to obtain low regret for adversarial decision making.
We provide new structural results that connect the Decision-Estimation Coefficient to variants of other well-known complexity measures.
- Score: 101.14158787665252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A central problem in online learning and decision making -- from bandits to
reinforcement learning -- is to understand what modeling assumptions lead to
sample-efficient learning guarantees. We consider a general adversarial
decision making framework that encompasses (structured) bandit problems with
adversarial rewards and reinforcement learning problems with adversarial
dynamics. Our main result is to show -- via new upper and lower bounds -- that
the Decision-Estimation Coefficient, a complexity measure introduced by Foster
et al. in the stochastic counterpart to our setting, is necessary and
sufficient to obtain low regret for adversarial decision making. However,
compared to the stochastic setting, one must apply the Decision-Estimation
Coefficient to the convex hull of the class of models (or, hypotheses) under
consideration. This establishes that the price of accommodating adversarial
rewards or dynamics is governed by the behavior of the model class under
convexification, and recovers a number of existing results -- both positive and
negative. En route to obtaining these guarantees, we provide new structural
results that connect the Decision-Estimation Coefficient to variants of other
well-known complexity measures, including the Information Ratio of Russo and
Van Roy and the Exploration-by-Optimization objective of Lattimore and
György.
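For context, a rough sketch of the quantity at the center of this result, following the stochastic definition of Foster et al. (the notation below is assumed from that line of work, not taken from this abstract): for a model class $\mathcal{M}$, decision space $\Pi$, and reference model $\bar{M}$,
\[
\mathsf{dec}_{\gamma}(\mathcal{M}, \bar{M}) \;=\; \inf_{p \in \Delta(\Pi)} \; \sup_{M \in \mathcal{M}} \; \mathbb{E}_{\pi \sim p}\!\left[ f^{M}(\pi_{M}) - f^{M}(\pi) \;-\; \gamma \cdot D_{\mathrm{H}}^{2}\big(M(\pi), \bar{M}(\pi)\big) \right],
\]
where $f^{M}(\pi)$ is the expected payoff of decision $\pi$ under model $M$, $\pi_{M}$ is the optimal decision for $M$, and $D_{\mathrm{H}}^{2}$ is the squared Hellinger distance. The abstract's headline result can then be read as: adversarial regret is governed by this quantity evaluated on the convex hull $\mathrm{co}(\mathcal{M})$ rather than on $\mathcal{M}$ itself.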
Related papers
- Decision Making in Changing Environments: Robustness, Query-Based Learning, and Differential Privacy [59.64384863882473]
We study the problem of interactive decision making in which the underlying environment changes over time subject to given constraints.
We propose a framework which provides an interpolation between the stochastic and adversarial settings of decision making.
arXiv Detail & Related papers (2025-01-24T21:31:50Z)
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability [71.82666334363174]
We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making.
We propose a new lower bound approach called the interactive Fano method.
arXiv Detail & Related papers (2024-10-07T15:14:58Z)
- Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient [51.37720227675476]
We introduce a new variant of the Decision-Estimation Coefficient, and use it to derive new lower bounds that improve upon prior work on three fronts.
We provide upper bounds on regret that scale with the same quantity, thereby closing all but one of the gaps between upper and lower bounds in Foster et al.
Our results apply to both the regret framework and PAC framework, and make use of several new analysis and algorithm design techniques that we anticipate will find broader use.
arXiv Detail & Related papers (2023-01-19T18:24:08Z)
- Model-Free Reinforcement Learning with the Decision-Estimation Coefficient [79.30248422988409]
We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation.
We use this approach to derive regret bounds for model-free reinforcement learning with value function approximation, and give structural results showing when it can and cannot help more generally.
arXiv Detail & Related papers (2022-11-25T17:29:40Z)
- The Statistical Complexity of Interactive Decision Making [126.04974881555094]
We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
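To make the Estimation-to-Decisions idea concrete, here is a minimal, hedged sketch (not the authors' code; all names and parameter choices are illustrative) of the per-round minimax problem E2D solves in a two-armed Bernoulli bandit: given a reference estimate of the model, choose the decision distribution that minimizes the worst-case DEC objective over candidate models. It uses a simple grid search over the probability simplex rather than a proper convex solver.

```python
import math

def hellinger_sq(p, q):
    """Squared Hellinger distance between Bernoulli(p) and Bernoulli(q).

    Convention assumed here: H^2(P, Q) = (1/2) * sum_x (sqrt(P(x)) - sqrt(Q(x)))^2.
    """
    return 0.5 * ((math.sqrt(p) - math.sqrt(q)) ** 2
                  + (math.sqrt(1 - p) - math.sqrt(1 - q)) ** 2)

def dec_objective(p, model, ref, gamma):
    """DEC objective for one candidate model M against the reference estimate:

    E_{pi ~ p}[ f^M(pi_M) - f^M(pi) - gamma * H^2(M(pi), ref(pi)) ].
    """
    star = max(model)  # payoff of the best arm under M, i.e. f^M(pi_M)
    return sum(p[a] * (star - model[a] - gamma * hellinger_sq(model[a], ref[a]))
               for a in range(len(model)))

def e2d_step(models, ref, gamma, grid=201):
    """One E2D round for a 2-armed bandit: grid-search the simplex for the
    distribution minimizing the worst case of the DEC objective over models."""
    best_val, best_q = float("inf"), None
    for i in range(grid):
        q = i / (grid - 1)
        p = (q, 1.0 - q)
        worst = max(dec_objective(p, m, ref, gamma) for m in models)
        if worst < best_val:
            best_val, best_q = worst, q
    return best_q, best_val

# Two candidate models (per-arm Bernoulli means) and a reference estimate.
models = [(0.6, 0.4), (0.4, 0.6)]
ref = (0.5, 0.5)
q_opt, value = e2d_step(models, ref, gamma=4.0)
print(f"play arm 0 w.p. {q_opt:.2f}; worst-case objective {value:.4f}")
```

Because the two candidate models are symmetric around the reference, the minimax distribution here is uniform; the resulting value is the DEC of this tiny class at the chosen gamma (up to the grid resolution).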
arXiv Detail & Related papers (2021-12-27T02:53:44Z)
- Risk-Sensitive Reinforcement Learning: a Martingale Approach to Reward Uncertainty [15.572157454411533]
We introduce a novel framework to account for sensitivity to reward uncertainty in sequential decision-making problems.
We present a new decomposition of the randomness contained in the cumulative reward, based on the Doob decomposition of a stochastic process.
We innovate on the reinforcement learning side by incorporating this new risk-sensitive approach into model-free algorithms, both policy and value gradient function based, and illustrate its relevance on grid world and portfolio optimization problems.
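For reference, the decomposition alluded to is presumably the standard Doob decomposition (sketched here from the textbook definition, not from the paper itself): an adapted, integrable process $(X_t)$ splits uniquely as
\[
X_t \;=\; X_0 \;+\; \underbrace{\sum_{s=1}^{t} \mathbb{E}\big[X_s - X_{s-1} \mid \mathcal{F}_{s-1}\big]}_{\text{predictable part } A_t} \;+\; \underbrace{\sum_{s=1}^{t} \Big(X_s - X_{s-1} - \mathbb{E}\big[X_s - X_{s-1} \mid \mathcal{F}_{s-1}\big]\Big)}_{\text{martingale part } M_t},
\]
with $X_t$ the cumulative reward, so that risk can be attributed separately to the predictable drift and the martingale fluctuations.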
arXiv Detail & Related papers (2020-06-23T01:20:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.