The Statistical Complexity of Interactive Decision Making
- URL: http://arxiv.org/abs/2112.13487v3
- Date: Tue, 11 Jul 2023 16:47:42 GMT
- Title: The Statistical Complexity of Interactive Decision Making
- Authors: Dylan J. Foster and Sham M. Kakade and Jian Qian and Alexander Rakhlin
- Abstract summary: We provide a complexity measure, the Decision-Estimation Coefficient, that is proven to be both necessary and sufficient for sample-efficient interactive learning.
A unified algorithm design principle, Estimation-to-Decisions (E2D), transforms any algorithm for supervised estimation into an online algorithm for decision making.
- Score: 126.04974881555094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A fundamental challenge in interactive learning and decision making, ranging
from bandit problems to reinforcement learning, is to provide sample-efficient,
adaptive learning algorithms that achieve near-optimal regret. This question is
analogous to the classical problem of optimal (supervised) statistical
learning, where there are well-known complexity measures (e.g., VC dimension
and Rademacher complexity) that govern the statistical complexity of learning.
However, characterizing the statistical complexity of interactive learning is
substantially more challenging due to the adaptive nature of the problem. The
main result of this work provides a complexity measure, the Decision-Estimation
Coefficient, that is proven to be both necessary and sufficient for
sample-efficient interactive learning. In particular, we provide:
1. a lower bound on the optimal regret for any interactive decision making
problem, establishing the Decision-Estimation Coefficient as a fundamental
limit.
2. a unified algorithm design principle, Estimation-to-Decisions (E2D), which
transforms any algorithm for supervised estimation into an online algorithm for
decision making. E2D attains a regret bound that matches our lower bound up to
dependence on a notion of estimation performance, thereby achieving optimal
sample-efficient learning as characterized by the Decision-Estimation
Coefficient.
Taken together, these results constitute a theory of learnability for
interactive decision making. When applied to reinforcement learning settings,
the Decision-Estimation Coefficient recovers essentially all existing hardness
results and lower bounds. More broadly, the approach can be viewed as a
decision-theoretic analogue of the classical Le Cam theory of statistical
estimation; it also unifies a number of existing approaches -- both Bayesian
and frequentist.
Related papers
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient
and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - Tight Guarantees for Interactive Decision Making with the
Decision-Estimation Coefficient [51.37720227675476]
We introduce a new variant of the Decision-Estimation Coefficient, and use it to derive new lower bounds that improve upon prior work on three fronts.
We provide upper bounds on regret that scale with the same quantity, thereby closing all but one of the gaps between upper and lower bounds in Foster et al.
Our results apply to both the regret framework and PAC framework, and make use of several new analysis and algorithm design techniques that we anticipate will find broader use.
arXiv Detail & Related papers (2023-01-19T18:24:08Z) - Model-Free Reinforcement Learning with the Decision-Estimation
Coefficient [79.30248422988409]
We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation.
We use this approach to derive regret bounds for model-free reinforcement learning with value function approximation, and give structural results showing when it can and cannot help more generally.
arXiv Detail & Related papers (2022-11-25T17:29:40Z) - Efficient Learning of Decision-Making Models: A Penalty Block Coordinate
Descent Algorithm for Data-Driven Inverse Optimization [12.610576072466895]
We consider the inverse problem where we use prior decision data to uncover the underlying decision-making process.
This statistical learning problem is referred to as data-driven inverse optimization.
We propose an efficient block coordinate descent-based algorithm to solve large problem instances.
arXiv Detail & Related papers (2022-10-27T12:52:56Z) - Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond [28.118197762236953]
We develop a unified algorithm framework for a large class of learning goals.
Our framework handles many learning goals such as no-regret RL, PAC RL, reward-free learning, model estimation, and preference-based learning.
As applications, we propose "decouplable representation" as a natural sufficient condition for bounding generalized DECs.
arXiv Detail & Related papers (2022-09-23T17:47:24Z) - On data-driven chance constraint learning for mixed-integer optimization
problems [0.0]
We develop a Chance Constraint Learning (CCL) methodology with a focus on mixed-integer linear optimization problems.
CCL makes use of linearizable machine learning models to estimate conditional quantiles of the learned variables.
An open-access software has been developed to be used by practitioners.
arXiv Detail & Related papers (2022-07-08T11:54:39Z) - On the Complexity of Adversarial Decision Making [101.14158787665252]
We show that the Decision-Estimation Coefficient is necessary and sufficient to obtain low regret for adversarial decision making.
We provide new structural results that connect the Decision-Estimation Coefficient to variants of other well-known complexity measures.
arXiv Detail & Related papers (2022-06-27T06:20:37Z) - Beyond Worst-Case Analysis in Stochastic Approximation: Moment
Estimation Improves Instance Complexity [58.70807593332932]
We study oracle complexity of gradient based methods for approximation problems.
We focus on instance-dependent complexity instead of worst case complexity.
Our proposed algorithm and its analysis provide a theoretical justification for the success of moment estimation.
arXiv Detail & Related papers (2020-06-08T09:25:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.