Online Learning and Solving Infinite Games with an ERM Oracle
- URL: http://arxiv.org/abs/2307.01689v2
- Date: Mon, 10 Jul 2023 11:16:54 GMT
- Title: Online Learning and Solving Infinite Games with an ERM Oracle
- Authors: Angelos Assos, Idan Attias, Yuval Dagan, Constantinos Daskalakis,
Maxwell Fishelson
- Abstract summary: We propose an algorithm for online binary classification setting that relies solely on ERM oracle calls.
We show that it has finite regret in the realizable setting and sublinearly growing regret in the agnostic setting.
Our algorithms apply to both binary-valued and real-valued games and can be viewed as providing justification for the wide use of double oracle and multiple oracle algorithms in the practice of solving large games.
- Score: 20.1330044382824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While ERM suffices to attain near-optimal generalization error in the
stochastic learning setting, this is not known to be the case in the online
learning setting, where algorithms for general concept classes rely on
computationally inefficient oracles such as the Standard Optimal Algorithm
(SOA). In this work, we propose an algorithm for online binary classification
setting that relies solely on ERM oracle calls, and show that it has finite
regret in the realizable setting and sublinearly growing regret in the agnostic
setting. We bound the regret in terms of the Littlestone and threshold
dimensions of the underlying concept class.
We obtain similar results for nonparametric games, where the ERM oracle can
be interpreted as a best response oracle, finding the best response of a player
to a given history of play of the other players. In this setting, we provide
learning algorithms that only rely on best response oracles and converge to
approximate-minimax equilibria in two-player zero-sum games and approximate
coarse correlated equilibria in multi-player general-sum games, as long as the
game has a bounded fat-threshold dimension. Our algorithms apply to both
binary-valued and real-valued games and can be viewed as providing
justification for the wide use of double oracle and multiple oracle algorithms
in the practice of solving large games.
Related papers
- Playing Large Games with Oracles and AI Debate [27.355621483737913]
Existing algorithms for online game playing require per-iteration in the number of actions, which can be prohibitive for large games.
We give a novel efficient algorithm for simultaneous external and internal regret minimization whose regret depends logarithmically on the number of actions.
arXiv Detail & Related papers (2023-12-08T02:06:55Z) - Hardness of Independent Learning and Sparse Equilibrium Computation in
Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in emphall games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Smoothed Online Learning is as Easy as Statistical Learning [77.00766067963195]
We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
arXiv Detail & Related papers (2022-02-09T19:22:34Z) - No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
arXiv Detail & Related papers (2022-01-30T06:10:04Z) - Model-Free Online Learning in Unknown Sequential Decision Making
Problems and Games [114.90723492840499]
In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium.
We formalize an online learning setting in which the strategy space is not known to the agent.
We give an efficient algorithm that achieves $O(T3/4)$ regret with high probability for that setting, even when the agent faces an adversarial environment.
arXiv Detail & Related papers (2021-03-08T04:03:24Z) - Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.