Refined approachability algorithms and application to regret
minimization with global costs
- URL: http://arxiv.org/abs/2009.03831v4
- Date: Tue, 7 Sep 2021 15:22:28 GMT
- Title: Refined approachability algorithms and application to regret
minimization with global costs
- Authors: Joon Kwon
- Abstract summary: Blackwell's approachability is a framework where two players, the Decision Maker and the Environment, play a repeated game with vector-valued payoffs.
We construct and analyze a class of Follow the Regularized Leader algorithms (FTRL) for Blackwell's approachability.
This flexibility enables us to apply these algorithms to closely minimize the quantity of interest in various online learning problems.
- Score: 0.38073142980732994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blackwell's approachability is a framework where two players, the Decision
Maker and the Environment, play a repeated game with vector-valued payoffs. The
goal of the Decision Maker is to make the average payoff converge to a given
set called the target. When this is indeed possible, simple algorithms which
guarantee the convergence are known. This abstract tool was successfully used
for the construction of optimal strategies in various repeated games, but also
found several applications in online learning. By extending an approach
proposed by (Abernethy et al., 2011), we construct and analyze a class of
Follow the Regularized Leader algorithms (FTRL) for Blackwell's approachability
which are able to minimize not only the Euclidean distance to the target set
(as is often the case in the context of Blackwell's approachability) but a
wide range of distance-like quantities. This flexibility enables us to apply
these algorithms to closely minimize the quantity of interest in various online
learning problems. In particular, for regret minimization with $\ell_p$ global
costs, we obtain the first bounds with explicit dependence on $p$ and the
dimension $d$.
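To make the setting concrete, the sketch below is a minimal toy instance of Blackwell's approachability, not the paper's general FTRL algorithm: the vector payoff is the instantaneous regret against each of $d$ experts, the target set is the non-positive orthant, and the Decision Maker plays weights proportional to the positive part of the cumulative payoff (the Euclidean special case of potential-based approachability strategies). The loss model, dimensions, and horizon are illustrative assumptions.

```python
# Toy Blackwell approachability: regret minimization with d experts.
# Target set: the non-positive orthant R^d_-. Strategy: weights proportional
# to the positive part of the cumulative regret vector (Euclidean case).
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 5000                          # number of experts, number of rounds

cum = np.zeros(d)                       # cumulative regret vector R_t
for t in range(1, T + 1):
    # Decision Maker: mixed action from the positive part of R_{t-1}
    plus = np.maximum(cum, 0.0)
    x = plus / plus.sum() if plus.sum() > 0 else np.full(d, 1.0 / d)

    # Environment: an arbitrary (here random) loss vector in [0, 1]^d
    loss = rng.uniform(0.0, 1.0, size=d)

    # Vector payoff: per-expert instantaneous regret of the mixed action
    cum += x @ loss - loss

    if t in (10, 100, 1000, T):
        # Euclidean distance of the average payoff to the target set
        dist = np.linalg.norm(np.maximum(cum / t, 0.0))
        print(f"t = {t:5d}  distance to target = {dist:.4f}")
```

Under this toy model the printed distance to the target set decays at the classical $O(1/\sqrt{t})$ approachability rate; the paper's FTRL family generalizes the choice of regularizer so that distance-like quantities other than the Euclidean distance can be minimized.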
Related papers
- Learning to Cover: Online Learning and Optimization with Irreversible Decisions [50.5775508521174]
We find that regret grows sub-linearly at a rate $\Theta\left(m^{\frac{1}{2}\cdot\frac{1}{1-2^{-T}}}\right)$, thus converging exponentially fast to $\Theta(\sqrt{m})$.
These findings underscore the benefits of limited online learning and optimization, in that even a few rounds can provide significant benefits as compared to a no-learning baseline.
arXiv Detail & Related papers (2024-06-20T23:00:25Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - PDE-Based Optimal Strategy for Unconstrained Online Learning [40.61498562988079]
We present a framework that generates time-varying potential functions by solving a Partial Differential Equation (PDE)
Our framework recovers some classical potentials, and more importantly provides a systematic approach to design new ones.
This is the first parameter-free algorithm with optimal leading constant.
arXiv Detail & Related papers (2022-01-19T22:21:21Z) - Online Allocation with Two-sided Resource Constraints [44.5635910908944]
We consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially.
We propose a new algorithm that obtains a $1-O(\frac{\epsilon}{\alpha-\epsilon})$ competitive ratio with respect to the offline problem that knows the entire sequence of requests ahead of time.
arXiv Detail & Related papers (2021-12-28T02:21:06Z) - Adaptive Multi-Goal Exploration [118.40427257364729]
We show how AdaGoal can be used to tackle the objective of learning an $\epsilon$-optimal goal-conditioned policy.
AdaGoal is anchored in the high-level algorithmic structure of existing methods for goal-conditioned deep reinforcement learning.
arXiv Detail & Related papers (2021-11-23T17:59:50Z) - Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
Reinforcement Learning [52.76230802067506]
A novel model-free algorithm is proposed to minimize regret in episodic reinforcement learning.
The proposed algorithm employs an early-settled reference update rule, with the aid of two Q-learning sequences.
The design principle of our early-settled variance reduction method might be of independent interest to other RL settings.
arXiv Detail & Related papers (2021-10-09T21:13:48Z) - Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate to solve a joint optimization problem.
Our algorithm is based on a new concentration technique, and we bound its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.