Related papers: Efficient Algorithms for Extreme Bandits

Efficient Algorithms for Extreme Bandits

URL: http://arxiv.org/abs/2203.10883v1
Date: Mon, 21 Mar 2022 11:09:34 GMT
Title: Efficient Algorithms for Extreme Bandits
Authors: Dorian Baudry, Yoan Russac, Emilie Kaufmann
Abstract summary: We contribute to the Extreme Bandit problem, a variant of Multi-Armed Bandits in which the learner seeks to collect the largest possible reward. We first study the concentration of the maximum of i.i.d random variables under mild assumptions on the tail of the rewards distributions. We then propose and analyze a more adaptive, anytime algorithm, QoMax-SDA, which combines QoMax with a subsampling method recently introduced by Baudry et al.
Score: 20.68824391770909
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we contribute to the Extreme Bandit problem, a variant of Multi-Armed Bandits in which the learner seeks to collect the largest possible reward. We first study the concentration of the maximum of i.i.d random variables under mild assumptions on the tail of the rewards distributions. This analysis motivates the introduction of Quantile of Maxima (QoMax). The properties of QoMax are sufficient to build an Explore-Then-Commit (ETC) strategy, QoMax-ETC, achieving strong asymptotic guarantees despite its simplicity. We then propose and analyze a more adaptive, anytime algorithm, QoMax-SDA, which combines QoMax with a subsampling method recently introduced by Baudry et al. (2021). Both algorithms are more efficient than existing approaches in two aspects (1) they lead to better empirical performance (2) they enjoy a significant reduction of the memory and time complexities.

Related papers

Adaptive Sampled Softmax with Inverted Multi-Index: Methods, Theory and Applications [79.53938312089308]
The MIDX-Sampler is a novel adaptive sampling strategy based on an inverted multi-index approach. Our method is backed by rigorous theoretical analysis, addressing key concerns such as sampling bias, gradient bias, convergence rates, and generalization error bounds.
arXiv Detail & Related papers (2025-01-15T04:09:21Z)
Runtime Analysis of a Multi-Valued Compact Genetic Algorithm on Generalized OneMax [2.07180164747172]
We provide a first runtime analysis of a generalized OneMax function. We show that the r-cGA solves this r-valued OneMax problem efficiently. At the end of experiments, we state one conjecture related to the expected runtime of another variant of multi-valued OneMax function.
arXiv Detail & Related papers (2024-04-17T10:40:12Z)
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints. We present two new variants of the algorithms for minimization of the finite-sum gradient.
arXiv Detail & Related papers (2023-04-23T20:05:09Z)
Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning [18.762198598488066]
Multi-agent reinforcement learning (MARL) requires agents to explore within a vast joint action space. EMAX is a framework to seamlessly extend value-based MARL algorithms.
arXiv Detail & Related papers (2023-02-07T12:51:20Z)
Multi-block Min-max Bilevel Optimization with Applications in Multi-task Deep AUC Maximization [36.743632418094]
We present a single-loop algorithm to tackle a multiblock min-max bilevel optimization problem. We show that our algorithm can be used to tackle problems with hundreds of tasks.
arXiv Detail & Related papers (2022-06-01T06:42:36Z)
QAOA-in-QAOA: solving large-scale MaxCut problems on small quantum machines [81.4597482536073]
Quantum approximate optimization algorithms (QAOAs) utilize the power of quantum machines and inherit the spirit of adiabatic evolution. We propose QAOA-in-QAOA ($textQAOA2$) to solve arbitrary large-scale MaxCut problems using quantum machines. Our method can be seamlessly embedded into other advanced strategies to enhance the capability of QAOAs in large-scale optimization problems.
arXiv Detail & Related papers (2022-05-24T03:49:10Z)
Distributed Statistical Min-Max Learning in the Presence of Byzantine Agents [34.46660729815201]
We consider a multi-agent min-max learning problem, and focus on the emerging challenge of contending with Byzantine adversarial agents. Our main contribution is to provide a crisp analysis of the proposed robust extra-gradient algorithm for smooth convex-concave and smooth strongly convex-strongly concave functions. Our rates are near-optimal, and reveal both the effect of adversarial corruption and the benefit of collaboration among the non-faulty agents.
arXiv Detail & Related papers (2022-04-07T03:36:28Z)
Predicting parameters for the Quantum Approximate Optimization Algorithm for MAX-CUT from the infinite-size limit [0.05076419064097732]
We present an explicit algorithm to evaluate the performance of QAOA on MAX-CUT applied to random Erdos-Renyi graphs of expected degree $d$. This analysis yields an explicit mapping between QAOA parameters for MAX-CUT on Erdos-Renyi graphs, and the Sherrington-Kirkpatrick model.
arXiv Detail & Related papers (2021-10-20T17:58:53Z)
Minimax Optimization with Smooth Algorithmic Adversaries [59.47122537182611]
We propose a new algorithm for the min-player against smooth algorithms deployed by an adversary. Our algorithm is guaranteed to make monotonic progress having no limit cycles, and to find an appropriate number of gradient ascents.
arXiv Detail & Related papers (2021-06-02T22:03:36Z)
Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a hard thresholding algorithm for AUC in distributiond classification. We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-concave min-max problems, which have received increasing attention due to deep learning (e.g., deep AUC)
arXiv Detail & Related papers (2020-06-12T00:32:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.