Online Sign Identification: Minimization of the Number of Errors in
Thresholding Bandits
- URL: http://arxiv.org/abs/2110.09133v1
- Date: Mon, 18 Oct 2021 09:36:36 GMT
- Title: Online Sign Identification: Minimization of the Number of Errors in
Thresholding Bandits
- Authors: Reda Ouhamma, R\'emy Degenne, Pierre Gaillard, Vianney Perchet
- Abstract summary: We introduce a large family of algorithms inspired by the Frank-Wolfe algorithm.
We construct new explicit algorithms for a broad class of problems.
We explain this phenomenon on an insightful toy problem.
- Score: 27.09804256642197
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In the fixed budget thresholding bandit problem, an algorithm sequentially
allocates a budgeted number of samples to different distributions. It then
predicts whether the mean of each distribution is larger or lower than a given
threshold. We introduce a large family of algorithms (containing most existing
relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough
yet generic analysis of their performance. This allowed us to construct new
explicit algorithms, for a broad class of problems, whose losses are within a
small constant factor of the non-adaptive oracle ones. Quite interestingly, we
observed that adaptive methods empirically greatly out-perform non-adaptive
oracles, an uncommon behavior in standard online learning settings, such as
regret minimization. We explain this surprising phenomenon on an insightful toy
problem.
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Agnostic Multi-Robust Learning Using ERM [19.313739782029185]
A fundamental problem in robust learning is asymmetry: a learner needs to correctly classify every one of exponentially-many perturbations that an adversary might make to a test-time natural example.
In contrast, the attacker only needs to find one successful perturbation.
We introduce a novel multi-group setting and introduce a novel multi-robust learning problem.
arXiv Detail & Related papers (2023-03-15T21:30:14Z) - Adaptive Federated Minimax Optimization with Lower Complexities [82.51223883622552]
We propose an efficient adaptive minimax optimization algorithm (i.e., AdaFGDA) to solve these minimax problems.
It builds our momentum-based reduced and localSGD techniques, and it flexibly incorporate various adaptive learning rates.
arXiv Detail & Related papers (2022-11-14T12:32:18Z) - Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively-studied linear models, but in presence of missing values, which turns out to be quite a challenging task.
This eventually requires to solve a number of learning tasks, exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z) - Machine Learning for Online Algorithm Selection under Censored Feedback [71.6879432974126]
In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms.
For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime.
In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem.
We adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon.
arXiv Detail & Related papers (2021-09-13T18:10:52Z) - Efficient First-Order Contextual Bandits: Prediction, Allocation, and
Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z) - Improved Algorithms for Agnostic Pool-based Active Classification [20.12178157010804]
We consider active learning for binary classification in the agnostic pool-based setting.
Our algorithm is superior to state of the art active learning algorithms on image classification datasets.
arXiv Detail & Related papers (2021-05-13T18:24:30Z) - Nonparametric adaptive active learning under local smoothness condition [0.76146285961466]
This paper adresses the problem of adaptive active learning in a nonparametric setting with minimal assumptions.
We present a novel algorithm that is valid under more general assumptions than the previously known algorithms.
Our algorithm achieves a minimax rate of convergence, and therefore performs almost as well as the best known non-adaptive algorithms.
arXiv Detail & Related papers (2021-02-22T14:47:21Z) - Thompson Sampling Algorithms for Mean-Variance Bandits [97.43678751629189]
We develop Thompson Sampling-style algorithms for mean-variance MAB.
We also provide comprehensive regret analyses for Gaussian and Bernoulli bandits.
Our algorithms significantly outperform existing LCB-based algorithms for all risk tolerances.
arXiv Detail & Related papers (2020-02-01T15:33:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.