Corruption Robust Active Learning
- URL: http://arxiv.org/abs/2106.11220v1
- Date: Mon, 21 Jun 2021 16:06:38 GMT
- Title: Corruption Robust Active Learning
- Authors: Yifang Chen, Simon S. Du, Kevin Jamieson
- Abstract summary: We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.
We show that, in a benign corruption setting, the classical RobustCAL framework can achieve nearly the same label complexity guarantee as in the non-corrupted setting.
We propose a new algorithm which is provably correct without any assumptions on the presence of corruptions.
- Score: 43.279169081740726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We conduct theoretical studies on streaming-based active learning for binary
classification under unknown adversarial label corruptions. In this setting,
every time before the learner observes a sample, the adversary decides whether
to corrupt the label or not. First, we show that, in a benign corruption
setting (which includes the misspecification setting as a special case), with a
slight enlargement on the hypothesis elimination threshold, the classical
RobustCAL framework can (surprisingly) achieve nearly the same label complexity
guarantee as in the non-corrupted setting. However, this algorithm can fail in
the general corruption setting. To resolve this drawback, we propose a new
algorithm which is provably correct without any assumptions on the presence of
corruptions. Furthermore, this algorithm enjoys the minimax label complexity in
the non-corrupted setting (which is achieved by RobustCAL) and only requires
$\tilde{\mathcal{O}}(C_{\mathrm{total}})$ additional labels in the corrupted
setting to achieve $\mathcal{O}(\varepsilon + \frac{C_{\mathrm{total}}}{n})$,
where $\varepsilon$ is the target accuracy, $C_{\mathrm{total}}$ is the total
number of corruptions and $n$ is the total number of unlabeled samples.
Related papers
- Unconstrained Robust Online Convex Optimization [38.9652176513385]
We focus on the difficult unconstrained'' setting in which our algorithm must maintain low regret.<n>Our algorithms guarantee regret $ |u|G (sqrtT + k) $ when $G ge max_t |g_t|$ is known, where $k$ is a measure of the total amount of corruption.
arXiv Detail & Related papers (2025-06-15T09:21:15Z) - Bandit Multiclass List Classification [28.483435983018616]
We study the problem of multiclass list classification with (semi-)bandit feedback, where input examples are mapped into subsets of size $m$ of a collection of $K$ possible labels.
Our main result is for the $(varepsilon,delta)$-PAC variant of the problem for which we design an algorithm that returns an $varepsilon$-optimal hypothesis.
arXiv Detail & Related papers (2025-02-13T12:13:25Z) - Corruption-Robust Offline Reinforcement Learning with General Function
Approximation [60.91257031278004]
We investigate the problem of corruption in offline reinforcement learning (RL) with general function approximation.
Our goal is to find a policy that is robust to such corruption and minimizes the suboptimality gap with respect to the optimal policy for the uncorrupted Markov decision processes (MDPs)
arXiv Detail & Related papers (2023-10-23T04:07:26Z) - Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear
Contextual Bandits and Markov Decision Processes [59.61248760134937]
We propose an efficient algorithm to achieve a regret of $tildeO(sqrtT+zeta)$.
The proposed algorithm relies on the recently developed uncertainty-weighted least-squares regression from linear contextual bandit.
We generalize our algorithm to the episodic MDP setting and first achieve an additive dependence on the corruption level $zeta$.
arXiv Detail & Related papers (2022-12-12T15:04:56Z) - Minimax Rates for Robust Community Detection [19.229475414802213]
We study the problem of community detection in the block model with adversarial node corruptions.
Our main result is an efficient algorithm that can tolerate an $epsilon$-fraction of corruptions and unbounded error $O(epsilon) + e-fracC2 (1 pm o(1))$ where $C = (sqrta - sqrtb)2$ is the signal-to-noise ratio.
We show that our algorithms are doubly-robust in the sense that they work in an even more
arXiv Detail & Related papers (2022-07-25T04:45:16Z) - Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial
Corruptions [98.75618795470524]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We propose a new algorithm based on the principle of optimism in the face of uncertainty.
arXiv Detail & Related papers (2022-05-13T17:58:58Z) - A Robust Phased Elimination Algorithm for Corruption-Tolerant Gaussian
Process Bandits [118.22458816174144]
We propose a novel robust elimination-type algorithm that runs in epochs, combines exploration with infrequent switching to select a small subset of actions, and plays each action for multiple time instants.
Our algorithm, GP Robust Phased Elimination (RGP-PE), successfully balances robustness to corruptions with exploration and exploitation.
We perform the first empirical study of robustness in the corrupted GP bandit setting, and show that our algorithm is robust against a variety of adversarial attacks.
arXiv Detail & Related papers (2022-02-03T21:19:36Z) - Linear Contextual Bandits with Adversarial Corruptions [91.38793800392108]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We present a variance-aware algorithm that is adaptive to the level of adversarial contamination $C$.
arXiv Detail & Related papers (2021-10-25T02:53:24Z) - Improved Corruption Robust Algorithms for Episodic Reinforcement
Learning [43.279169081740726]
We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.
We propose new algorithms which, compared to the existing results, achieve strictly better regret bounds in terms of total corruptions.
Our results follow from a general algorithmic framework that combines corruption-robust policy elimination meta-algorithms, and plug-in reward-free exploration sub-algorithms.
arXiv Detail & Related papers (2021-02-13T07:04:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.