A Competitive Algorithm for Agnostic Active Learning
- URL: http://arxiv.org/abs/2310.18786v3
- Date: Wed, 22 May 2024 18:58:01 GMT
- Title: A Competitive Algorithm for Agnostic Active Learning
- Authors: Eric Price, Yihan Zhou
- Abstract summary: The most popular algorithms for active learning express their performance in terms of a parameter called the disagreement coefficient.
We get an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$.
It is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.
- Score: 5.4579367210379335
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: For some hypothesis classes and input distributions, active agnostic learning needs exponentially fewer samples than passive learning; for other classes and distributions, it offers little to no improvement. The most popular algorithms for agnostic active learning express their performance in terms of a parameter called the disagreement coefficient, but it is known that these algorithms are inefficient on some inputs. We take a different approach to agnostic active learning, getting an algorithm that is competitive with the optimal algorithm for any binary hypothesis class $H$ and distribution $D_X$ over $X$. In particular, if any algorithm can use $m^*$ queries to get $O(\eta)$ error, then our algorithm uses $O(m^* \log |H|)$ queries to get $O(\eta)$ error. Our algorithm lies in the vein of the splitting-based approach of Dasgupta [2004], which gets a similar result for the realizable ($\eta = 0$) setting. We also show that it is NP-hard to do better than our algorithm's $O(\log |H|)$ overhead in general.
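In the spirit of that splitting-based approach, here is a minimal toy sketch: it greedily queries the pool point whose label most evenly splits a weighted finite hypothesis class, with a soft multiplicative update to tolerate agnostic noise. The names, query rule, and update are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def splitting_style_learner(pool_labels, oracle, budget, eta=0.1):
    """Toy splitting-style active learner over a finite class.

    pool_labels: (|H|, |X|) 0/1 array; row h holds hypothesis h's labels
    on the unlabeled pool. oracle(i) returns the (possibly noisy) label
    of pool point i. Illustrative only -- not the paper's algorithm.
    """
    w = np.ones(pool_labels.shape[0])         # weights over hypotheses
    for _ in range(budget):
        p1 = (w @ pool_labels) / w.sum()      # weighted P[label = 1] per point
        i = int(np.argmin(np.abs(p1 - 0.5)))  # point that best splits the class
        y = oracle(i)
        w[pool_labels[:, i] != y] *= eta / (1 - eta)  # soft downweighting
    return int(np.argmax(w))                  # index of the surviving hypothesis
```

With $\eta = 0$ the update eliminates disagreeing hypotheses outright, recovering a hard version-space split as in the realizable setting.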
Related papers
- A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
Each point $x$ has a fairness radius $\delta(x)$, the radius of the smallest ball around $x$ containing at least $n/k$ points; a clustering is then called individually fair if it has a center within distance $\delta(x)$ of each $x \in P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
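For concreteness, a small sketch of the fairness check itself, assuming the standard radius where $\delta(x)$ is the distance from $x$ to its $\lceil n/k \rceil$-th nearest neighbor; the function names are ours, and this is not the paper's scalable algorithm.

```python
import numpy as np

def fairness_radii(P, k):
    """delta(x): distance from each x to its ceil(n/k)-th nearest neighbor."""
    n = len(P)
    r = min(int(np.ceil(n / k)), n - 1)
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    return np.sort(D, axis=1)[:, r]          # column 0 is the point itself

def is_individually_fair(P, centers, k):
    """True iff every x in P has some center within delta(x) of it."""
    delta = fairness_radii(P, k)
    d = np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=-1)
    return bool(np.all(d.min(axis=1) <= delta))
```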
arXiv Detail & Related papers (2024-02-09T19:01:48Z)
- Agnostic Membership Query Learning with Nontrivial Savings: New Results, Techniques [0.0]
We consider learning with membership queries for classes at the frontier of learning.
This approach is inspired by, and continues, the study of "learning with nontrivial savings."
We establish agnostic learning algorithms for circuits consisting of a sublinear number of gates.
arXiv Detail & Related papers (2023-11-11T23:46:48Z)
- Robust Approximation Algorithms for Non-monotone $k$-Submodular Maximization under a Knapsack Constraint [0.0]
Two deterministic approximation algorithms are presented for non-monotone $k$-submodular maximization under a knapsack constraint.
Our algorithms achieve constant approximation ratios using only $O(nk)$ queries for the non-monotone objective.
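As a point of comparison only, a naive cost-ratio greedy for this setting is sketched below. The names and the value oracle `f` are illustrative; it is not the paper's algorithm, and it spends $O(n^2 k)$ rather than $O(nk)$ queries.

```python
def greedy_k_submodular_knapsack(items, k, costs, budget, f):
    """Naive cost-ratio greedy for k-submodular maximization under a
    knapsack constraint. f maps a partial assignment {item: type} to a
    value; a hypothetical baseline, NOT the paper's algorithm."""
    assign, spent = {}, 0.0
    while True:
        base = f(assign)
        best, best_ratio = None, 0.0
        for e in items:
            if e in assign or spent + costs[e] > budget:
                continue
            for t in range(k):
                ratio = (f({**assign, e: t}) - base) / costs[e]
                if ratio > best_ratio:
                    best, best_ratio = (e, t), ratio
        if best is None:
            return assign
        assign[best[0]] = best[1]
        spent += costs[best[0]]
```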
arXiv Detail & Related papers (2023-09-21T12:42:52Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Agnostic proper learning of monotone functions: beyond the black-box correction barrier [6.47243430672461]
Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone.
We also give an algorithm for estimating, up to additive error $\varepsilon$, the distance of an unknown function $f$ to monotone, with a run-time of $2^{\tilde{O}(\sqrt{n})}$.
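To make the quantity concrete, distance to monotone can be computed exactly by brute force for tiny $n$. This is an illustrative baseline with our own encoding, and it scales nothing like the subexponential estimator above.

```python
from itertools import product

def dist_to_monotone(f_vals, n):
    """Exact distance to monotone, by enumerating all monotone functions
    on {0,1}^n (feasible only for n <= 4 or so). f_vals[x] is f's value
    at the point whose bits encode x; we use {0,1} in place of {+-1}."""
    N = 1 << n
    def is_monotone(g):
        # flipping any 0-bit of x to 1 must never decrease g
        return all(g[x] <= g[x | (1 << i)] for x in range(N) for i in range(n))
    best = N
    for g in product((0, 1), repeat=N):
        if is_monotone(g):
            best = min(best, sum(a != b for a, b in zip(f_vals, g)))
    return best / N          # fraction of points where f must change
```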
arXiv Detail & Related papers (2023-04-05T18:52:10Z)
- Provably Efficient Reinforcement Learning via Surprise Bound [66.15308700413814]
We propose a provably efficient reinforcement learning algorithm (both computationally and statistically) with general value function approximations.
Our algorithm achieves reasonable regret bounds when applied to both the linear setting and the sparse high-dimensional linear setting.
arXiv Detail & Related papers (2023-02-22T20:21:25Z)
- Differentially-Private Hierarchical Clustering with Provable Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: any $\epsilon$-DP algorithm must exhibit $O(|V|^2/\epsilon)$-additive error for an input dataset $V$.
We propose a private $(1+o(1))$-approximation algorithm which also recovers the blocks exactly.
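For context, the simplest baseline in this space is input perturbation: privatize each pairwise similarity with Laplace noise, then run any non-private hierarchical clustering on the result. This is a generic sketch under an assumed per-edge sensitivity, not the paper's more refined algorithm.

```python
import numpy as np

def privatize_similarities(W, epsilon, sensitivity=1.0):
    """Add Laplace(sensitivity/epsilon) noise to each pairwise similarity
    (edge-level DP under the assumed per-edge sensitivity); any
    non-private hierarchical clustering can then run on the output."""
    n = W.shape[0]
    noise = np.triu(np.random.laplace(scale=sensitivity / epsilon,
                                      size=(n, n)), 1)
    noisy = W + noise + noise.T   # symmetrize: one noise draw per edge
    np.fill_diagonal(noisy, 0.0)
    return noisy
```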
arXiv Detail & Related papers (2023-01-31T19:14:30Z)
- Choosing the Right Algorithm With Hints From Complexity Theory [16.33500498939925]
We show that the Metropolis algorithm is clearly the best of all the algorithms considered, for reasonable problem sizes.
An artificial algorithm of this type with an $O(n \log n)$ runtime leads to the result that the significance-based compact genetic algorithm (sig-cGA) can solve the DLB problem in time $O(n \log n)$ with high probability.
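For reference, one standard parameterization of the Metropolis algorithm on bit strings accepts a fitness loss of $\delta$ with probability $\alpha^{-\delta}$ for a parameter $\alpha > 1$. The sketch below is generic; the DLB fitness and tuning in the paper are specific to that study.

```python
import random

def metropolis(fitness, n, alpha, steps):
    """Metropolis on bit strings (maximization): always accept
    improvements; accept a fitness loss of delta with prob alpha**(-delta)."""
    x = [random.randrange(2) for _ in range(n)]
    fx = fitness(x)
    for _ in range(steps):
        i = random.randrange(n)
        x[i] ^= 1                   # propose: flip one uniformly random bit
        fy = fitness(x)
        delta = fx - fy             # > 0 means the proposal is worse
        if delta <= 0 or random.random() < alpha ** (-delta):
            fx = fy                 # accept the move
        else:
            x[i] ^= 1               # reject: undo the flip
    return x, fx
```

For example, `metropolis(sum, n=50, alpha=2.0, steps=10_000)` maximizes OneMax.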
arXiv Detail & Related papers (2021-09-14T11:12:32Z)
- Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)