Active Learning of Classifiers with Label and Seed Queries
- URL: http://arxiv.org/abs/2209.03996v1
- Date: Thu, 8 Sep 2022 18:46:23 GMT
- Title: Active Learning of Classifiers with Label and Seed Queries
- Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen
- Abstract summary: In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin $\gamma$ requires in the worst case $\Omega\big(1+\frac{1}{\gamma}\big)^{(m-1)/2}$ label queries.
We show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries.
- Score: 18.34182076906661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study exact active learning of binary and multiclass classifiers with
margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any
unknown classifier on $X$ whose classes have finite strong convex hull margin,
a new notion extending the SVM margin. In the standard active learning setting,
where only label queries are allowed, learning a classifier with strong convex
hull margin $\gamma$ requires in the worst case
$\Omega\big(1+\frac{1}{\gamma}\big)^{(m-1)/2}$ queries. On the other hand,
using the more powerful seed queries (a variant of equivalence queries), the
target classifier could be learned in $O(m \log n)$ queries via Littlestone's
Halving algorithm; however, Halving is computationally inefficient. In this
work we show that, by carefully combining the two types of queries, a binary
classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2
\log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries;
the result extends to $k$-class classifiers at the price of a $k!k^2$
multiplicative overhead. Similar results hold when the input points have
bounded bit complexity, or when only one class has strong convex hull margin
against the rest. We complement the upper bounds by showing that in the worst
case any algorithm needs $\Omega\big(k m \log \frac{1}{\gamma}\big)$ seed and
label queries to learn a $k$-class classifier with strong convex hull margin
$\gamma$.
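The abstract attributes the $O(m \log n)$ query bound to Littlestone's Halving algorithm. As a rough illustration of the version-space bisection idea behind Halving, here is a generic Python sketch that uses only label queries over a finite hypothesis class; it is our illustration, not the paper's seed-query algorithm, and the names (`halving_learn`, `label_oracle`) are hypothetical.

```python
def halving_learn(hypotheses, points, label_oracle):
    """Greedy version-space bisection in the spirit of Littlestone's Halving:
    query the point on which the surviving hypotheses are most split, then
    discard every hypothesis inconsistent with the returned label."""
    version_space = list(hypotheses)
    num_queries = 0

    def split_size(x):
        # Number of surviving hypotheses on the minority side at point x.
        votes = [h(x) for h in version_space]
        return len(votes) - max(votes.count(v) for v in set(votes))

    while len(version_space) > 1:
        x = max(points, key=split_size)
        if split_size(x) == 0:              # all survivors agree on every point
            break
        y = label_oracle(x)                 # one label query
        num_queries += 1
        version_space = [h for h in version_space if h(x) == y]
    return version_space[0], num_queries

# Toy usage: exactly recover a 1-D threshold among 17 candidates.
points = list(range(16))
hypotheses = [lambda x, t=t: int(x >= t) for t in range(17)]
target = hypotheses[9]
learned, queries = halving_learn(hypotheses, points, label_oracle=target)
print(learned is target, queries)           # True, roughly log2(17) label queries
```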
Related papers
- Clustering with Non-adaptive Subset Queries [16.662507957069813]
Given a query $S \subset U$, $|S|=2$, the oracle returns yes if the points are in the same cluster and no otherwise.
For adaptive algorithms with pair-wise queries, the number of required queries is known to be $\Theta(nk)$.
Non-adaptive schemes require $\Omega(n^2)$ queries, which matches the trivial $O(n^2)$ upper bound attained by querying every pair of points.
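For intuition about the adaptive $\Theta(nk)$ bound quoted above, here is a minimal sketch of the standard adaptive strategy (our illustration, not this paper's non-adaptive scheme): compare each new point against one representative of every cluster found so far, spending at most $k$ same-cluster queries per point.

```python
def adaptive_cluster(points, same_cluster):
    """Adaptive clustering with pairwise same-cluster queries: compare each
    point against one representative per cluster found so far, so the total
    number of queries is at most n * k."""
    clusters = []                           # each cluster is a list of points
    for p in points:
        for c in clusters:
            if same_cluster(p, c[0]):       # one pairwise query
                c.append(p)
                break
        else:
            clusters.append([p])            # p opens a new cluster
    return clusters
```

With $k$ underlying clusters this issues at most $nk$ queries; the point of the paper's lower bound is that non-adaptive schemes cannot match this in general.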
arXiv Detail & Related papers (2024-09-17T05:56:07Z) - Online Learning of Halfspaces with Massart Noise [47.71073318490341]
We study the task of online learning in the presence of Massart noise.
We present a computationally efficient algorithm that achieves mistake bound $\eta T + o(T)$.
We use our Massart online learner to design an efficient bandit algorithm that obtains expected reward at least $(1-1/k)\Delta T - o(T)$ bigger than choosing a random action at every round.
arXiv Detail & Related papers (2024-05-21T17:31:10Z) - A quantum algorithm for learning a graph of bounded degree [1.8130068086063336]
We present an algorithm that learns the edges of $G$ in at most $\tilde{O}(d^2 m^{3/4})$ quantum queries.
In particular, we present a randomized algorithm that, with high probability, learns cycles and matchings in $\tilde{O}(\sqrt{m})$ quantum queries.
arXiv Detail & Related papers (2024-02-28T21:23:40Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Self-training Converts Weak Learners to Strong Learners in Mixture
Models [86.7137362125503]
We show that a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only $O(d)$ labeled examples.
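As a rough sketch of the self-training pipeline this abstract refers to (our illustration with made-up hyperparameters, not the paper's analysis): fit a pseudolabeler by gradient descent on the logistic loss using the few labeled examples, pseudolabel the unlabeled pool, then refit on everything.

```python
import numpy as np

def fit_logistic_gd(X, y, steps=500, lr=0.1):
    """Gradient descent on the logistic loss; labels y are in {-1, +1}."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ beta)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
        beta -= lr * grad
    return beta

def self_train(X_lab, y_lab, X_unlab):
    """Self-training: pseudolabel the unlabeled pool with an initial
    pseudolabeler, then refit on labeled + pseudolabeled data."""
    beta_pl = fit_logistic_gd(X_lab, y_lab)          # few labeled examples
    y_pseudo = np.sign(X_unlab @ beta_pl)            # pseudolabels
    y_pseudo[y_pseudo == 0] = 1
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    return fit_logistic_gd(X_all, y_all)             # final classifier
```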
arXiv Detail & Related papers (2021-06-25T17:59:16Z) - Semi-supervised Active Regression [21.51757844385258]
This paper studies the use of partially labelled, potentially biased data for learning tasks.
The learner has access to a dataset $X$ of points in $\mathbb{R}^d$ and must minimize $\| X\beta - Y \|_2^2$ over $\beta \in \mathbb{R}^d$ while making few additional label queries.
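One common way to decide which labels to query in such least-squares settings is leverage-score sampling; the sketch below is our generic illustration of that technique (with a hypothetical index-based `label_oracle`), not necessarily this paper's algorithm.

```python
import numpy as np

def leverage_score_regression(X, label_oracle, num_queries, seed=0):
    """Active least squares via leverage-score sampling: query labels only
    for rows sampled proportionally to their leverage, then solve a
    reweighted least-squares problem on the sampled rows."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Q, _ = np.linalg.qr(X)                      # thin QR; rows of Q give leverages
    leverage = np.sum(Q**2, axis=1)             # leverage score of each row
    probs = leverage / leverage.sum()
    idx = rng.choice(n, size=num_queries, replace=True, p=probs)
    y = np.array([label_oracle(i) for i in idx])    # the only label queries made
    w = 1.0 / np.sqrt(num_queries * probs[idx])     # importance weights
    beta, *_ = np.linalg.lstsq(w[:, None] * X[idx], w * y, rcond=None)
    return beta
```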
arXiv Detail & Related papers (2021-06-12T03:28:43Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^d$, given access to $A \in \mathbb{R}^{d \times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
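The assumption concerns how much of $A$'s spectral mass sits in its top $k$ singular values; a quick way to inspect this on a concrete matrix (our illustrative check, not part of the paper's algorithm) is:

```python
import numpy as np

def top_k_spectral_mass(A, k):
    """Fraction of the squared Frobenius norm of A captured by its
    top-k singular values; values near 1 mean A is close to rank k."""
    s = np.linalg.svd(A, compute_uv=False)      # singular values, descending
    return np.sum(s[:k]**2) / np.sum(s**2)

# Example: a random rank-5 matrix plus small noise has mass close to 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 200))
A += 0.01 * rng.standard_normal(A.shape)
print(top_k_spectral_mass(A, k=5))
```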
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Active Local Learning [22.823171570933397]
We consider active local learning: given a query point $x$, and active access to an unlabeled training set $S$, output the prediction $h(x)$ of a near-optimal $h \in H$.
In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$.
This immediately also implies an algorithm for distance estimation: estimating the value $\mathrm{opt}(H)$ from many fewer labels than needed to actually learn a near-optimal $h \in H$.
arXiv Detail & Related papers (2020-08-31T05:39:35Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimension $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
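Model-free here means values are estimated directly from sampled transitions without building a transition model; a generic tabular Q-learning loop illustrates the idea (our sketch, assuming a hypothetical `env` with `reset()` and `step(a)` returning `(next_state, reward, done)`; it is not this paper's algorithm).

```python
import numpy as np

def q_learning(env, S, A, gamma, episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Generic tabular Q-learning: update Q from sampled transitions only,
    never estimating the transition probabilities themselves (model-free)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((S, A))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            a = rng.integers(A) if rng.random() < eps else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return np.argmax(Q, axis=1)    # greedy policy w.r.t. the learned Q
```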
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.