Active Learning of Classifiers with Label and Seed Queries
- URL: http://arxiv.org/abs/2209.03996v1
- Date: Thu, 8 Sep 2022 18:46:23 GMT
- Title: Active Learning of Classifiers with Label and Seed Queries
- Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen
- Abstract summary: In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin $\gamma$ requires in the worst case $\Omega\big(1+\frac{1}{\gamma}\big)^{(m-1)/2}$ label queries.
We show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries.
- Score: 18.34182076906661
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study exact active learning of binary and multiclass classifiers with
margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any
unknown classifier on $X$ whose classes have finite strong convex hull margin,
a new notion extending the SVM margin. In the standard active learning setting,
where only label queries are allowed, learning a classifier with strong convex
hull margin $\gamma$ requires in the worst case
$\Omega\big(1+\frac{1}{\gamma}\big)^{(m-1)/2}$ queries. On the other hand,
using the more powerful seed queries (a variant of equivalence queries), the
target classifier could be learned in $O(m \log n)$ queries via Littlestone's
Halving algorithm; however, Halving is computationally inefficient. In this
work we show that, by carefully combining the two types of queries, a binary
classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2
\log n)$ label queries and $O\big(m \log \frac{m}{\gamma}\big)$ seed queries;
the result extends to $k$-class classifiers at the price of a $k!k^2$
multiplicative overhead. Similar results hold when the input points have
bounded bit complexity, or when only one class has strong convex hull margin
against the rest. We complement the upper bounds by showing that in the worst
case any algorithm needs $\Omega\big(k m \log \frac{1}{\gamma}\big)$ seed and
label queries to learn a $k$-class classifier with strong convex hull margin
$\gamma$.
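The abstract attributes the $O(m \log n)$ query bound to Littlestone's Halving algorithm. As a rough illustration of the version-space bisection idea behind Halving, here is a generic Python sketch that uses only label queries over a finite hypothesis class; it is our illustration, not the paper's seed-query algorithm, and the names (`halving_learn`, `label_oracle`) are hypothetical.

```python
def halving_learn(hypotheses, points, label_oracle):
    """Greedy version-space bisection in the spirit of Littlestone's Halving:
    query the point on which the surviving hypotheses are most split, then
    discard every hypothesis inconsistent with the returned label."""
    version_space = list(hypotheses)
    num_queries = 0

    def split_size(x):
        # Number of surviving hypotheses on the minority side at point x.
        votes = [h(x) for h in version_space]
        return len(votes) - max(votes.count(v) for v in set(votes))

    while len(version_space) > 1:
        x = max(points, key=split_size)
        if split_size(x) == 0:              # all survivors agree on every point
            break
        y = label_oracle(x)                 # one label query
        num_queries += 1
        version_space = [h for h in version_space if h(x) == y]
    return version_space[0], num_queries

# Toy usage: exactly recover a 1-D threshold among 17 candidates.
points = list(range(16))
hypotheses = [lambda x, t=t: int(x >= t) for t in range(17)]
target = hypotheses[9]
learned, queries = halving_learn(hypotheses, points, label_oracle=target)
print(learned is target, queries)           # True, roughly log2(17) label queries
```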
Related papers
- Clustering with Non-adaptive Subset Queries [16.662507957069813]
Given a query $S \subset U$, $|S|=2$, the oracle returns yes if the points are in the same cluster and no otherwise.
For adaptive algorithms with pair-wise queries, the number of required queries is known to be $\Theta(nk)$.
Non-adaptive schemes require $\Omega(n^2)$ queries, which matches the trivial $O(n^2)$ upper bound attained by querying every pair of points.
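For intuition about the adaptive $\Theta(nk)$ bound quoted above, here is a minimal sketch of the standard adaptive strategy (our illustration, not this paper's non-adaptive scheme): compare each new point against one representative of every cluster found so far, spending at most $k$ same-cluster queries per point.

```python
def adaptive_cluster(points, same_cluster):
    """Adaptive clustering with pairwise same-cluster queries: compare each
    point against one representative per cluster found so far, so the total
    number of queries is at most n * k."""
    clusters = []                           # each cluster is a list of points
    for p in points:
        for c in clusters:
            if same_cluster(p, c[0]):       # one pairwise query
                c.append(p)
                break
        else:
            clusters.append([p])            # p opens a new cluster
    return clusters
```

With $k$ underlying clusters this issues at most $nk$ queries; the point of the paper's lower bound is that non-adaptive schemes cannot match this in general.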
arXiv Detail & Related papers (2024-09-17T05:56:07Z) - Online Learning of Halfspaces with Massart Noise [47.71073318490341]
We study the task of online learning in the presence of Massart noise.
We present a computationally efficient algorithm that achieves mistake bound $\eta T + o(T)$.
We use our Massart online learner to design an efficient bandit algorithm that obtains expected reward at least $(1-1/k)\Delta T - o(T)$ bigger than choosing a random action at every round.
arXiv Detail & Related papers (2024-05-21T17:31:10Z) - A quantum algorithm for learning a graph of bounded degree [1.8130068086063336]
We present an algorithm that learns the edges of $G$ in at most $\tilde{O}(d^2 m^{3/4})$ quantum queries.
In particular, we present a randomized algorithm that, with high probability, learns cycles and matchings in $\tilde{O}(\sqrt{m})$ quantum queries.
arXiv Detail & Related papers (2024-02-28T21:23:40Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Self-training Converts Weak Learners to Strong Learners in Mixture
Models [86.7137362125503]
We show that a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only $O(d)$ labeled examples.
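As a rough sketch of the self-training pipeline this abstract refers to (our illustration with made-up hyperparameters, not the paper's analysis): fit a pseudolabeler by gradient descent on the logistic loss using the few labeled examples, pseudolabel the unlabeled pool, then refit on everything.

```python
import numpy as np

def fit_logistic_gd(X, y, steps=500, lr=0.1):
    """Gradient descent on the logistic loss; labels y are in {-1, +1}."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ beta)
        grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
        beta -= lr * grad
    return beta

def self_train(X_lab, y_lab, X_unlab):
    """Self-training: pseudolabel the unlabeled pool with an initial
    pseudolabeler, then refit on labeled + pseudolabeled data."""
    beta_pl = fit_logistic_gd(X_lab, y_lab)          # few labeled examples
    y_pseudo = np.sign(X_unlab @ beta_pl)            # pseudolabels
    y_pseudo[y_pseudo == 0] = 1
    X_all = np.vstack([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, y_pseudo])
    return fit_logistic_gd(X_all, y_all)             # final classifier
```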
arXiv Detail & Related papers (2021-06-25T17:59:16Z) - Semi-supervised Active Regression [21.51757844385258]
This paper studies the use of partially labelled, potentially biased data for learning tasks.
The learner has access to a dataset $X$ of points in $\mathbb{R}^d$ and must minimize $\| X\beta - Y \|_2^2$ over $\beta \in \mathbb{R}^d$ while making few additional label queries.
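One common way to decide which labels to query in such least-squares settings is leverage-score sampling; the sketch below is our generic illustration of that technique (with a hypothetical index-based `label_oracle`), not necessarily this paper's algorithm.

```python
import numpy as np

def leverage_score_regression(X, label_oracle, num_queries, seed=0):
    """Active least squares via leverage-score sampling: query labels only
    for rows sampled proportionally to their leverage, then solve a
    reweighted least-squares problem on the sampled rows."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Q, _ = np.linalg.qr(X)                      # thin QR; rows of Q give leverages
    leverage = np.sum(Q**2, axis=1)             # leverage score of each row
    probs = leverage / leverage.sum()
    idx = rng.choice(n, size=num_queries, replace=True, p=probs)
    y = np.array([label_oracle(i) for i in idx])    # the only label queries made
    w = 1.0 / np.sqrt(num_queries * probs[idx])     # importance weights
    beta, *_ = np.linalg.lstsq(w[:, None] * X[idx], w * y, rcond=None)
    return beta
```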
arXiv Detail & Related papers (2021-06-12T03:28:43Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^d$, given access to $A \in \mathbb{R}^{d \times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
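The assumption concerns how much of $A$'s spectral mass sits in its top $k$ singular values; a quick way to inspect this on a concrete matrix (our illustrative check, not part of the paper's algorithm) is:

```python
import numpy as np

def top_k_spectral_mass(A, k):
    """Fraction of the squared Frobenius norm of A captured by its
    top-k singular values; values near 1 mean A is close to rank k."""
    s = np.linalg.svd(A, compute_uv=False)      # singular values, descending
    return np.sum(s[:k]**2) / np.sum(s**2)

# Example: a random rank-5 matrix plus small noise has mass close to 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 200))
A += 0.01 * rng.standard_normal(A.shape)
print(top_k_spectral_mass(A, k=5))
```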
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Active Local Learning [22.823171570933397]
We consider active local learning: given a query point $x$, and active access to an unlabeled training set $S$, output the prediction $h(x)$ of a near-optimal $h \in H$.
In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$.
This immediately also implies an algorithm for distance estimation: estimating the value $\mathrm{opt}(H)$ from many fewer labels than needed to actually learn a near-optimal $h \in H$.
arXiv Detail & Related papers (2020-08-31T05:39:35Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimension $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
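Model-free here means values are estimated directly from sampled transitions without building a transition model; a generic tabular Q-learning loop illustrates the idea (our sketch, assuming a hypothetical `env` with `reset()` and `step(a)` returning `(next_state, reward, done)`; it is not this paper's algorithm).

```python
import numpy as np

def q_learning(env, S, A, gamma, episodes=5000, alpha=0.1, eps=0.1, seed=0):
    """Generic tabular Q-learning: update Q from sampled transitions only,
    never estimating the transition probabilities themselves (model-free)."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((S, A))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            a = rng.integers(A) if rng.random() < eps else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            target = r + (0.0 if done else gamma * np.max(Q[s_next]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s_next
    return np.argmax(Q, axis=1)    # greedy policy w.r.t. the learned Q
```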
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences.