Black-box Generalization of Machine Teaching
- URL: http://arxiv.org/abs/2206.15205v2
- Date: Wed, 20 Sep 2023 15:01:26 GMT
- Title: Black-box Generalization of Machine Teaching
- Authors: Xiaofeng Cao, Yaming Guo, Ivor W. Tsang, James T. Kwok
- Abstract summary: We introduce a black-box teaching hypothesis $h^\mathcal{T}$ employing a tighter slack term $\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$.
We prove that, under the guidance of this teaching hypothesis, the learner can converge to tighter generalization error and label complexity bounds.
- Score: 63.384268678926325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hypothesis pruning maximizes the hypothesis updates of active learning in
order to find the desired unlabeled data. An inherent assumption is that this
learning manner can drive those updates toward the optimal hypothesis. However,
convergence may not be well guaranteed if the incremental updates are negative
and disordered. In this paper, we introduce a black-box teaching
hypothesis $h^\mathcal{T}$ employing a tighter slack term
$\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$ to replace
the typical $2\Delta_t$ for pruning. Theoretically, we prove that, under the
guidance of this teaching hypothesis, the learner can converge to tighter
generalization error and label complexity bounds than non-educated learners
who receive no guidance from a teacher: 1) the generalization
error upper bound can be reduced from $R(h^*)+4\Delta_{T-1}$ to approximately
$R(h^{\mathcal{T}})+2\Delta_{T-1}$, and 2) the label complexity upper bound can
be decreased from $4 \theta\left(TR(h^{*})+2O(\sqrt{T})\right)$ to
approximately $2\theta\left(2TR(h^{\mathcal{T}})+3 O(\sqrt{T})\right)$. To keep
our assumption strict, self-improvement of teaching is first proposed for the
case where $h^\mathcal{T}$ only loosely approximates $h^*$. On the teaching side,
we further consider two scenarios: teaching a white-box learner and teaching a
black-box learner. Experiments verify this idea and show better generalization
performance than fundamental active learning strategies such as IWAL and IWAL-D.
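
The pruning rule can be sketched directly. Below is a minimal, illustrative Python sketch (not the authors' code) of version-space pruning in which the usual $2\Delta_t$ slack is replaced by $\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$; the threshold hypothesis pool, the form of $\Delta_t$, and the use of empirical disagreement with $h^\mathcal{T}$ as a stand-in for $\mathcal{F}^{\mathcal{T}}$ are assumptions made only for illustration.

```python
# Minimal sketch (not the authors' implementation) of hypothesis pruning with a
# teacher-guided slack term: candidates are kept while their empirical error is
# within (1 + F^T(h_hat_t)) * Delta_t of the best, instead of the usual
# 2 * Delta_t. The threshold-classifier pool, the form of Delta_t, and the use
# of empirical disagreement with h^T as a stand-in for F^T are all assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy pool of threshold classifiers on [0, 1]: h_c(x) = 1[x >= c].
thresholds = np.linspace(0.0, 1.0, 21)

def predict(c, x):
    return (x >= c).astype(int)

def empirical_error(c, xs, ys):
    return float(np.mean(predict(c, xs) != ys))

def slack(t, pool_size, delta=0.05):
    # Generic Delta_t of order sqrt(log(|H| t / delta) / t); illustrative only.
    return np.sqrt(np.log(pool_size * (t + 1) / delta) / (t + 1))

target_c, teacher_c = 0.52, 0.50       # hypothetical target and teacher h^T
version_space = thresholds.copy()
xs, ys = [], []
for t in range(1, 201):
    x = rng.random()
    xs.append(x)
    ys.append(int(x >= target_c))      # label query (active-learning oracle)
    X, Y = np.array(xs), np.array(ys)

    errs = np.array([empirical_error(c, X, Y) for c in version_space])
    best = version_space[int(np.argmin(errs))]
    d_t = slack(t, len(thresholds))

    # F^T(h_hat_t): assumed here to be disagreement of h_hat_t with teacher h^T.
    f_teacher = empirical_error(best, X, predict(teacher_c, X))

    # Teacher-guided pruning: slack (1 + F^T) * Delta_t replaces 2 * Delta_t.
    version_space = version_space[errs <= errs.min() + (1.0 + f_teacher) * d_t]

print("surviving hypotheses:", version_space)
```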
Related papers
- On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm [54.28350823319057]
This paper establishes the convergence rate $\frac{1}{K}\sum_{k=1}^{K}\mathbb{E}\left[\|\nabla f(x^k)\|_1\right]\leq O\left(\frac{\sqrt{d}C}{K^{1/4}}\right)$ for AdamW measured by the $\ell_1$ norm, where $K$ represents the iteration number, $d$ denotes the model dimension, and $C$ matches the constant in the optimal convergence rate of SGD.
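
As a concrete illustration of the quantity being bounded, the sketch below (assumed, not from the paper) tracks the running average of the $\ell_1$ gradient norm while AdamW optimizes a toy quadratic; the problem, learning rate, and iteration count are arbitrary illustrative choices.

```python
# Illustrative sketch: track (1/K) * sum_k ||grad f(x_k)||_1 while running
# AdamW on a toy quadratic objective. Problem size, learning rate, and the
# number of iterations are arbitrary choices made for illustration.
import torch

torch.manual_seed(0)
d, K = 50, 2000
A = torch.randn(d, d) / d ** 0.5
b = torch.randn(d)
x = torch.zeros(d, requires_grad=True)

opt = torch.optim.AdamW([x], lr=1e-2, weight_decay=0.0)
l1_norms = []
for k in range(K):
    opt.zero_grad()
    loss = 0.5 * (A @ x - b).pow(2).sum()
    loss.backward()
    l1_norms.append(x.grad.abs().sum().item())   # ||grad f(x_k)||_1
    opt.step()

print(f"average l1 gradient norm over {K} iterations:", sum(l1_norms) / K)
```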
arXiv Detail & Related papers (2025-05-17T05:02:52Z) - On Agnostic PAC Learning in the Small Error Regime [4.422219522591412]
Empirical Risk Minimization learners are suboptimal in the realizable case yet optimal in the agnostic case.
Work of Hanneke, Larsen, and Zhivotovskiy addresses this shortcoming by including $\tau$ as a parameter in the error term.
We show that our learner matches, up to a constant factor $c \leq 2.1$, the error lower bound $\tau + \Omega\left(\sqrt{\frac{\tau\left(d + \log(1/\delta)\right)}{m}} + \frac{d + \log(1/\delta)}{m}\right)$.
arXiv Detail & Related papers (2025-02-13T17:03:03Z) - Deterministic Apple Tasting [2.4554686192257424]
We provide the first widely applicable deterministic apple tasting learner.
We prove a trichotomy stating that every class $\mathcal{H}$ must be either easy, hard, or unlearnable.
Our upper bound is based on a deterministic algorithm for learning from expert advice with apple tasting feedback.
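
For readers unfamiliar with the feedback model: under apple tasting, the learner only observes the true label on rounds where it predicts 1. The sketch below is an illustrative deterministic learning-from-experts loop under that feedback, not the paper's algorithm; the expert pool, the realizable stream, and the majority-vote-with-elimination rule are assumptions.

```python
# Illustrative sketch of apple-tasting feedback (not the paper's algorithm):
# the true label is revealed only on rounds where the learner predicts 1.
# Experts, the realizable stream, and the elimination rule are assumed.
import numpy as np

rng = np.random.default_rng(1)
T, n_experts = 200, 8
expert_thresholds = np.linspace(0.1, 0.9, n_experts)    # expert pool
true_threshold = float(expert_thresholds[3])             # realizable: truth is an expert

alive = np.ones(n_experts, dtype=bool)   # experts not yet contradicted
mistakes = 0
for t in range(T):
    x = rng.random()
    y = int(x >= true_threshold)
    expert_preds = (x >= expert_thresholds).astype(int)
    pred = int(expert_preds[alive].mean() >= 0.5)        # deterministic majority vote

    if pred == 1:
        # Apple tasting: feedback (the true label y) is revealed only now.
        mistakes += int(y == 0)
        alive &= (expert_preds == y)                     # drop contradicted experts
    else:
        # No feedback is observed; a false negative goes unnoticed by the learner.
        mistakes += int(y == 1)

print("total mistakes:", mistakes, "| surviving experts:", int(alive.sum()))
```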
arXiv Detail & Related papers (2024-10-14T11:54:46Z) - Revisiting Agnostic PAC Learning [30.67561230812141]
PAC learning, dating back to Valiant'84 and Vapnik and Chervonenkis'64,'74, is a classic model for studying supervised learning.
Empirical Risk Minimization (ERM) is a natural learning algorithm, where one simply outputs the hypothesis from $\mathcal{H}$ making the fewest mistakes on the training data.
We revisit PAC learning and first show that ERM is in fact sub-optimal if we treat the performance of the best hypothesis, denoted $\tau := \Pr_{\mathcal{D}}[h^\star_{\mathcal{D}}(x) \neq y]$, as a parameter.
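
The ERM rule itself is straightforward to spell out; the sketch below (an illustrative toy, with a finite class of threshold hypotheses assumed in place of a general $\mathcal{H}$ and a noisy stream standing in for the agnostic distribution) simply returns the hypothesis with the fewest training mistakes.

```python
# Minimal sketch of Empirical Risk Minimization over a finite hypothesis
# class: output the hypothesis making the fewest mistakes on the training
# data. The threshold class and the noisy toy data are assumptions.
import numpy as np

def erm(hypotheses, X, y):
    """hypotheses: list of callables mapping X to 0/1 predictions."""
    errors = [float(np.mean(h(X) != y)) for h in hypotheses]
    best = int(np.argmin(errors))
    return hypotheses[best], errors[best]

rng = np.random.default_rng(0)
X = rng.random(500)
y = (X >= 0.3).astype(int)
y[rng.random(500) < 0.1] ^= 1          # 10% label noise -> agnostic setting

H = [lambda X, c=c: (X >= c).astype(int) for c in np.linspace(0, 1, 101)]
h_hat, train_err = erm(H, X, y)
print("empirical error of the ERM hypothesis:", train_err)
```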
arXiv Detail & Related papers (2024-07-29T08:20:49Z) - On Optimal Learning Under Targeted Data Poisoning [48.907813854832206]
In this work we aim to characterize the smallest achievable error $\epsilon=\epsilon(\eta)$ by the learner in the presence of such an adversary.
Remarkably, we show that the upper bound can be attained by a deterministic learner.
arXiv Detail & Related papers (2022-10-06T06:49:48Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Delaytron: Efficient Learning of Multiclass Classifiers with Delayed
Bandit Feedbacks [6.624726878647541]
We show that Adaptive Delaytron achieves a regret bound of $\mathcal{O}\left(\sqrt{\frac{2K}{\gamma}\left[\frac{T}{2}+\left(2+\frac{L^2}{R^2}\Vert W\Vert_F^2\right)\sum_{t=1}^{T}d_t\right]}\right)$.
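
To make the setting concrete, the sketch below illustrates delayed bandit feedback for multiclass prediction: the feedback for round $t$ (only whether the predicted class was correct) arrives $d_t$ rounds later. The linear model, update rule, and delay distribution are illustrative assumptions, not the Delaytron algorithm.

```python
# Illustrative sketch of multiclass learning with delayed bandit feedback
# (not the Delaytron algorithm): the feedback for round t arrives d_t rounds
# later, and only says whether the predicted class was correct.
import numpy as np

rng = np.random.default_rng(0)
T, K, dim = 500, 4, 10
W_true = rng.normal(size=(K, dim))
W = np.zeros((K, dim))                 # learner's weight matrix
pending = []                           # (arrival_round, x, predicted, correct)

mistakes = 0
for t in range(T):
    x = rng.normal(size=dim)
    y = int(np.argmax(W_true @ x))     # true class
    pred = int(np.argmax(W @ x))       # learner's prediction
    mistakes += int(pred != y)

    d_t = int(rng.integers(1, 20))     # feedback delay for this round
    pending.append((t + d_t, x, pred, pred == y))

    # Process every piece of feedback whose delay has elapsed.
    arrived = [p for p in pending if p[0] <= t]
    pending = [p for p in pending if p[0] > t]
    for _, x_old, pred_old, was_correct in arrived:
        # Simple bandit-style update: reinforce or penalize the predicted row.
        W[pred_old] += (0.1 if was_correct else -0.1) * x_old

print("mistake rate:", mistakes / T)
```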
arXiv Detail & Related papers (2022-05-17T11:12:20Z) - Littlestone Classes are Privately Online Learnable [28.04975353867202]
We consider the problem of online classification under a privacy constraint.
In this setting a learner observes sequentially a stream of labelled examples $(x_t, y_t)$, for $1 \leq t \leq T$, and returns at each iteration a hypothesis $h_t$ which is used to predict the label of each new example $x_t$.
The learner's performance is measured by her regret against a known hypothesis class $\mathcal{H}$.
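
Setting the privacy constraint aside, the sketch below illustrates the basic online protocol and the regret quantity it is judged by: the learner's cumulative mistakes minus those of the best hypothesis in hindsight. The finite threshold class (standing in for $\mathcal{H}$), the noisy stream, and the follow-the-leader learner are assumptions for illustration.

```python
# Illustrative sketch of the (non-private) online classification protocol and
# the regret it is measured by: the learner's cumulative mistakes minus those
# of the best hypothesis in hindsight. The finite threshold class, the noisy
# stream, and the follow-the-leader learner are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)
T = 300
H = np.linspace(0, 1, 51)              # finite stand-in for the class H

learner_mistakes = 0
h_mistakes = np.zeros(len(H))          # mistakes of each h in H so far
for t in range(T):
    x = rng.random()
    y = int(x >= 0.37)
    if rng.random() < 0.05:            # 5% label noise
        y ^= 1
    preds = (x >= H).astype(int)
    pred = int(preds[int(np.argmin(h_mistakes))])   # follow the current leader
    learner_mistakes += int(pred != y)
    h_mistakes += (preds != y)

regret = learner_mistakes - h_mistakes.min()
print("regret against the best hypothesis in H:", regret)
```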
arXiv Detail & Related papers (2021-06-25T09:08:33Z) - Nearly Horizon-Free Offline Reinforcement Learning [97.36751930393245]
We revisit offline reinforcement learning on episodic time-homogeneous Markov Decision Processes with $S$ states, $A$ actions and planning horizon $H$.
We obtain the first set of nearly $H$-free sample complexity bounds for evaluation and planning using the empirical MDPs.
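
For intuition about "planning using the empirical MDPs", the sketch below builds the plug-in (empirical) MDP from an offline batch of transitions and runs finite-horizon value iteration on it; the toy MDP, the uniform behavior policy, and all sizes are assumptions for illustration, not the paper's construction.

```python
# Illustrative sketch (not from the paper): estimate an empirical MDP from an
# offline dataset of (s, a, r, s') transitions, then plan on it with
# finite-horizon value iteration. The toy MDP and behavior policy are assumed.
import numpy as np

rng = np.random.default_rng(0)
S, A, H, N = 5, 2, 10, 20_000

# Unknown true MDP (time-homogeneous): random transitions and rewards.
P_true = rng.dirichlet(np.ones(S), size=(S, A))          # shape (S, A, S)
R_true = rng.random((S, A))

# Offline data collected by a uniformly random behavior policy.
counts = np.zeros((S, A, S))
reward_sum = np.zeros((S, A))
s = int(rng.integers(S))
for _ in range(N):
    a = int(rng.integers(A))
    s_next = int(rng.choice(S, p=P_true[s, a]))
    counts[s, a, s_next] += 1
    reward_sum[s, a] += R_true[s, a]
    s = s_next

# Empirical MDP: plug-in estimates of P and R from the counts.
n_sa = counts.sum(axis=2, keepdims=True)
P_hat = np.where(n_sa > 0, counts / np.maximum(n_sa, 1), 1.0 / S)
R_hat = reward_sum / np.maximum(n_sa[..., 0], 1)

# Finite-horizon value iteration (planning) on the empirical MDP.
V = np.zeros(S)
for _ in range(H):
    Q = R_hat + P_hat @ V              # shape (S, A)
    V = Q.max(axis=1)
print("estimated optimal value per state:", np.round(V, 3))
```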
arXiv Detail & Related papers (2021-03-25T18:52:17Z) - $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}\left(H\right)}{\Delta_{\min}}\log\left(SAT\right)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
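
For a rough sense of the algorithmic idea, here is an illustrative sketch of tabular Q-learning with optimistic initialization and a count-based exploration bonus in a toy episodic MDP; the bonus form, the learning-rate schedule, and the environment are assumptions and not the paper's exact algorithm or constants.

```python
# Illustrative sketch of optimistic tabular Q-learning: an exploration bonus
# proportional to 1/sqrt(visit count) is added to the update, and values are
# initialized optimistically. Environment, bonus scale, and learning-rate
# schedule are assumptions made for illustration.
import numpy as np

rng = np.random.default_rng(0)
S, A, H, episodes = 6, 2, 5, 2000
P = rng.dirichlet(np.ones(S), size=(S, A))   # toy transition kernel
R = rng.random((S, A))                       # toy rewards

Q = np.full((H, S, A), float(H))       # optimistic initialization
N = np.zeros((H, S, A))                # visit counts
for ep in range(episodes):
    s = 0
    for h in range(H):
        a = int(np.argmax(Q[h, s]))    # act greedily w.r.t. optimistic Q
        N[h, s, a] += 1
        r = R[s, a]
        s_next = int(rng.choice(S, p=P[s, a]))
        bonus = 1.0 / np.sqrt(N[h, s, a])
        alpha = (H + 1) / (H + N[h, s, a])
        future = Q[h + 1, s_next].max() if h + 1 < H else 0.0
        Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * min(r + bonus + future, H)
        s = s_next

print("optimistic Q-values at the initial state:", np.round(Q[0, 0], 3))
```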
arXiv Detail & Related papers (2020-06-16T13:01:33Z) - Taking a hint: How to leverage loss predictors in contextual bandits? [63.546913998407405]
We study learning in contextual bandits with the help of loss predictors.
We show that the optimal regret is $\mathcal{O}\left(\min\left\{\sqrt{T},\, \sqrt{\mathcal{E}}\,T^{\frac{1}{3}}\right\}\right)$ when $\mathcal{E}$ is known.
arXiv Detail & Related papers (2020-03-04T07:36:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.