Active Learning with Neural Networks: Insights from Nonparametric
Statistics
- URL: http://arxiv.org/abs/2210.08367v1
- Date: Sat, 15 Oct 2022 19:57:09 GMT
- Title: Active Learning with Neural Networks: Insights from Nonparametric
Statistics
- Authors: Yinglun Zhu and Robert Nowak
- Abstract summary: This paper provides the first near-optimal label complexity guarantees for deep active learning.
Under standard low noise conditions, we show that active learning with neural networks can provably achieve the minimax label complexity.
We also develop an efficient deep active learning algorithm that achieves $\mathsf{polylog}(\frac{1}{\epsilon})$ label complexity, without any low noise assumptions.
- Score: 12.315392649501101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have great representation power, but typically require
large numbers of training examples. This motivates deep active learning methods
that can significantly reduce the amount of labeled training data. Empirical
successes of deep active learning have been recently reported in the
literature; however, rigorous label complexity guarantees of deep active
learning have remained elusive. This constitutes a significant gap between
theory and practice. This paper tackles this gap by providing the first
near-optimal label complexity guarantees for deep active learning. The key
insight is to study deep active learning from the nonparametric classification
perspective. Under standard low noise conditions, we show that active learning
with neural networks can provably achieve the minimax label complexity, up to
disagreement coefficient and other logarithmic terms. When equipped with an
abstention option, we further develop an efficient deep active learning
algorithm that achieves $\mathsf{polylog}(\frac{1}{\epsilon})$ label
complexity, without any low noise assumptions. We also provide extensions of
our results beyond the commonly studied Sobolev/H\"older spaces and develop
label complexity guarantees for learning in Radon $\mathsf{BV}^2$ spaces, which
have recently been proposed as natural function spaces associated with neural
networks.
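For readers less familiar with the terminology, the "standard low noise conditions" cited above are, in this nonparametric classification literature, usually the Mammen-Tsybakov margin condition. A hedged statement of its typical form follows; the exact assumption used in the paper may differ.

```latex
% Mammen--Tsybakov low-noise (margin) condition, as commonly assumed in
% nonparametric active learning; the paper's exact assumption may differ.
% Let \eta(x) = \Pr(Y = 1 \mid X = x) denote the regression function.
\[
  \Pr_X\!\left( \bigl| \eta(X) - \tfrac{1}{2} \bigr| \le t \right) \;\le\; c\, t^{\alpha}
  \qquad \text{for all } t > 0,
\]
% for some constant c > 0 and noise exponent \alpha \ge 0. Larger \alpha means the
% label noise concentrates away from the decision boundary \{\eta = 1/2\}, which is
% precisely where margin-based active learning saves label queries.
```

The abstention mechanism can likewise be illustrated by a generic pool-based loop: train on the labeled set, query labels only where the current network is uncertain (small margin), and at prediction time refuse to predict in that uncertain region. The sketch below is a minimal illustration of that mechanism, not the paper's algorithm; the network, synthetic data, query budget, and abstention threshold are all placeholder assumptions.

```python
# Minimal sketch of margin-based active learning with an abstention option.
# NOT the paper's algorithm: architecture, data, budgets, and thresholds are placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic unlabeled pool of 2-D points; the (hidden) oracle labels by a nonlinear rule.
pool_x = torch.rand(2000, 2)
oracle = lambda x: ((x[:, 0] - 0.5) ** 2 + (x[:, 1] - 0.5) ** 2 < 0.1).long()

net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

labeled_idx = torch.randperm(len(pool_x))[:20]               # small random seed set
labeled_x, labeled_y = pool_x[labeled_idx], oracle(pool_x[labeled_idx])

for _round in range(10):
    # 1) Fit the network on the currently labeled data.
    for _ in range(200):
        opt.zero_grad()
        loss_fn(net(labeled_x), labeled_y).backward()
        opt.step()

    # 2) Margin-based querying: request labels only where the predicted probability
    #    is close to 1/2, i.e. near the estimated decision boundary.
    with torch.no_grad():
        probs = torch.softmax(net(pool_x), dim=1)[:, 1]
    query_idx = (probs - 0.5).abs().argsort()[:20]            # 20 most uncertain points
    labeled_x = torch.cat([labeled_x, pool_x[query_idx]])     # (a real implementation would
    labeled_y = torch.cat([labeled_y, oracle(pool_x[query_idx])])  # also drop them from the pool)

# 3) Prediction with abstention: refuse to predict when the margin is small.
#    The threshold 0.1 is an arbitrary placeholder.
with torch.no_grad():
    probs = torch.softmax(net(pool_x), dim=1)[:, 1]
predictions = torch.where((probs - 0.5).abs() < 0.1,
                          torch.full_like(probs, -1.0),       # -1 encodes "abstain"
                          (probs > 0.5).float())
```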
Related papers
- Learning from the Best: Active Learning for Wireless Communications [9.523381807291049]
Active learning algorithms identify the most critical and informative samples in an unlabeled dataset and label only those samples, instead of the complete set.
We present a case study of deep learning-based mmWave beam selection, where labeling is performed by a compute-intensive algorithm based on exhaustive search.
Our results show that using an active learning algorithm for class-imbalanced datasets can reduce labeling overhead by up to 50% for this dataset.
arXiv Detail & Related papers (2024-01-23T12:21:57Z)
- Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck [35.6883212537938]
We consider offline sparse parity learning, a supervised classification problem which admits a statistical query lower bound for gradient-based training of a multilayer perceptron.
We show, theoretically and experimentally, that sparse initialization and increasing network width yield significant improvements in sample efficiency in this setting.
We also show that the synthetic sparse parity task can be useful as a proxy for real problems requiring axis-aligned feature learning.
arXiv Detail & Related papers (2023-09-07T15:52:48Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Streaming Active Learning with Deep Neural Networks [44.50018541065145]
We propose VeSSAL, a new algorithm for batch active learning with deep neural networks in streaming settings.
VeSSAL samples groups of points to query for labels at the moment they are encountered.
We expand the applicability of deep neural networks to realistic active learning scenarios.
arXiv Detail & Related papers (2023-03-05T00:57:28Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Efficient Active Learning with Abstention [12.315392649501101]
We develop the first computationally efficient active learning algorithm with abstention.
A key feature of the algorithm is that it avoids the undesirable "noise-seeking" behavior often seen in active learning.
arXiv Detail & Related papers (2022-03-31T18:34:57Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to enable deep neural networks with the ability to learn the sample relationships from each mini-batch.
BatchFormer is applied along the batch dimension of each mini-batch to implicitly explore sample relationships during training (a minimal sketch of this batch-dimension attention idea appears after this list).
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- What Makes Good Contrastive Learning on Small-Scale Wearable-based Tasks? [59.51457877578138]
We study contrastive learning on the wearable-based activity recognition task.
This paper presents an open-source PyTorch library, CL-HAR, which can serve as a practical tool for researchers.
arXiv Detail & Related papers (2022-02-12T06:10:15Z)
- Learning Purified Feature Representations from Task-irrelevant Labels [18.967445416679624]
We propose a novel learning framework called PurifiedLearning to exploit task-irrelevant features extracted from task-irrelevant labels.
Our work is built on solid theoretical analysis and extensive experiments, which demonstrate the effectiveness of PurifiedLearning.
arXiv Detail & Related papers (2021-02-22T12:50:49Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Bayesian active learning for production, a systematic study and a reusable library [85.32971950095742]
In this paper, we analyse the main drawbacks of current active learning techniques.
We do a systematic study on the effects of the most common issues of real-world datasets on the deep active learning process.
We derive two techniques that can speed up the active learning loop such as partial uncertainty sampling and larger query size.
arXiv Detail & Related papers (2020-06-17T14:51:11Z)
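The BatchFormer entry above describes attention applied along the batch dimension of each mini-batch. A minimal PyTorch sketch of that general idea is given below; it is not the authors' exact module, and the feature dimension, head count, and train-time-only behaviour are assumptions.

```python
# Illustrative batch-dimension attention in the spirit of BatchFormer (a sketch,
# not the authors' implementation): treat the mini-batch as a sequence so each
# sample's feature vector can attend to the other samples in the same batch.
import torch
import torch.nn as nn

class BatchDimensionAttention(nn.Module):
    def __init__(self, feature_dim: int = 512, num_heads: int = 4):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(d_model=feature_dim, nhead=num_heads)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, feature_dim)
        if not self.training:
            return features                 # identity at inference time (assumption)
        x = features.unsqueeze(1)           # (B, 1, D): sequence length B, batch size 1
        x = self.encoder(x)                 # self-attention mixes information across B
        return x.squeeze(1)

# Usage: feats = torch.randn(32, 512); mixed = BatchDimensionAttention()(feats)
```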
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all of the information it contains) and is not responsible for any consequences of its use.