Forgetful Active Learning with Switch Events: Efficient Sampling for
Out-of-Distribution Data
- URL: http://arxiv.org/abs/2301.05106v1
- Date: Thu, 12 Jan 2023 16:03:14 GMT
- Authors: Ryan Benkert, Mohit Prabhushankar, and Ghassan AlRegib
- Abstract summary: In practice, fully trained neural networks interact randomly with out-of-distribution (OOD) inputs.
We introduce forgetful active learning with switch events (FALSE) - a novel active learning protocol for out-of-distribution active learning.
We report up to 4.5% accuracy improvements in over 270 experiments.
- Score: 13.800680101300756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper considers deep out-of-distribution active learning. In practice,
fully trained neural networks interact randomly with out-of-distribution (OOD)
inputs and map aberrant samples randomly within the model representation space.
Since data representations are direct manifestations of the training
distribution, the data selection process plays a crucial role in outlier
robustness. For paradigms such as active learning, this is especially
challenging since protocols must not only improve performance on the training
distribution most effectively but further render a robust representation space.
However, existing strategies directly base the data selection on the data
representation of the unlabeled data which is random for OOD samples by
definition. For this purpose, we introduce forgetful active learning with
switch events (FALSE) - a novel active learning protocol for
out-of-distribution active learning. Instead of defining sample importance on
the data representation directly, we formulate "informativeness" with learning
difficulty during training. Specifically, we approximate how often the network
"forgets" unlabeled samples and query the most "forgotten" samples for
annotation. We report up to 4.5% accuracy improvements in over 270
experiments, including four commonly used protocols, two OOD benchmarks, one
in-distribution benchmark, and three different architectures.
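The selection rule described in the abstract can be illustrated with a minimal sketch. It assumes a "switch event" is a change in a sample's predicted class between two consecutive training epochs, and that predictions on the unlabeled pool are recorded after every epoch; the paper's exact definition and schedule may differ. The names `count_switch_events` and `query_most_forgotten` are hypothetical and do not come from the authors' code.

```python
from typing import List, Sequence


def count_switch_events(pred_history: Sequence[Sequence[int]]) -> List[int]:
    """Count prediction switches per unlabeled sample.

    pred_history[t][i] is the class predicted for unlabeled sample i
    after epoch t (assumption: one prediction pass per epoch).
    """
    if not pred_history:
        return []
    n_samples = len(pred_history[0])
    switches = [0] * n_samples
    # Compare each epoch's predictions with the previous epoch's.
    for prev, curr in zip(pred_history, pred_history[1:]):
        for i in range(n_samples):
            if prev[i] != curr[i]:
                switches[i] += 1
    return switches


def query_most_forgotten(pred_history: Sequence[Sequence[int]], budget: int) -> List[int]:
    """Return indices of the `budget` samples with the most switch events.

    Ties are broken by the lower sample index because sorted() is stable.
    """
    switches = count_switch_events(pred_history)
    order = sorted(range(len(switches)), key=lambda i: -switches[i])
    return order[:budget]


if __name__ == "__main__":
    # Four epochs of predictions for four unlabeled samples (toy data).
    history = [
        [0, 1, 2, 1],
        [0, 2, 2, 0],
        [0, 1, 2, 0],
        [0, 2, 2, 1],
    ]
    print(count_switch_events(history))      # [0, 3, 0, 2]
    print(query_most_forgotten(history, 2))  # [1, 3]
```

Because the score depends only on how predictions change over training, not on where a sample sits in the representation space, it needs no labels for the unlabeled pool.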
Related papers
- BoostAdapter: Improving Vision-Language Test-Time Adaptation via Regional Bootstrapping [64.8477128397529]
We propose a training-required and training-free test-time adaptation framework.
We maintain a light-weight key-value memory for feature retrieval from instance-agnostic historical samples and instance-aware boosting samples.
We theoretically justify the rationality behind our method and empirically verify its effectiveness on both the out-of-distribution and the cross-domain datasets.
(2024-10-20T15:58:43Z)
- Deep Active Learning with Contrastive Learning Under Realistic Data Pool Assumptions [2.578242050187029]
Active learning aims to identify the most informative data from an unlabeled data pool that enables a model to reach the desired accuracy rapidly.
Most existing active learning methods have been evaluated in an ideal setting where only samples relevant to the target task exist in an unlabeled data pool.
We introduce new active learning benchmarks that include ambiguous, task-irrelevant out-of-distribution as well as in-distribution samples.
(2023-03-25T10:46:10Z)
- Gaussian Switch Sampling: A Second Order Approach to Active Learning [11.775252660867285]
In active learning, acquisition functions define informativeness directly on the representation position within the model manifold.
We propose a grounded second-order definition of information content and sample importance within the context of active learning.
We show that our definition produces highly accurate importance scores even when the model representations are constrained by the lack of training data.
(2023-02-16T15:24:56Z)
- Learning from Data with Noisy Labels Using Temporal Self-Ensemble [11.245833546360386]
Deep neural networks (DNNs) have an enormous capacity to memorize noisy labels.
Current state-of-the-art methods present a co-training scheme that trains dual networks using samples associated with small losses.
We propose a simple yet effective robust training scheme that operates by training only a single network.
(2022-07-21T08:16:31Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
(2022-02-09T12:03:02Z)
- Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning [101.28281124670647]
Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data.
We propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning.
Our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
(2021-08-12T09:14:44Z)
- Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning [30.19936050747407]
We propose Message Passing Adaptive Resonance Theory (MPART) for online active semi-supervised learning.
MPART infers the class of unlabeled data and selects informative and representative samples through message passing between nodes on the topological graph.
We evaluate our model with comparable query selection strategies and frequencies, showing that MPART significantly outperforms the competitive models in online active learning environments.
(2020-12-02T14:14:42Z)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
(2020-11-20T08:26:10Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
(2020-05-18T09:36:51Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
(2020-02-11T21:08:06Z)