Semi-Supervised Active Learning with Temporal Output Discrepancy
- URL: http://arxiv.org/abs/2107.14153v1
- Date: Thu, 29 Jul 2021 16:25:56 GMT
- Title: Semi-Supervised Active Learning with Temporal Output Discrepancy
- Authors: Siyu Huang, Tianyang Wang, Haoyi Xiong, Jun Huan, Dejing Dou
- Abstract summary: We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
- Score: 42.01906895756629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep learning succeeds in a wide range of tasks, it depends heavily on massive collections of annotated data, which are expensive and time-consuming to obtain. To lower the cost of data annotation, active learning has been proposed to interactively query an oracle to annotate a small proportion of informative samples in an unlabeled dataset. Inspired by the fact that samples with higher loss are usually more informative to the model than samples with lower loss, in this paper we present a novel deep active learning approach that queries the oracle for data annotation when an unlabeled sample is believed to incorporate high loss. The core of our approach is a measurement, Temporal Output Discrepancy (TOD), which estimates the loss of a sample by evaluating the discrepancy between the outputs given by models at different optimization steps. Our theoretical investigation shows that TOD lower-bounds the accumulated sample loss, and thus it can be used to select informative unlabeled samples. Building on TOD, we further develop an effective unlabeled data sampling strategy as well as an unsupervised learning criterion that enhances model performance by incorporating the unlabeled data. Owing to the simplicity of TOD, our active learning approach is efficient, flexible, and task-agnostic. Extensive experimental results demonstrate that our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
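To make the core idea concrete, here is a minimal PyTorch-style sketch of TOD-based sampling. It treats TOD as the per-sample L2 distance between the outputs of two snapshots of the same network taken T optimization steps apart; the (index, batch) loader format and the plain top-k query rule are illustrative assumptions, not the authors' exact implementation.
```python
import torch

@torch.no_grad()
def temporal_output_discrepancy(model_t, model_t_plus_T, x):
    # TOD(x) = || f(x; w_{t+T}) - f(x; w_t) ||_2, computed per sample.
    out_old = model_t(x)          # outputs under the earlier weights w_t
    out_new = model_t_plus_T(x)   # outputs under the later weights w_{t+T}
    return torch.norm(out_new - out_old, p=2, dim=1)

@torch.no_grad()
def select_for_annotation(model_t, model_t_plus_T, unlabeled_loader, budget):
    # Query the oracle for the samples with the largest TOD, i.e. the
    # highest estimated loss (assumed top-k rule for illustration).
    scores, indices = [], []
    for idx, x in unlabeled_loader:   # assumed to yield (index, batch) pairs
        scores.append(temporal_output_discrepancy(model_t, model_t_plus_T, x))
        indices.append(idx)
    scores, indices = torch.cat(scores), torch.cat(indices)
    top = torch.topk(scores, k=budget).indices
    return indices[top]               # dataset indices to send to the oracle
```
Since the score needs nothing beyond two forward passes over stored model snapshots, this is consistent with the abstract's claim that the approach stays efficient, flexible, and task-agnostic.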
Related papers
- A CLIP-Powered Framework for Robust and Generalizable Data Selection [51.46695086779598]
Real-world datasets often contain redundant and noisy data, which harms both training efficiency and model performance.
Data selection has shown promise in identifying the most representative samples from the entire dataset.
We propose a novel CLIP-powered data selection framework that leverages multimodal information for more robust and generalizable sample selection.
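The summary does not spell out the selection rule, so the sketch below only illustrates one plausible CLIP-powered criterion: score each sample by the cosine alignment between its CLIP image embedding and the embedding of its class prompt, and keep the best-aligned subset. The encoders, the alignment heuristic, and keep_ratio are assumptions for illustration.
```python
import torch
import torch.nn.functional as F

def clip_alignment_scores(image_feats, text_feats):
    # Cosine similarity between L2-normalized CLIP image embeddings and the
    # embeddings of their class-name prompts (one text vector per sample).
    image_feats = F.normalize(image_feats, dim=1)
    text_feats = F.normalize(text_feats, dim=1)
    return (image_feats * text_feats).sum(dim=1)

def select_clean_subset(image_feats, text_feats, keep_ratio=0.8):
    # Keep the samples whose image-text alignment is highest; low alignment
    # is treated as a sign of label noise or redundancy (a heuristic, not
    # necessarily the paper's exact criterion).
    scores = clip_alignment_scores(image_feats, text_feats)
    k = int(keep_ratio * len(scores))
    return torch.topk(scores, k=k).indices
```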
arXiv Detail & Related papers (2024-10-15T03:00:58Z) - Unsupervised Transfer Learning via Adversarial Contrastive Training [3.227277661633986]
We propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT)
Our experimental results demonstrate outstanding classification accuracy under both the fine-tuned linear-probe and k-NN evaluation protocols across various datasets.
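A rough sketch of the adversarial-contrastive idea follows, under stated assumptions: a single FGSM step and a cosine-alignment surrogate for the contrastive loss, with negatives omitted for brevity. The actual ACT objective may differ.
```python
import torch
import torch.nn.functional as F

def adversarial_view(encoder, x, eps=4 / 255):
    # Craft a perturbed view that disagrees with the clean view, via one
    # FGSM step on a cosine-alignment loss. Starting from a small random
    # perturbation avoids a zero gradient at x_adv == x.
    x_adv = (x + 1e-3 * torch.randn_like(x)).requires_grad_(True)
    z = F.normalize(encoder(x), dim=1).detach()
    z_adv = F.normalize(encoder(x_adv), dim=1)
    loss = -(z * z_adv).sum(dim=1).mean()   # push the views apart
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

def act_step(encoder, optimizer, x, eps=4 / 255):
    # Train the encoder to re-align the clean view with its adversarial view.
    x_adv = adversarial_view(encoder, x, eps)
    optimizer.zero_grad()
    z = F.normalize(encoder(x), dim=1)
    z_adv = F.normalize(encoder(x_adv), dim=1)
    loss = -(z * z_adv).sum(dim=1).mean()   # pull the views back together
    loss.backward()
    optimizer.step()
    return loss.item()
```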
arXiv Detail & Related papers (2024-08-16T05:11:52Z) - Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base queries on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label.
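A hedged sketch of the underlying quantity: estimate, by Monte Carlo over small Gaussian weight perturbations, how often a sample's predicted label flips. The perturbation scheme and the single noise scale are illustrative building blocks; the paper's LDM is defined as the smallest such disagreement probability.
```python
import copy
import torch

@torch.no_grad()
def disagreement_rate(model, x, noise_std, n_draws=32):
    # Fraction of weight perturbations under which the predicted label flips.
    base_pred = model(x).argmax(dim=1)
    flips = torch.zeros(x.shape[0], device=x.device)
    for _ in range(n_draws):
        perturbed = copy.deepcopy(model)
        for p in perturbed.parameters():
            p.add_(noise_std * torch.randn_like(p))
        flips += (perturbed(x).argmax(dim=1) != base_pred).float()
    return flips / n_draws   # high rate = easily flip-flopped = query candidate
```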
arXiv Detail & Related papers (2024-01-18T08:12:23Z) - Optimal Sample Selection Through Uncertainty Estimation and Its
Application in Deep Learning [22.410220040736235]
We present a theoretically optimal solution for addressing both coreset selection and active learning.
Our proposed method, COPS, is designed to minimize the expected loss of a model trained on subsampled data.
arXiv Detail & Related papers (2023-09-05T14:06:33Z) - Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
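That observation suggests a simple per-example tracker, sketched below under the assumption that "learned late" is a proxy for "possibly mislabeled"; the threshold rule is illustrative, not the paper's exact criterion.
```python
from collections import defaultdict

class LearningEpochTracker:
    # Record, per example, the last epoch at which its prediction was wrong.
    # Examples that keep being mis-fit late into training are candidates for
    # being mislabeled.
    def __init__(self):
        self.last_wrong_epoch = defaultdict(int)

    def update(self, epoch, indices, preds, labels):
        for i, p, y in zip(indices, preds, labels):
            if p != y:
                self.last_wrong_epoch[int(i)] = epoch

    def likely_mislabeled(self, threshold_epoch):
        # Examples still being mis-fit at or after threshold_epoch.
        return [i for i, e in self.last_wrong_epoch.items()
                if e >= threshold_epoch]
```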
arXiv Detail & Related papers (2023-08-26T12:43:25Z) - Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z) - Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
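A minimal sketch of the objective: negatively augmented real images (here, a jigsaw patch shuffle, one of several possible NDAs) are fed to the discriminator as additional fake data. The augmentation choice and loss form are illustrative.
```python
import torch
import torch.nn.functional as F

def jigsaw_nda(x, grid=2):
    # One simple NDA: shuffle image patches so local texture is preserved
    # but global structure is destroyed.
    b, c, h, w = x.shape
    ph, pw = h // grid, w // grid
    patches = x.unfold(2, ph, ph).unfold(3, pw, pw)      # (b,c,grid,grid,ph,pw)
    patches = patches.reshape(b, c, grid * grid, ph, pw)
    patches = patches[:, :, torch.randperm(grid * grid, device=x.device)]
    patches = patches.reshape(b, c, grid, grid, ph, pw)
    return patches.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

def discriminator_loss_with_nda(disc, real, fake):
    # Standard GAN discriminator loss, with negatively augmented real images
    # treated as an extra source of "fake" data.
    real_logits = disc(real)
    fake_logits = disc(torch.cat([fake, jigsaw_nda(real)]))
    loss_real = F.binary_cross_entropy_with_logits(
        real_logits, torch.ones_like(real_logits))
    loss_fake = F.binary_cross_entropy_with_logits(
        fake_logits, torch.zeros_like(fake_logits))
    return loss_real + loss_fake
```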
arXiv Detail & Related papers (2021-02-09T20:28:35Z) - Semi-supervised Active Learning for Instance Segmentation via Scoring
Predictions [25.408505612498423]
We propose a novel and principled semi-supervised active learning framework for instance segmentation.
Specifically, we present an uncertainty sampling strategy named Triplet Scoring Predictions (TSP) that explicitly incorporates sample-ranking clues from classes, bounding boxes, and masks.
Results on medical image datasets demonstrate that the proposed method leverages the knowledge embodied in the available data in a meaningful way.
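Since TSP draws ranking clues from three prediction heads, a hedged per-instance scoring sketch might combine classification entropy with box and mask confidences; the combination rule and weights below are assumptions, not the paper's definition.
```python
import torch

def triplet_score(class_probs, box_score, mask_score, weights=(1.0, 1.0, 1.0)):
    # Combine three per-instance uncertainty cues: classification entropy,
    # box confidence, and predicted mask quality. Higher = more uncertain,
    # hence a better active-learning candidate.
    entropy = -(class_probs * class_probs.clamp_min(1e-12).log()).sum()
    w_cls, w_box, w_mask = weights
    return (w_cls * entropy
            + w_box * (1.0 - box_score)
            + w_mask * (1.0 - mask_score))
```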
arXiv Detail & Related papers (2020-12-09T02:36:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.