MixPUL: Consistency-based Augmentation for Positive and Unlabeled
Learning
- URL: http://arxiv.org/abs/2004.09388v1
- Date: Mon, 20 Apr 2020 15:43:33 GMT
- Title: MixPUL: Consistency-based Augmentation for Positive and Unlabeled
Learning
- Authors: Tong Wei, Feng Shi, Hai Wang, Wei-Wei Tu, Yu-Feng Li
- Abstract summary: We propose a simple yet effective data augmentation method, coined MixPUL, based on consistency regularization.
MixPUL incorporates supervised and unsupervised consistency training to generate augmented data.
We show that MixPUL reduces classification error from 16.49 to 13.09 on average on the CIFAR-10 dataset across different amounts of positive data.
- Score: 8.7382177147041
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from positive and unlabeled data (PU learning) is prevalent in
practical applications where only a few examples are positively labeled.
Previous PU learning studies typically rely on existing samples, so the data
distribution is not extensively explored. In this work, we propose a simple yet
effective data augmentation method, coined MixPUL, based on consistency
regularization, which provides a new perspective on using PU data. In
particular, the proposed MixPUL incorporates supervised and unsupervised
consistency training to generate augmented data. To enable supervised
consistency in the absence of negative samples, reliable negative examples are
mined from the unlabeled data. Unsupervised consistency is further encouraged
between unlabeled data points. In addition, MixPUL reduces the margin loss
between positive and unlabeled pairs, which explicitly optimizes AUC and yields
faster convergence. Finally, we conduct a series of studies to demonstrate the
effectiveness of consistency regularization, examining three kinds of reliable
negative mining methods. We show that MixPUL reduces classification error from
16.49 to 13.09 on average on the CIFAR-10 dataset across different amounts of
positive data.
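The augmentation scheme the abstract describes can be sketched in a few lines. The snippet below is a hypothetical illustration, not the paper's actual implementation: the `mine_reliable_negatives` heuristic (lowest-scoring unlabeled points as negatives) is just one plausible mining strategy of the three the paper examines, and all function names are assumptions.

```python
import numpy as np

def mixup(x1, x2, y1, y2, alpha=0.5, rng=np.random):
    """Interpolate two examples and their (soft) labels, mixup-style."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def mine_reliable_negatives(unlabeled_x, score_fn, k):
    """One simple mining heuristic: treat the k lowest-scoring
    unlabeled points as reliable negatives."""
    scores = np.array([score_fn(x) for x in unlabeled_x])
    return unlabeled_x[np.argsort(scores)[:k]]

def augment_pu_batch(pos_x, unl_x, score_fn, k, rng=np.random):
    """Supervised consistency: mix positives (label 1) with mined
    negatives (label 0). Unsupervised consistency: mix pairs of
    unlabeled points with their predicted soft labels; a consistency
    loss would match the model's output on each mixed point to the
    mixed prediction during training."""
    neg_x = mine_reliable_negatives(unl_x, score_fn, k)
    sup = [mixup(p, n, 1.0, 0.0, rng=rng) for p, n in zip(pos_x, neg_x)]
    perm = rng.permutation(len(unl_x))
    unsup = [mixup(unl_x[i], unl_x[j],
                   score_fn(unl_x[i]), score_fn(unl_x[j]), rng=rng)
             for i, j in enumerate(perm)]
    return sup, unsup
```

The margin loss on positive-unlabeled pairs mentioned in the abstract would be a separate training-time term on top of these augmented batches; it is omitted here for brevity.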
Related papers
- Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples [3.4546761246181696]
We propose a self-supervised contrastive learning approach to fully exploit a large amount of unlabeled data.
The results show that self-supervised contrastive learning significantly improves classification accuracy.
arXiv Detail & Related papers (2024-08-03T22:33:13Z)
- Beyond Myopia: Learning from Positive and Unlabeled Data through Holistic Predictive Trends [26.79150786180822]
We unveil an intriguing yet long-overlooked observation in PUL.
Predictive trends for positive and negative classes display distinctly different patterns.
We propose a novel TPP-inspired measure for trend detection and prove its unbiasedness in predicting changes.
arXiv Detail & Related papers (2023-10-06T08:06:15Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction [48.929877651182885]
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature.
We propose a new robust PU learning method with a training strategy motivated by the nature of human learning.
arXiv Detail & Related papers (2023-08-01T04:34:52Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
In this paper, we propose a label distribution perspective for PU learning.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it attractive to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we suggest incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data, treats the result as noisy-labeled data, and trains a deep neural network on it.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
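The scheme this entry describes can be sketched as a single round of pseudo-labeling; the snippet below is an assumed illustration (the function names and thresholding rule are placeholders, not the paper's code), where `score_fn` and `train_fn` stand in for any probabilistic classifier.

```python
import numpy as np

def pseudo_label_round(pos_x, unl_x, score_fn, train_fn, threshold=0.5):
    """One round: assign hard pseudo-labels to the unlabeled pool by
    thresholding an initial model's scores, then retrain on positives
    plus the (noisy) pseudo-labeled examples."""
    scores = np.array([score_fn(x) for x in unl_x])
    pseudo_y = (scores >= threshold).astype(int)      # noisy labels
    x = np.concatenate([pos_x, unl_x])
    y = np.concatenate([np.ones(len(pos_x), dtype=int), pseudo_y])
    return train_fn(x, y), pseudo_y
```

In practice such rounds are typically iterated, with the retrained model producing better scores for the next round of pseudo-labeling.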
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Improving Positive Unlabeled Learning: Practical AUL Estimation and New Training Method for Extremely Imbalanced Data Sets [10.870831090350402]
We improve Positive Unlabeled (PU) learning over the state of the art in two aspects.
First, we propose an unbiased practical AUL estimation method, which makes use of raw PU data without prior knowledge of unlabeled samples.
Second, we propose ProbTagging, a new training method for extremely imbalanced data sets.
arXiv Detail & Related papers (2020-04-21T08:32:57Z)
- Learning from Positive and Unlabeled Data with Arbitrary Positive Shift [11.663072799764542]
This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data.
We integrate this into two statistically consistent methods to address arbitrary positive bias.
Experimental results demonstrate our methods' effectiveness across numerous real-world datasets.
arXiv Detail & Related papers (2020-02-24T13:53:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.