Related papers: PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

URL: http://arxiv.org/abs/2407.06698v1
Date: Tue, 9 Jul 2024 09:19:01 GMT
Title: PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision
Authors: Chengjie Wang, Chengming Xu, Zhenye Gan, Jianlong Hu, Wenbing Zhu, Lizhuag Ma,
Abstract summary: Positive and Unlabeled (PU) learning, a binary classification model trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions. We introduce a pseudo-supervised PU learning framework (PSPU), in which we train the PU model first, use it to gather confident samples for the pseudo supervision, and then apply these supervision to correct the PU model's weights. Our PSPU outperforms recent PU learning methods significantly on MNIST, CIFAR-10, CIFAR-100 in both balanced and imbalanced settings, and enjoys competitive performance on MVTecAD for industrial anomaly detection
Score: 27.690637059377643
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Positive and Unlabeled (PU) learning, a binary classification model trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions. To address this, we introduce a pseudo-supervised PU learning framework (PSPU), in which we train the PU model first, use it to gather confident samples for the pseudo supervision, and then apply these supervision to correct the PU model's weights by leveraging non-PU objectives. We also incorporate an additional consistency loss to mitigate noisy sample effects. Our PSPU outperforms recent PU learning methods significantly on MNIST, CIFAR-10, CIFAR-100 in both balanced and imbalanced settings, and enjoys competitive performance on MVTecAD for industrial anomaly detection.

Related papers

STAR : Bridging Statistical and Agentic Reasoning for Large Model Performance Prediction [78.0692157478247]
We propose STAR, a framework that bridges data-driven STatistical expectations with knowledge-driven Agentic Reasoning.<n>We show that STAR consistently outperforms all baselines on both score-based and rank-based metrics.
arXiv Detail & Related papers (2026-02-12T16:30:07Z)
Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning [24.345089357698985]
Negative-Unlabeled (PU) learning aims to train a binary classifier where only limited positive data and abundant unlabeled data are available.<n>We propose NcPU, a non-contrastive PU learning framework that requires no auxiliary information.<n>We show that NcPU achieves substantial improvements over state-of-the-art PU methods across diverse datasets.
arXiv Detail & Related papers (2025-09-30T18:22:30Z)
Positive and Unlabeled Data: Model, Estimation, Inference, and Classification [10.44075062541605]
This study introduces a new approach to addressing positive and unlabeled (PU) data through the double exponential tilting model (DETM) Traditional methods often fall short because they only apply to selected completely at random (SCAR) PU data. Our DETM's dual structure effectively accommodates the more complex and underexplored selected at random PU data.
arXiv Detail & Related papers (2024-07-13T00:57:04Z)
A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification [61.473485511491795]
Semi-supervised learning (SSL) is a practical challenge in computer vision. Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, obtain the State Of The Art (SOTA) performances in SSL. We propose a lightweight channel-based ensemble method to consolidate multiple inferior PLs into the theoretically guaranteed unbiased and low-variance one.
arXiv Detail & Related papers (2024-03-27T09:49:37Z)
Towards Robust Federated Learning via Logits Calibration on Non-IID Data [49.286558007937856]
Federated learning (FL) is a privacy-preserving distributed management framework based on collaborative model training of distributed devices in edge networks. Recent studies have shown that FL is vulnerable to adversarial examples, leading to a significant drop in its performance. In this work, we adopt the adversarial training (AT) framework to improve the robustness of FL models against adversarial example (AE) attacks.
arXiv Detail & Related papers (2024-03-05T09:18:29Z)
Understanding Contrastive Representation Learning from Positive Unlabeled (PU) Data [28.74519165747641]
We study the problem of Positive Unlabeled (PU) learning, where only a small set of labeled positives and a large unlabeled pool are available. We introduce Positive Unlabeled Contrastive Learning (puCL), an unbiased and variance reducing contrastive objective. When the class prior is known, we propose Positive Unlabeled InfoNCE (puNCE), a prior-aware extension that re-weights unlabeled samples as soft positive negative mixtures.
arXiv Detail & Related papers (2024-02-08T20:20:54Z)
Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness. Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
Learning from Positive and Unlabeled Data with Augmented Classes [17.97372291914351]
We propose an unbiased risk estimator for PU learning with Augmented Classes (PUAC) We derive the estimation error bound for the proposed estimator, which provides a theoretical guarantee for its convergence to the optimal solution.
arXiv Detail & Related papers (2022-07-27T03:40:50Z)
Uncertainty-aware Pseudo-label Selection for Positive-Unlabeled Learning [10.014356492742074]
We propose to tackle the issues of imbalanced datasets and model calibration in a positive-unlabeled learning setting. By boosting the signal from the minority class, pseudo-labeling expands the labeled dataset with new samples from the unlabeled set. Within a series of experiments, PUUPL yields substantial performance gains in highly imbalanced settings.
arXiv Detail & Related papers (2022-01-31T12:55:47Z)
Improving Self-supervised Learning with Automated Unsupervised Outlier Arbitration [83.29856873525674]
We introduce a lightweight latent variable model UOTA, targeting the view sampling issue for self-supervised learning. Our method directly generalizes to many mainstream self-supervised learning approaches.
arXiv Detail & Related papers (2021-12-15T14:05:23Z)
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation. We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models. We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
Robust Disentanglement of a Few Factors at a Time [5.156484100374058]
We introduce population-based training (PBT) for improving consistency in training variational autoencoders (VAEs) We then use Unsupervised Disentanglement Ranking (UDR) as an unsupervised to score models in our PBT-VAE training and show how models trained this way tend to consistently disentangle only a subset of the generative factors. We show striking improvement in state-of-the-art unsupervised disentanglement performance and robustness across multiple datasets and metrics.
arXiv Detail & Related papers (2020-10-26T12:34:23Z)
Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
MixPUL: Consistency-based Augmentation for Positive and Unlabeled Learning [8.7382177147041]
We propose a simple yet effective data augmentation method, coinedalgo, based on emphconsistency regularization. algoincorporates supervised and unsupervised consistency training to generate augmented data. We show thatalgoachieves an averaged improvement of classification error from 16.49 to 13.09 on the CIFAR-10 dataset across different positive data amount.
arXiv Detail & Related papers (2020-04-20T15:43:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.