Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning
- URL: http://arxiv.org/abs/2510.01278v1
- Date: Tue, 30 Sep 2025 18:22:30 GMT
- Title: Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning
- Authors: Hengwei Zhao, Zhengzhong Tu, Zhuo Zheng, Wei Wang, Junjue Wang, Rusty Feagin, Wenzhe Jiao,
- Abstract summary: Positive-Unlabeled (PU) learning aims to train a binary classifier where only limited positive data and abundant unlabeled data are available. We propose NcPU, a non-contrastive PU learning framework that requires no auxiliary information. We show that NcPU achieves substantial improvements over state-of-the-art PU methods across diverse datasets.
- Score: 24.345089357698985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Positive-Unlabeled (PU) learning aims to train a binary classifier (positive vs. negative) where only limited positive data and abundant unlabeled data are available. While widely applicable, state-of-the-art PU learning methods substantially underperform their supervised counterparts on complex datasets, especially without auxiliary negatives or pre-estimated parameters (e.g., a 14.26% gap on CIFAR-100). We identify the primary bottleneck as the challenge of learning discriminative representations under unreliable supervision. To tackle this challenge, we propose NcPU, a non-contrastive PU learning framework that requires no auxiliary information. NcPU combines a noisy-pair robust supervised non-contrastive loss (NoiSNCL), which aligns intra-class representations despite unreliable supervision, with a phantom label disambiguation (PLD) scheme that supplies conservative negative supervision via regret-based label updates. Theoretically, NoiSNCL and PLD can iteratively benefit each other when viewed through the lens of the Expectation-Maximization framework. Empirically, extensive experiments demonstrate that: (1) NoiSNCL enables simple PU methods to achieve competitive performance; and (2) NcPU achieves substantial improvements over state-of-the-art PU methods across diverse datasets, including challenging post-disaster building damage mapping datasets, highlighting its promise for real-world applications. Code: to be open-sourced after review.
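The authors' code is not yet released, so the following is only a minimal PyTorch sketch of the ingredient the abstract names: a supervised non-contrastive loss that pulls together representations sharing a (possibly noisy) pseudo-label, using positive pairs and a stop-gradient target branch, with no negative pairs. The pair-truncation heuristic standing in for noisy-pair robustness, and all names such as `supervised_noncontrastive_loss` and `keep_ratio`, are assumptions for illustration, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def supervised_noncontrastive_loss(z_online, z_target, pseudo_labels, keep_ratio=0.8):
    # z_online: online-branch embeddings, (N, D)
    # z_target: momentum/target-branch embeddings, (N, D)
    # pseudo_labels: current hard labels from the disambiguation step, (N,)
    z_o = F.normalize(z_online, dim=1)
    z_t = F.normalize(z_target, dim=1).detach()        # stop-gradient, BYOL/SimSiam style
    same = pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)
    same.fill_diagonal_(False)                         # exclude self-pairs
    sim = z_o @ z_t.T                                  # (N, N) cosine similarities
    pair_sim = sim[same]                               # same-(pseudo)label pairs only
    if pair_sim.numel() == 0:
        return sim.new_zeros(())
    k = max(1, int(keep_ratio * pair_sim.numel()))
    trusted, _ = pair_sim.topk(k)                      # drop the least-similar pairs,
    return -trusted.mean()                             # which are the most likely noisy
```

The abstract's PLD scheme would sit outside this loss: a regret-based rule periodically revises the phantom (assumed-negative) labels on unlabeled samples, and the two steps then alternate in an EM-style loop.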
Related papers
- Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms [54.58593451541316]
We propose the first PU learning benchmark to systematically compare PU learning algorithms.
We identify subtle yet critical factors that affect the realistic and fair evaluation of PU learning algorithms.
arXiv Detail & Related papers (2025-09-29T03:13:00Z)
- Unlocking the Hidden Treasures: Enhancing Recommendations with Unlabeled Data [12.53644929739924]
Collaborative filtering (CF) stands as a cornerstone in recommender systems.
We introduce a novel positive-neutral-negative (PNN) learning paradigm.
PNN offers a promising solution to learning complex user preferences.
arXiv Detail & Related papers (2024-12-24T05:07:55Z)
- Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques [65.55451717632317]
We study Preference-Based Multi-Agent Reinforcement Learning (PbMARL).
We identify the Nash equilibrium from a preference-only offline dataset in general-sum games.
Our findings underscore the multifaceted approach required for PbMARL.
arXiv Detail & Related papers (2024-09-01T13:14:41Z)
- PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision [27.690637059377643]
Positive and Unlabeled (PU) learning, in which a binary classifier is trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions.
We introduce a pseudo-supervised PU learning framework (PSPU): we train the PU model first, use it to gather confident samples for pseudo supervision, and then apply this supervision to correct the PU model's weights (a toy sketch of the confident-sample step appears after this list).
PSPU outperforms recent PU learning methods significantly on MNIST, CIFAR-10, and CIFAR-100 in both balanced and imbalanced settings, and enjoys competitive performance on MVTecAD for industrial anomaly detection.
arXiv Detail & Related papers (2024-07-09T09:19:01Z)
- Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning [71.9954600831939]
Positive-Unlabeled (PU) learning is vital in many real-world scenarios, but its application to graph data remains under-explored.
We unveil that a critical challenge for PU learning on graphs lies in edge heterophily, which directly violates the irreducibility assumption for class-prior estimation.
In response to this challenge, we introduce a new method named Graph PU Learning with Label Propagation Loss (GPL).
arXiv Detail & Related papers (2024-05-30T10:30:44Z)
- Understanding Contrastive Representation Learning from Positive Unlabeled (PU) Data [28.74519165747641]
We study the problem of Positive Unlabeled (PU) learning, where only a small set of labeled positives and a large unlabeled pool are available.
We introduce Positive Unlabeled Contrastive Learning (puCL), an unbiased and variance-reducing contrastive objective.
When the class prior is known, we propose Positive Unlabeled InfoNCE (puNCE), a prior-aware extension that re-weights unlabeled samples as soft positive-negative mixtures (a toy re-weighting sketch appears after this list).
arXiv Detail & Related papers (2024-02-08T20:20:54Z)
- Robust Representation Learning for Unreliable Partial Label Learning [86.909511808373]
Partial Label Learning (PLL) is a type of weakly supervised learning where each training instance is assigned a set of candidate labels, but only one label is the ground-truth.
This setting, known as Unreliable Partial Label Learning (UPLL), introduces additional complexity due to the inherent unreliability and ambiguity of partial labels.
We propose the Unreliability-Robust Representation Learning framework (URRL), which leverages unreliability-robust contrastive learning to fortify the model against unreliable partial labels.
arXiv Detail & Related papers (2023-08-31T13:37:28Z)
- Learning from Positive and Unlabeled Data with Augmented Classes [17.97372291914351]
We propose an unbiased risk estimator for PU learning with Augmented Classes (PUAC).
We derive the estimation error bound for the proposed estimator, which provides a theoretical guarantee for its convergence to the optimal solution.
arXiv Detail & Related papers (2022-07-27T03:40:50Z)
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment [73.61888777504377]
Full-reference (FR) image quality assessment (IQA) evaluates the visual quality of a distorted image by measuring its perceptual difference from a pristine-quality reference.
Unlabeled data can be easily collected from an image degradation or restoration process, making it encouraging to exploit unlabeled training data to boost FR-IQA performance.
In this paper, we suggest incorporating semi-supervised and positive-unlabeled (PU) learning to exploit unlabeled data while mitigating the adverse effect of outliers.
arXiv Detail & Related papers (2022-04-19T09:10:06Z)
- On Leveraging Unlabeled Data for Concurrent Positive-Unlabeled Classification and Robust Generation [72.062661402124]
We present a novel training framework to jointly target PU classification and conditional generation when exposed to extra data.
We prove the optimality condition of CNI-CGAN and conduct extensive experimental evaluations on diverse datasets.
arXiv Detail & Related papers (2020-06-14T08:27:40Z)
- MixPUL: Consistency-based Augmentation for Positive and Unlabeled Learning [8.7382177147041]
We propose a simple yet effective data augmentation method, coined MixPUL, based on consistency regularization.
MixPUL incorporates supervised and unsupervised consistency training to generate augmented data.
We show that MixPUL reduces average classification error from 16.49 to 13.09 on the CIFAR-10 dataset across different amounts of positive data.
arXiv Detail & Related papers (2020-04-20T15:43:33Z)
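As flagged in the PSPU entry above, here is a toy sketch of its confident-sample step: the trained PU model's most confident predictions on unlabeled data are kept as pseudo-labels for a correction pass. The threshold rule and function name are assumptions, not the paper's actual criterion.

```python
import torch

def gather_confident_pseudo_labels(probs, threshold=0.95):
    # probs: the PU model's positive-class probabilities on unlabeled data, (N,)
    # Keep only samples scored very confidently in either direction.
    confident = (probs > threshold) | (probs < 1 - threshold)
    idx = confident.nonzero(as_tuple=True)[0]
    pseudo = (probs[idx] > 0.5).long()   # 1 = positive, 0 = negative
    return idx, pseudo                   # feed into a supervised correction pass
```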
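And to make the puNCE entry concrete, a toy rendition of prior-based re-weighting in an InfoNCE-style objective: labeled positive anchors attract all other labeled positives, while each unlabeled anchor mixes that supervised term (weight pi) with a purely self-supervised term over its own augmentation (weight 1 - pi). This is an illustrative reading of the entry's one-line description, not the authors' implementation; `prior`, the masking, and the mixing rule are assumptions.

```python
import torch
import torch.nn.functional as F

def punce_loss(z1, z2, is_labeled_pos, prior, temperature=0.5):
    # z1, z2: embeddings of two augmented views, each (N, D)
    # is_labeled_pos: bool mask of labeled positives, (N,)
    # prior: assumed positive class prior pi in (0, 1)
    z = F.normalize(torch.cat([z1, z2]), dim=1)           # (2N, D), views stacked
    n = z1.size(0)
    labeled = is_labeled_pos.repeat(2)                    # (2N,) bool
    sim = z @ z.T / temperature
    sim.fill_diagonal_(float("-inf"))                     # remove self-similarity
    log_p = sim.log_softmax(dim=1)

    idx = torch.arange(2 * n, device=z.device)
    aug_pair = (idx + n) % (2 * n)                        # index of the other view
    self_term = log_p[idx, aug_pair]                      # own augmentation only

    pos_mask = labeled.unsqueeze(0).expand(2 * n, -1).clone()
    pos_mask[idx, idx] = False                            # never count yourself
    pos_mask[idx, aug_pair] = True                        # always count your other view
    sup_term = log_p.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1)

    # Labeled anchors use the supervised term; unlabeled anchors mix the two.
    loss = torch.where(labeled, -sup_term,
                       -(prior * sup_term + (1 - prior) * self_term))
    return loss.mean()
```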