Cost-Sensitive Unbiased Risk Estimation for Multi-Class Positive-Unlabeled Learning
- URL: http://arxiv.org/abs/2510.25226v1
- Date: Wed, 29 Oct 2025 07:01:32 GMT
- Title: Cost-Sensitive Unbiased Risk Estimation for Multi-Class Positive-Unlabeled Learning
- Authors: Miao Zhang, Junpeng Li, Changchun Hua, Yana Yang
- Abstract summary: Positive-Unlabeled (PU) learning considers settings in which only positive and unlabeled data are available, while negatives are missing or left unlabeled. We propose a cost-sensitive multi-class PU method based on adaptive loss weighting. Experiments on eight public datasets, spanning varying class priors and numbers of classes, show consistent gains over strong baselines in both accuracy and stability.
- Score: 33.15955234458642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Positive-Unlabeled (PU) learning considers settings in which only positive and unlabeled data are available, while negatives are missing or left unlabeled. This situation is common in real applications where annotating reliable negatives is difficult or costly. Despite substantial progress in PU learning, the multi-class case (MPU) remains challenging: many existing approaches do not ensure unbiased risk estimation, which limits performance and stability. We propose a cost-sensitive multi-class PU method based on adaptive loss weighting. Within the empirical risk minimization framework, we assign distinct, data-dependent weights to the positive and inferred-negative (from the unlabeled mixture) loss components so that the resulting empirical objective is an unbiased estimator of the target risk. We formalize the MPU data-generating process and establish a generalization error bound for the proposed estimator. Extensive experiments on eight public datasets, spanning varying class priors and numbers of classes, show consistent gains over strong baselines in both accuracy and stability.
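The abstract does not state the paper's multi-class weights, so the sketch below is not the authors' estimator. It illustrates the general idea of an unbiased PU risk using the standard binary form, R(f) = pi * E_P[l(f(x), +1)] + E_U[l(f(x), -1)] - pi * E_P[l(f(x), -1)], where pi is the positive class prior. The function names, the logistic loss, and the synthetic scores are all assumptions made for illustration.

```python
import numpy as np

def logistic_loss(scores, y):
    # log(1 + exp(-y * score)), computed in a numerically stable way
    return np.logaddexp(0.0, -y * scores)

def unbiased_pu_risk(scores_p, scores_u, prior, loss=logistic_loss):
    # Standard binary unbiased PU risk (illustrative; not the paper's
    # multi-class, cost-sensitive estimator).
    risk_p_pos = loss(scores_p, +1).mean()   # positives treated as positive
    risk_p_neg = loss(scores_p, -1).mean()   # positives treated as negative
    risk_u_neg = loss(scores_u, -1).mean()   # unlabeled treated as negative
    # The last term removes the positives' contribution hidden in the
    # unlabeled mixture, leaving an estimate of the negative-class risk.
    return prior * risk_p_pos + risk_u_neg - prior * risk_p_neg

def supervised_risk(scores_p, scores_n, prior, loss=logistic_loss):
    # Fully supervised (positive-negative) risk, used only as a reference.
    return prior * loss(scores_p, +1).mean() + (1 - prior) * loss(scores_n, -1).mean()

# Hypothetical classifier scores for 300 positives and 700 negatives.
rng = np.random.default_rng(0)
scores_p = rng.normal(1.0, 1.0, size=300)
scores_n = rng.normal(-1.0, 1.0, size=700)

# Build the unlabeled set as the exact mixture of both groups, so the
# class prior is 300 / 1000 and the two risks coincide exactly.
scores_u = np.concatenate([scores_p, scores_n])
prior = 0.3

pu = unbiased_pu_risk(scores_p, scores_u, prior)
pn = supervised_risk(scores_p, scores_n, prior)
print(np.isclose(pu, pn))
```

Here the unlabeled set contains the labeled positives themselves, so the PU estimate matches the supervised risk up to floating-point error. With independently drawn samples the two agree only in expectation, which is what unbiasedness means.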
Related papers
- Semi-Supervised Regression with Heteroscedastic Pseudo-Labels [50.54050677867914]
We propose an uncertainty-aware pseudo-labeling framework that dynamically adjusts pseudo-label influence from a bi-level optimization perspective. We provide theoretical insights and extensive experiments to validate our approach across various benchmark SSR datasets.
arXiv Detail & Related papers (2025-10-17T03:06:23Z)
- Learning from M-Tuple Dominant Positive and Unlabeled Data [9.568395664931504]
This paper proposes a generalized learning framework MDPU to better align with real-world application scenarios. We derive an unbiased risk estimator that satisfies risk consistency based on the empirical risk minimization (ERM) method. To mitigate the inevitable overfitting issue during training, a risk correction method is introduced, leading to the development of a corrected risk estimator.
arXiv Detail & Related papers (2025-05-25T13:20:11Z)
- Constraint Multi-class Positive and Unlabeled Learning for Distantly Supervised Named Entity Recognition [4.532252099910803]
We present a novel approach called Constraint Multi-class Positive and Unlabeled Learning (CMPU), which introduces a constraint factor on the risk estimator of multiple positive classes. It suggests that the constraint non-negative risk estimator is more robust against overfitting than previous PU learning methods with limited positive data.
arXiv Detail & Related papers (2025-04-07T11:51:41Z)
- An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes [46.663081214928226]
We propose an unbiased risk estimator with theoretical guarantees for PLLAC.
We provide a theoretical analysis of the estimation error bound of PLLAC.
Experiments on benchmark, UCI and real-world datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-09-29T07:36:16Z)
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- Uncertainty-Aware Instance Reweighting for Off-Policy Learning [63.31923483172859]
We propose an Uncertainty-aware Inverse Propensity Score estimator (UIPS) for improved off-policy learning.
Experiment results on synthetic and three real-world recommendation datasets demonstrate the advantageous sample efficiency of the proposed UIPS estimator.
arXiv Detail & Related papers (2023-03-11T11:42:26Z)
- Learning from Positive and Unlabeled Data with Augmented Classes [17.97372291914351]
We propose an unbiased risk estimator for PU learning with Augmented Classes (PUAC).
We derive the estimation error bound for the proposed estimator, which provides a theoretical guarantee for its convergence to the optimal solution.
arXiv Detail & Related papers (2022-07-27T03:40:50Z)
- Learning from Multiple Unlabeled Datasets with Partial Risk Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels.
We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets.
We then find that the classifier obtained as such tends to cause overfitting as its empirical risks go negative during training.
Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z)
- Uncertainty-aware Pseudo-label Selection for Positive-Unlabeled Learning [10.014356492742074]
We propose to tackle the issues of imbalanced datasets and model calibration in a positive-unlabeled learning setting.
By boosting the signal from the minority class, pseudo-labeling expands the labeled dataset with new samples from the unlabeled set.
Within a series of experiments, PUUPL yields substantial performance gains in highly imbalanced settings.
arXiv Detail & Related papers (2022-01-31T12:55:47Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)