Learning from Positive and Unlabeled Data with Augmented Classes
- URL: http://arxiv.org/abs/2207.13274v1
- Date: Wed, 27 Jul 2022 03:40:50 GMT
- Title: Learning from Positive and Unlabeled Data with Augmented Classes
- Authors: Zhongnian Li, Liutao Yang, Zhongchen Ma, Tongfeng Sun, Xinzheng Xu and Daoqiang Zhang
- Abstract summary: We propose an unbiased risk estimator for PU learning with Augmented Classes (PUAC).
We derive the estimation error bound for the proposed estimator, which provides a theoretical guarantee for its convergence to the optimal solution.
- Score: 17.97372291914351
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Positive Unlabeled (PU) learning aims to learn a binary classifier from only
positive and unlabeled data, which is utilized in many real-world scenarios.
However, existing PU learning algorithms cannot deal with the real-world
challenge in an open and changing scenario, where examples from unobserved
augmented classes may emerge in the testing phase. In this paper, we propose an
unbiased risk estimator for PU learning with Augmented Classes (PUAC) by
utilizing unlabeled data from the augmented classes distribution, which can be
easily collected in many real-world scenarios. Besides, we derive the
estimation error bound for the proposed estimator, which provides a theoretical
guarantee for its convergence to the optimal solution. Experiments on multiple
realistic datasets demonstrate the effectiveness of the proposed approach.
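As a rough illustration of the kind of estimator involved, the sketch below implements the classical unbiased PU risk estimate, which rewrites the unavailable negative-class risk in terms of positive and unlabeled data given a known class prior. This is only a minimal sketch of the standard PU building block, not the PUAC estimator from the paper: the additional correction that uses unlabeled data from the augmented-classes distribution is not reproduced here, and the function names, the logistic loss choice, and the class prior pi_p = 0.4 are illustrative assumptions.

    import numpy as np

    def logistic_loss(scores, label):
        # log(1 + exp(-label * score)), computed stably for each score
        return np.logaddexp(0.0, -label * np.asarray(scores))

    def unbiased_pu_risk(scores_p, scores_u, pi_p, loss=logistic_loss):
        # Classical unbiased PU risk estimate (illustrative sketch only):
        #   R(g) ~= pi_p * E_p[loss(g(x), +1)] + E_u[loss(g(x), -1)]
        #           - pi_p * E_p[loss(g(x), -1)]
        # The unlabeled term stands in for the unavailable negative data,
        # with the positive contribution subtracted back out.
        risk_p_pos = pi_p * np.mean(loss(scores_p, +1))
        risk_u_neg = np.mean(loss(scores_u, -1))
        risk_p_neg = pi_p * np.mean(loss(scores_p, -1))
        return risk_p_pos + risk_u_neg - risk_p_neg

    # Toy usage with synthetic classifier scores g(x).
    rng = np.random.default_rng(0)
    scores_p = rng.normal(1.0, 1.0, size=100)   # scores on labeled positives
    scores_u = rng.normal(0.0, 1.5, size=500)   # scores on unlabeled examples
    print(unbiased_pu_risk(scores_p, scores_u, pi_p=0.4))

With a flexible model the subtracted term can drive this estimate negative, which is why practical PU implementations often clip the negative-risk part at zero; the paper's PUAC estimator additionally exploits unlabeled data drawn from the augmented-classes distribution, as described in the abstract above.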
Related papers
- An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes [46.663081214928226]
We propose an unbiased risk estimator with theoretical guarantees for PLLAC.
We provide a theoretical analysis of the estimation error bound of PLLAC.
Experiments on benchmark, UCI and real-world datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-09-29T07:36:16Z) - Meta-learning for Positive-unlabeled Classification [40.11462237689747]
The proposed method minimizes the test classification risk after the model is adapted to PU data.
The method embeds each instance into a task-specific space using neural networks.
We empirically show that the proposed method outperforms existing methods on one synthetic and three real-world datasets.
arXiv Detail & Related papers (2024-06-06T01:50:01Z) - PUAL: A Classifier on Trifurcate Positive-Unlabeled Data [29.617810881312867]
We propose a PU classifier with asymmetric loss (PUAL).
We develop a kernel-based algorithm that enables PUAL to obtain a non-linear decision boundary.
Through experiments on both simulated and real-world datasets, we show that PUAL achieves satisfactory classification on trifurcate data.
arXiv Detail & Related papers (2024-05-31T16:18:06Z) - SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning [49.94607673097326]
We propose a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data.
Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization algorithm.
Our method showcases consistent state-of-the-art performance across diverse benchmarks and data distribution scenarios.
arXiv Detail & Related papers (2024-02-21T03:39:04Z) - Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z) - A Generalized Unbiased Risk Estimator for Learning with Augmented
Classes [70.20752731393938]
Given unlabeled data, an unbiased risk estimator (URE) can be derived, which can be minimized for LAC with theoretical guarantees.
We propose a generalized URE that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees.
arXiv Detail & Related papers (2023-06-12T06:52:04Z) - Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z) - Learning from Similarity-Confidence Data [94.94650350944377]
We investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data.
We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate.
arXiv Detail & Related papers (2021-02-13T07:31:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.