Classification from Positive and Biased Negative Data with Skewed
Labeled Posterior Probability
- URL: http://arxiv.org/abs/2203.05749v1
- Date: Fri, 11 Mar 2022 04:31:35 GMT
- Title: Classification from Positive and Biased Negative Data with Skewed
Labeled Posterior Probability
- Authors: Shotaro Watanabe and Hidetoshi Matsui
- Abstract summary: We propose a new method to approach the positive and biased negative (PbN) classification problem.
We incorporate a method to correct the negative impact due to skewed confidence, which represents the posterior probability that the observed data are positive.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In some binary classification problems, only biased data are
observed in one of the classes. In this paper, we propose a new method to
approach the positive and biased negative (PbN) classification problem, a
weakly supervised learning problem in which a binary classifier is learned from
positive data and negative data with biased observations. We incorporate a
method to correct the negative impact due to skewed confidence, which
represents the posterior probability that the observed data are positive. This
reduces the distortion of the posterior probability that the data are labeled,
which is necessary for the empirical risk minimization of the PbN
classification problem. We verified the effectiveness of the proposed method by
numerical experiments and real data analysis.
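To make the role of the labeled posterior concrete, the following sketch checks a standard change-of-measure identity on a tiny discrete example: an expectation weighted by 1 - sigma(x) over all data can be rewritten as an expectation over labeled data only, weighted by (1 - sigma(x)) / sigma(x), where sigma(x) = p(s = 1 | x) is the probability that x is labeled. The abstract does not spell out the paper's risk estimator, so this only illustrates why distortion in sigma matters and is not the authors' method. All numbers, the function names, and the squared-sigma skew are assumptions made for the example.

```python
# Illustrative check on a three-point input space (all numbers are made up).
p_x = [0.5, 0.3, 0.2]      # marginal p(x)
sigma = [0.9, 0.5, 0.1]    # sigma(x) = p(s=1 | x), probability that x is labeled
h = [1.0, 2.0, 4.0]        # any per-sample quantity, e.g. a loss value


def weighted_direct(p_x, sigma, h):
    """E_x[(1 - sigma(x)) h(x)], computed from the full marginal p(x)."""
    return sum(p * (1.0 - s) * v for p, s, v in zip(p_x, sigma, h))


def weighted_from_labeled(p_x, sigma, h, weight_sigma=None):
    """Same quantity, computed from the labeled distribution p(x | s=1) only.

    weight_sigma is the posterior used inside the weights (1 - sigma) / sigma.
    It defaults to the true sigma; passing a distorted version shows the bias.
    """
    if weight_sigma is None:
        weight_sigma = sigma
    p_s1 = sum(p * s for p, s in zip(p_x, sigma))               # p(s=1)
    p_x_given_s1 = [p * s / p_s1 for p, s in zip(p_x, sigma)]   # Bayes' rule
    return p_s1 * sum(q * (1.0 - w) / w * v
                      for q, w, v in zip(p_x_given_s1, weight_sigma, h))


direct = weighted_direct(p_x, sigma, h)
reweighted = weighted_from_labeled(p_x, sigma, h)

# Hypothetical skew: the weights use sigma squared instead of the true sigma.
skewed_sigma = [s ** 2 for s in sigma]
distorted = weighted_from_labeled(p_x, sigma, h, weight_sigma=skewed_sigma)

print(direct, reweighted, distorted)
```

Here `direct` and `reweighted` agree, while `distorted` differs substantially because small values of sigma turn into very large weights.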
Related papers
- Contrastive Learning with Negative Sampling Correction [52.990001829393506]
We propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).
PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct bias in contrastive loss.
PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks.
arXiv Detail & Related papers (2024-01-13T11:18:18Z)
- Joint empirical risk minimization for instance-dependent positive-unlabeled data [4.112909937203119]
Learning from positive and unlabeled data (PU learning) is an actively researched machine learning task.
The goal is to train a binary classification model from a dataset containing a labeled subset of the positives together with unlabeled instances.
The unlabeled set includes the remaining positives and all negative observations.
arXiv Detail & Related papers (2023-12-27T12:45:12Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data [11.217084610985674]
We address the issue of binary classification from positive and unlabeled data (PU classification) with a selection bias in the positive data.
This scenario represents a conceptual framework for many practical applications, such as recommender systems.
We propose a method to identify the function of interest using a strong ignorability assumption and develop an "Automatic Debiased PUE" (ADPUE) learning method.
arXiv Detail & Related papers (2023-03-08T18:45:22Z)
- Learning From Positive and Unlabeled Data Using Observer-GAN [0.0]
The problem of learning from positive and unlabeled data (a.k.a. PU learning) has been studied in a binary (i.e., positive versus negative) classification setting.
Generative Adversarial Networks (GANs) have been used to reduce the problem to the supervised setting with the advantage that supervised learning has state-of-the-art accuracy in classification tasks.
arXiv Detail & Related papers (2022-08-26T07:35:28Z)
- Nonuniform Negative Sampling and Log Odds Correction with Rare Events Data [15.696653979226113]
We investigate the issue of parameter estimation with nonuniform negative sampling for imbalanced data.
We derive a general inverse probability weighted (IPW) estimator and obtain the optimal sampling probability that minimizes its variance.
Both theoretical and empirical results demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2021-10-25T15:37:22Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data which is used as noisy-labeled data, and trains a deep neural network using the noisy-labeled data.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Learning from Similarity-Confidence Data [94.94650350944377]
We investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data.
We propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate.
arXiv Detail & Related papers (2021-02-13T07:31:16Z)
- Learning from Positive and Unlabeled Data with Arbitrary Positive Shift [11.663072799764542]
This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data.
We integrate this into two statistically consistent methods to address arbitrary positive bias.
Experimental results demonstrate our methods' effectiveness across numerous real-world datasets.
arXiv Detail & Related papers (2020-02-24T13:53:22Z)
- On Positive-Unlabeled Classification in GAN [130.43248168149432]
This paper defines a positive and unlabeled classification problem for standard GANs.
It then leads to a novel technique to stabilize the training of the discriminator in GANs.
arXiv Detail & Related papers (2020-02-04T05:59:37Z)
- Binary Classification from Positive Data with Skewed Confidence [85.18941440826309]
Positive-confidence (Pconf) classification is a promising weakly-supervised learning method.
In practice, the confidence may be skewed by bias arising in an annotation process.
We introduce a parameterized model of the skewed confidence and propose a method for selecting the hyperparameter.
arXiv Detail & Related papers (2020-01-29T00:04:36Z)
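As a rough illustration of the positive-confidence setting named in the last entry above, the sketch below evaluates a Pconf-style empirical risk, in which each positive sample contributes its own loss plus the loss for the opposite label weighted by (1 - r) / r, with r the confidence p(y = +1 | x). The `skew_exponent` parameter is a hypothetical stand-in for a parameterized skew correction; it is an assumption for this example and is not taken from any of the listed papers. The logistic loss and all function names are likewise choices made here.

```python
import math


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), written to avoid overflow."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pconf_empirical_risk(scores, confidences, skew_exponent=1.0):
    """Average Pconf-style loss over positive samples.

    scores: classifier outputs g(x_i) on the positive samples.
    confidences: observed confidences, each in (0, 1].
    skew_exponent: HYPOTHETICAL correction; each observed confidence is
        raised to this power before use (1.0 means no correction).
    """
    total = 0.0
    for g, r_observed in zip(scores, confidences):
        r = r_observed ** skew_exponent
        # Loss for the positive label plus the weighted loss for the negative label.
        total += logistic_loss(g) + (1.0 - r) / r * logistic_loss(-g)
    return total / len(scores)


# Example: one sample with score 0 and observed confidence 0.5.
risk_plain = pconf_empirical_risk([0.0], [0.5])
risk_corrected = pconf_empirical_risk([0.0], [0.5], skew_exponent=2.0)
print(risk_plain, risk_corrected)
```

With confidence 1 for every sample, the second term vanishes and the risk reduces to the ordinary average loss on positive data; lower confidences put more weight on the negative-label loss, which is why a skewed confidence changes the learned classifier.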