Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data
- URL: http://arxiv.org/abs/2303.04797v1
- Date: Wed, 8 Mar 2023 18:45:22 GMT
- Title: Automatic Debiased Learning from Positive, Unlabeled, and Exposure Data
- Authors: Masahiro Kato, Shuting Wu, Kodai Kureishi, and Shota Yasui
- Abstract summary: We address the issue of binary classification from positive and unlabeled data (PU classification) with a selection bias in the positive data.
This scenario represents a conceptual framework for many practical applications, such as recommender systems.
We propose a method to identify the function of interest using a strong ignorability assumption and develop an ``Automatic Debiased PUE'' (ADPUE) learning method.
- Score: 11.217084610985674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the issue of binary classification from positive and unlabeled
data (PU classification) with a selection bias in the positive data. During the
observation process, (i) a sample is exposed to a user, (ii) the user then
returns the label for the exposed sample, and (iii) however, we can only observe
the positive samples. Therefore, the positive labels that we observe are a
combination of both the exposure and the labeling, which creates a selection
bias problem for the observed positive samples. This scenario represents a
conceptual framework for many practical applications, such as recommender
systems, which we refer to as ``learning from positive, unlabeled, and exposure
data'' (PUE classification). To tackle this problem, we initially assume access
to data with exposure labels. Then, we propose a method to identify the
function of interest using a strong ignorability assumption and develop an
``Automatic Debiased PUE'' (ADPUE) learning method. This algorithm directly
debiases the selection bias without requiring intermediate estimates, such as
the propensity score, which is necessary for other learning methods. Through
experiments, we demonstrate that our approach outperforms traditional PU
learning methods on various semi-synthetic datasets.
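To make the selection-bias problem concrete, the following is a minimal, self-contained sketch of the PUE observation process described in the abstract: it is not the authors' ADPUE algorithm, but a toy simulation with a hypothetical one-dimensional feature, where exposure depends on the feature and only exposed positives are observed. It also shows the propensity-score correction (inverse propensity weighting with oracle propensities) that the paper's method is designed to avoid estimating.

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

n = 50_000
records = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)                            # feature
    y = 1 if random.random() < sigmoid(2.0 * x) else 0    # true label
    e = sigmoid(1.5 * x - 0.5)                            # exposure propensity (depends on x)
    o = 1 if random.random() < e else 0                   # (i) exposure indicator
    op = o * y                                            # (ii)-(iii) only exposed positives are observed
    records.append((x, y, e, op))

true_rate = sum(y for _, y, _, _ in records) / n          # P(y = 1), unobservable in practice
naive_rate = sum(op for *_, op in records) / n            # treats unobserved samples as negative
ipw_rate = sum(op / e for _, _, e, op in records) / n     # oracle inverse-propensity weighting

print(f"true P(y=1)  = {true_rate:.3f}")
print(f"naive rate   = {naive_rate:.3f}")   # biased downward by the exposure step
print(f"IPW estimate = {ipw_rate:.3f}")     # close to the true rate
```

Because exposure and the label are both increasing in the feature here, the naive estimate that conflates "not observed positive" with "negative" is biased; reweighting each observed positive by its inverse exposure propensity removes the bias, but requires knowing or estimating the propensity, which is the intermediate step ADPUE dispenses with.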
Related papers
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction [48.929877651182885]
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature.
We propose a new robust PU learning method with a training strategy motivated by the nature of human learning.
arXiv Detail & Related papers (2023-08-01T04:34:52Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- Combining Self-labeling with Selective Sampling [2.0305676256390934]
This work combines self-labeling techniques with active learning in a selective sampling scenario.
We show that naive application of self-labeling can harm performance by introducing bias towards selected classes.
The proposed method matches current selective sampling methods or achieves better results.
arXiv Detail & Related papers (2023-01-11T11:58:45Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this view, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Classification from Positive and Biased Negative Data with Skewed Labeled Posterior Probability [0.0]
We propose a new method to approach the positive and biased negative (PbN) classification problem.
We incorporate a method to correct the negative impact due to skewed confidence, which represents the posterior probability that the observed data are positive.
arXiv Detail & Related papers (2022-03-11T04:31:35Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data, treats them as noisy labels, and trains a deep neural network on the resulting noisy-labeled data.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Learning from Positive and Unlabeled Data with Arbitrary Positive Shift [11.663072799764542]
This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data.
We integrate this into two statistically consistent methods to address arbitrary positive bias.
Experimental results demonstrate our methods' effectiveness across numerous real-world datasets.
arXiv Detail & Related papers (2020-02-24T13:53:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.