Rethinking Class-Prior Estimation for Positive-Unlabeled Learning
- URL: http://arxiv.org/abs/2002.03673v2
- Date: Fri, 3 Jun 2022 07:22:04 GMT
- Title: Rethinking Class-Prior Estimation for Positive-Unlabeled Learning
- Authors: Yu Yao and Tongliang Liu and Bo Han and Mingming Gong and Gang Niu and
Masashi Sugiyama and Dacheng Tao
- Abstract summary: Given only positive (P) and unlabeled (U) data, PU learning can train a binary classifier without any negative data.
Hitherto, the distributional-assumption-free CPE methods rely on a critical assumption that the support of the positive data distribution cannot be contained in the support of the negative data distribution.
We show an affirmative answer by proposing Regrouping CPE (ReCPE) that builds an auxiliary probability distribution such that the support of the positive data distribution is never contained in the support of the negative data distribution.
- Score: 199.51740898051486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given only positive (P) and unlabeled (U) data, PU learning can train a
binary classifier without any negative data. It has two building blocks: PU
class-prior estimation (CPE) and PU classification; the latter has been well
studied while the former has received less attention. Hitherto, the
distributional-assumption-free CPE methods rely on a critical assumption that
the support of the positive data distribution cannot be contained in the
support of the negative data distribution. If this is violated, those CPE
methods will systematically overestimate the class prior; worse still, the
assumption cannot be verified from the data. In this paper, we rethink CPE for
PU learning: can we remove the assumption to make CPE always valid? We
show an affirmative answer by proposing Regrouping CPE (ReCPE) that builds an
auxiliary probability distribution such that the support of the positive data
distribution is never contained in the support of the negative data
distribution. ReCPE can work with any CPE method by treating it as the base
method. Theoretically, ReCPE does not affect its base if the assumption already
holds for the original probability distribution; otherwise, it reduces the
positive bias of its base. Empirically, ReCPE improves all state-of-the-art CPE
methods on various datasets, implying that the assumption has indeed been
violated in these datasets.
Related papers
- Three Heads Are Better Than One: Complementary Experts for Long-Tailed Semi-supervised Learning [74.44500692632778]
We propose a novel method named ComPlementary Experts (CPE) to model various class distributions.
CPE achieves state-of-the-art performances on CIFAR-10-LT, CIFAR-100-LT, and STL-10-LT dataset benchmarks.
arXiv Detail & Related papers (2023-12-25T11:54:07Z) - LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond
The True Class [17.12486200215929]
LIPEx is a perturbation-based multi-class explanation framework.
It provides insight into how every feature deemed to be important affects the prediction probability for each of the possible classes.
arXiv Detail & Related papers (2023-10-07T15:31:38Z) - Proposal Distribution Calibration for Few-Shot Object Detection [65.19808035019031]
In few-shot object detection (FSOD), the two-step training paradigm is widely adopted to mitigate the severe sample imbalance.
Unfortunately, the extreme data scarcity aggravates the proposal distribution bias, hindering the RoI head from evolving toward novel classes.
We introduce a simple yet effective proposal distribution calibration (PDC) approach to neatly enhance the localization and classification abilities of the RoI head.
arXiv Detail & Related papers (2022-12-15T05:09:11Z) - Learnable Distribution Calibration for Few-Shot Class-Incremental
Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples.
We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z) - Robust and Efficient Imbalanced Positive-Unlabeled Learning with
Self-supervision [1.5675763601034223]
We present ImPULSeS, a unified representation learning framework for Imbalanced Positive Unlabeled Learning with Self-supervision.
We performed different experiments across multiple datasets to show that ImPULSeS is able to halve the error rate of the previous state-of-the-art.
arXiv Detail & Related papers (2022-09-06T12:54:59Z) - Bayes in Wonderland! Predictive supervised classification inference hits
unpredictability [1.8814209805277506]
We show the convergence of sBpc and mBpc under de Finetti-type exchangeability.
We also provide a parameter estimation of the generative model giving rise to the partition exchangeable sequence.
arXiv Detail & Related papers (2021-12-03T12:34:52Z) - Positive-Unlabeled Classification under Class-Prior Shift: A
Prior-invariant Approach Based on Density Ratio Estimation [85.75352990739154]
We propose a novel PU classification method based on density ratio estimation.
A notable advantage of our proposed method is that it does not require the class-priors in the training phase.
arXiv Detail & Related papers (2021-07-11T13:36:53Z) - Rethinking Ranking-based Loss Functions: Only Penalizing Negative
Instances before Positive Ones is Enough [55.55081785232991]
We argue that only penalizing negative instances before positive ones is enough, because the loss only comes from them.
Instead of following the AP-based loss, we propose a new loss, namely Penalizing Negative instances before Positive ones (PNP).
PNP-D may be more suitable for real-world data, which usually contains several local clusters for one class.
arXiv Detail & Related papers (2021-02-09T04:30:15Z) - Learning from Positive and Unlabeled Data with Arbitrary Positive Shift [11.663072799764542]
This paper shows that PU learning is possible even with arbitrarily non-representative positive data given unlabeled data.
We integrate this into two statistically consistent methods to address arbitrary positive bias.
Experimental results demonstrate our methods' effectiveness across numerous real-world datasets.
arXiv Detail & Related papers (2020-02-24T13:53:22Z)