Pointwise Binary Classification with Pairwise Confidence Comparisons
- URL: http://arxiv.org/abs/2010.01875v4
- Date: Thu, 13 Jan 2022 08:59:36 GMT
- Title: Pointwise Binary Classification with Pairwise Confidence Comparisons
- Authors: Lei Feng, Senlin Shu, Nan Lu, Bo Han, Miao Xu, Gang Niu, Bo An,
Masashi Sugiyama
- Abstract summary: We propose pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know that one is more likely to be positive than the other.
We link Pcomp classification to noisy-label learning to develop a progressive URE and improve it by imposing consistency regularization.
- Score: 97.79518780631457
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To alleviate the labeled-data requirement for training effective
binary classifiers, many weakly supervised learning settings have been
proposed. Among them, some consider using pairwise rather than pointwise
labels, when pointwise labels are not accessible due to privacy,
confidentiality, or security reasons. However, as a pairwise label denotes
whether or not two data points share a pointwise label, it cannot be easily
collected if either point is equally likely to be positive or negative.
Thus, in this paper, we propose a novel setting called pairwise comparison
(Pcomp) classification, where we have only pairs of unlabeled data for which
we know that one is more likely to be positive than the other. Firstly, we
give a Pcomp data generation process, derive an unbiased risk estimator
(URE) with a theoretical guarantee, and further improve the URE using
correction functions. Secondly, we link Pcomp classification to noisy-label
learning to develop a progressive URE, and improve it by imposing
consistency regularization. Finally, we demonstrate by experiments the
effectiveness of our methods, which suggests that Pcomp data are a valuable
and practically useful type of pairwise supervision beyond ordinary pairwise
labels.
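
The abstract's recipe (link Pcomp pairs to noisy-label learning, build an unbiased risk estimator, then repair it with a correction function) can be illustrated with a hedged sketch. This is not the paper's exact derivation: we assume the more-confident point of each pair is treated as a noisy positive and the other as a noisy negative, with hypothetical flip rates rho_p and rho_n, and we apply the classic backward loss correction for class-conditional label noise plus a non-negative clamp as the correction function.

```python
# A minimal sketch under the assumptions stated above, not the paper's
# exact URE. rho_p and rho_n are hypothetical, assumed-known flip rates.
import numpy as np

def logistic_loss(margin):
    # log(1 + exp(-margin)), computed stably
    return np.logaddexp(0.0, -margin)

def corrected_risk(scores, noisy_labels, rho_p, rho_n):
    """Backward-corrected empirical risk under class-conditional label noise.

    scores:       model outputs f(x) on all points from the pairs, shape (n,)
    noisy_labels: +1 for the more-confident point of each pair, -1 otherwise
    rho_p:        assumed P(noisy label = -1 | true label = +1)
    rho_n:        assumed P(noisy label = +1 | true label = -1)
    """
    loss_kept = logistic_loss(scores * noisy_labels)   # loss at observed label
    loss_flip = logistic_loss(-scores * noisy_labels)  # loss at opposite label
    rho_in = np.where(noisy_labels > 0, rho_n, rho_p)   # flip rate INTO observed label
    rho_out = np.where(noisy_labels > 0, rho_p, rho_n)  # flip rate OUT of its class
    corrected = ((1.0 - rho_in) * loss_kept - rho_out * loss_flip)
    corrected /= (1.0 - rho_p - rho_n)
    # Clamp at zero: the estimator is unbiased in expectation, but on finite
    # samples it can dip negative, a symptom of overfitting that correction
    # functions are meant to suppress.
    return max(float(np.mean(corrected)), 0.0)
```

Under this reading, minimizing corrected_risk over the pooled points of all pairs trains an ordinary pointwise classifier from purely comparative supervision.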
Related papers
- Data-Driven Estimation of the False Positive Rate of the Bayes Binary Classifier via Soft Labels [25.40796153743837]
We propose an estimator, computed from a given dataset, for the false positive rate (FPR) of the Bayes classifier, that is, the classifier that is optimal with respect to accuracy.
We develop effective FPR estimators by leveraging a denoising technique and the Nadaraya-Watson estimator.
arXiv Detail & Related papers (2024-01-27T20:41:55Z)
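
The Nadaraya-Watson estimator named in this summary is standard kernel regression; below is a minimal sketch of using it to smooth soft labels, where the Gaussian kernel and the bandwidth are our illustrative choices rather than details from the paper.

```python
# A hedged sketch: kernel-weighted averaging of soft labels, one common
# form of the Nadaraya-Watson estimator. Kernel and bandwidth are ours.
import numpy as np

def nadaraya_watson(x_query, x_train, soft_labels, bandwidth=0.5):
    """Kernel-weighted average of soft labels at each query point."""
    # Pairwise squared distances, shape (n_query, n_train)
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(-1)
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))   # Gaussian kernel
    weights /= weights.sum(axis=1, keepdims=True)    # normalize per query
    return weights @ soft_labels                     # denoised estimates
```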
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label (T2L) process in classification uses confidence to determine the quality of pseudo-labels.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
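
Chebyshev's inequality bounds how far a random variable strays from its mean, which suggests one way to gate pseudo-labels. The sketch below is illustrative only: the acceptance rule, tolerance eps, and probability budget delta are our choices, not the paper's exact constraint.

```python
# A hedged sketch: keep a pseudo-label only when Chebyshev's inequality,
# P(|X - mu| >= eps) <= var / eps^2, bounds the error probability by delta.
import numpy as np

def chebyshev_accept(predictions, eps=0.1, delta=0.25):
    """predictions: shape (n_passes, n_samples), e.g. from stochastic passes.

    Returns the mean prediction per sample and a boolean keep-mask.
    """
    mu = predictions.mean(axis=0)
    var = predictions.var(axis=0)
    bound = var / (eps ** 2)     # Chebyshev bound on deviation probability
    return mu, bound <= delta
```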
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning methods.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that its estimation error bound achieves the optimal convergence rate.
We also introduce a risk-correction approach to mitigate overfitting, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
- GaussianMLR: Learning Implicit Class Significance via Calibrated Multi-Label Ranking [0.0]
We propose a novel multi-label ranking method: GaussianMLR.
It aims to learn implicit class significance values that determine the positive label ranks.
We show that our method is able to accurately learn a representation of the incorporated positive rank order.
arXiv Detail & Related papers (2023-03-07T14:09:08Z)
- Boosting Semi-Supervised Learning with Contrastive Complementary Labeling [11.851898765002334]
A popular approach is pseudo-labeling, which generates pseudo-labels only for unlabeled data with high-confidence predictions.
We highlight that data with low-confidence pseudo-labels can still be beneficial to the training process.
Inspired by this, we propose a novel Contrastive Complementary Labeling (CCL) method that constructs a large number of reliable negative pairs.
arXiv Detail & Related papers (2022-12-13T15:25:49Z)
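
A hedged sketch of the idea in this summary: even a low-confidence sample indicates which classes it almost surely does not belong to (complementary labels), and two samples can form a reliable negative pair when one's likely class is ruled out for the other. The pairing rule and the k_complementary parameter below are our illustrative choices, not the paper's procedure.

```python
# A minimal sketch under the assumptions stated above, not CCL's exact
# construction; the resulting pairs would feed a contrastive loss.
import numpy as np

def negative_pairs(probs, k_complementary=3):
    """probs: softmax outputs, shape (n, num_classes). Returns index pairs."""
    top = probs.argmax(axis=1)                       # each sample's likely class
    # The k lowest-probability classes serve as complementary labels.
    comp = np.argsort(probs, axis=1)[:, :k_complementary]
    pairs = []
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            # i's likely class is ruled out for j (or vice versa)
            if top[i] in comp[j] or top[j] in comp[i]:
                pairs.append((i, j))
    return pairs
```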
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
This paper proposes a label-distribution perspective for positive-unlabeled (PU) learning.
Motivated by this view, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
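
A minimal sketch of the label-distribution-consistency idea, not Dist-PU's exact objective: assuming a known class prior pi_p, penalize the gap between the batch-average predicted positive probability and that prior.

```python
# A hedged sketch: squared gap between the predicted positive proportion
# and an assumed-known class prior pi_p. The squared penalty is our choice.
import numpy as np

def distribution_consistency_loss(scores, pi_p):
    """scores: raw model outputs on unlabeled data, shape (n,)."""
    probs = 1.0 / (1.0 + np.exp(-scores))    # sigmoid -> P(y = +1 | x)
    return (probs.mean() - pi_p) ** 2        # match predicted prior to pi_p
```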
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels into data pairs with noisy similarity labels.
Class2Simi is computationally efficient: the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model's predictions into a pairwise form.
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
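
A minimal sketch of the transformation this summary describes: within a mini-batch, noisy class labels become noisy pairwise similarity labels (1 if two labels agree, 0 otherwise), so a pointwise loss can be replaced by a pairwise one on the model's predictions. The helper below is our own illustration.

```python
# A hedged sketch of the class-to-similarity transformation; how the
# similarity labels enter the loss is left to the surrounding method.
import numpy as np

def class_to_simi(noisy_labels):
    """noisy_labels: shape (n,). Returns (i, j, s) for all pairs i < j."""
    n = len(noisy_labels)
    i_idx, j_idx = np.triu_indices(n, k=1)   # all pairs in the mini-batch
    simi = (noisy_labels[i_idx] == noisy_labels[j_idx]).astype(float)
    return i_idx, j_idx, simi
```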