Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification
- URL: http://arxiv.org/abs/2102.00678v1
- Date: Mon, 1 Feb 2021 07:36:38 GMT
- Title: Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification
- Authors: Shida Lei, Nan Lu, Gang Niu, Issei Sato, Masashi Sugiyama
- Abstract summary: We propose a new approach for binary classification from m U-sets for $m\ge2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
- Score: 94.55805516167369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To cope with high annotation costs, training a classifier only from weakly
supervised data has attracted a great deal of attention these days. Among
various approaches, strengthening supervision from completely unsupervised
classification is a promising direction, which typically employs class priors
as the only supervision and trains a binary classifier from unlabeled (U)
datasets. While existing risk-consistent methods are theoretically grounded
with high flexibility, they can learn only from two U sets. In this paper, we
propose a new approach for binary classification from m U-sets for $m\ge2$. Our
key idea is to consider an auxiliary classification task called surrogate set
classification (SSC), which is aimed at predicting from which U set each
observed data point is drawn. SSC can be solved by a standard (multi-class)
classification method, and we use the SSC solution to obtain the final binary
classifier through a certain linear-fractional transformation. We build our
method in a flexible and efficient end-to-end deep learning framework and prove
it to be classifier-consistent. Through experiments, we demonstrate the
superiority of our proposed method over state-of-the-art methods.
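
The abstract does not spell out the linear-fractional transformation, so the following is only a sketch of how such a map can arise, not the authors' implementation. It assumes that each U set j is a mixture of the same two class-conditional densities with class prior pi_j, that a sample comes from set j with probability rho_j, and that the test class prior is pi_d. Under those assumptions, Bayes' rule gives p(set = j | x) = (a_j + b_j * eta(x)) / sum_k (a_k + b_k * eta(x)), where eta(x) = p(y = +1 | x), a_j = rho_j * (1 - pi_j) / (1 - pi_d) and b_j = rho_j * (pi_j - pi_d) / (pi_d * (1 - pi_d)). The function name `set_posteriors`, the Gaussian toy data, and all numeric values are hypothetical.

```python
import numpy as np

def set_posteriors(eta, set_priors, class_priors, test_prior):
    """Map a binary posterior eta = p(y=+1|x) to posteriors over m U sets.

    Assumption (not taken from the source): every U set shares the same
    class-conditional densities and differs only in its class prior.
    """
    eta = np.asarray(eta, dtype=float)[..., None]      # shape (..., 1)
    rho = np.asarray(set_priors, dtype=float)          # shape (m,)
    pi = np.asarray(class_priors, dtype=float)         # shape (m,)
    a = rho * (1.0 - pi) / (1.0 - test_prior)
    b = rho * (pi - test_prior) / (test_prior * (1.0 - test_prior))
    num = a + b * eta                                  # shape (..., m)
    # Numerator and denominator are both affine in eta: a linear-fractional map.
    return num / num.sum(axis=-1, keepdims=True)

def gaussian_pdf(x, mean):
    """Density of a unit-variance Gaussian, used only for the toy check."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2.0 * np.pi)

# Toy check with hypothetical numbers: three U sets on a 1-D input.
rho = np.array([0.3, 0.3, 0.4])   # probability that a sample comes from each set
pi = np.array([0.2, 0.5, 0.9])    # class prior of the positive class in each set
pi_d = 0.5                        # class prior at test time
x = np.linspace(-3.0, 3.0, 7)

p_pos, p_neg = gaussian_pdf(x, 1.0), gaussian_pdf(x, -1.0)
eta = pi_d * p_pos / (pi_d * p_pos + (1.0 - pi_d) * p_neg)

# Exact set posteriors computed directly from the mixture densities.
joint = rho * (pi * p_pos[:, None] + (1.0 - pi) * p_neg[:, None])
exact = joint / joint.sum(axis=1, keepdims=True)

predicted = set_posteriors(eta, rho, pi, pi_d)
print(np.allclose(predicted, exact))
```

In a training setup of this kind, a binary model would supply eta, the map above would convert it into set probabilities, and a standard multi-class loss against the set indices would fit the model; the binary output is then read off directly at test time.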
Related papers
- Anomaly Detection using Ensemble Classification and Evidence Theory [62.997667081978825]
We present a novel approach for novelty detection using ensemble classification and evidence theory.
A pool selection strategy is presented to build a solid ensemble classifier.
We use uncertainty for the anomaly detection approach.
arXiv Detail & Related papers (2022-12-23T00:50:41Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a pioneering method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Multiple Classifiers Based Maximum Classifier Discrepancy for Unsupervised Domain Adaptation [25.114533037440896]
We propose to extend the structure of two classifiers to multiple classifiers to further boost its performance.
We demonstrate that, on average, a structure of three classifiers yields the best performance as a trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2021-08-02T03:00:13Z)
- SetConv: A New Approach for Learning from Imbalanced Data [29.366843553056594]
We propose a set convolution operation and an episodic training strategy to extract a single representative for each class.
We prove that our proposed algorithm is permutation-invariant with respect to the order of inputs.
arXiv Detail & Related papers (2021-04-03T22:33:30Z)
- Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
arXiv Detail & Related papers (2020-11-04T23:33:41Z)
- Global Multiclass Classification and Dataset Construction via Heterogeneous Local Experts [37.27708297562079]
We show how to minimize the number of labelers while ensuring the reliability of the resulting dataset.
Experiments with the MNIST and CIFAR-10 datasets demonstrate the favorable accuracy of our aggregation scheme.
arXiv Detail & Related papers (2020-05-21T18:07:42Z)
- A Classification-Based Approach to Semi-Supervised Clustering with Pairwise Constraints [5.639904484784126]
We introduce a network framework for semi-supervised clustering with pairwise constraints.
In contrast to existing approaches, we decompose semi-supervised clustering (SSC) into two simpler classification tasks/stages.
The proposed approach, S3C2, is motivated by the observation that binary classification is usually easier than multi-class clustering.
arXiv Detail & Related papers (2020-01-18T20:13:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.