Related papers: Cautious Active Clustering

URL: http://arxiv.org/abs/2008.01245v2
Date: Tue, 8 Dec 2020 03:41:49 GMT
Title: Cautious Active Clustering
Authors: Alexander Cloninger, Hrushikesh Mhaskar
Abstract summary: We consider the problem of classification of points sampled from an unknown probability measure on a Euclidean space. Our approach is to consider the unknown probability measure as a convex combination of the conditional probabilities for each class.
Score: 79.23797234241471
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of classification of points sampled from an unknown probability measure on a Euclidean space. We study the question of querying the class label at a very small number of judiciously chosen points so as to be able to attach the appropriate class label to every point in the set. Our approach is to consider the unknown probability measure as a convex combination of the conditional probabilities for each class. Our technique involves the use of a highly localized kernel constructed from Hermite polynomials, in order to create a hierarchical estimate of the supports of the constituent probability measures. We do not need to make any assumptions on the nature of any of the probability measures nor know in advance the number of classes involved. We give theoretical guarantees measured by the $F$-score for our classification scheme. Examples include classification in hyper-spectral images and MNIST classification.
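For readers unfamiliar with the metric mentioned in the abstract, the following is a minimal sketch of the standard per-class F-score (harmonic mean of precision and recall). It is not taken from the paper, whose guarantees may use a measure-theoretic variant of the score; the function name `f_score` and its arguments are hypothetical and chosen only for illustration.

```python
def f_score(true_labels, predicted_labels, positive_class):
    """Standard F1 score for one class: harmonic mean of precision and recall.

    Illustrative only; the paper's own F-score definition may differ.
    """
    pairs = list(zip(true_labels, predicted_labels))
    # Count true positives, false positives, and false negatives for the class.
    tp = sum(1 for t, p in pairs if t == positive_class and p == positive_class)
    fp = sum(1 for t, p in pairs if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in pairs if t == positive_class and p != positive_class)
    if tp == 0:
        # With no true positives, precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Calling `f_score(true, predicted, k)` once per class `k` gives a per-class score that can then be averaged if a single number is wanted.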

Related papers

Probability-density-aware Semi-supervised Learning [44.91442162204045]
Semi-supervised learning (SSL) assumes that neighboring points lie in the same category (neighbor assumption) and that points in different clusters belong to different categories (cluster assumption). Existing methods usually rely on similarity measures to retrieve similar neighbor points, ignoring the cluster assumption. This paper first provides a systematic investigation into the role of probability density in SSL and lays a solid theoretical foundation for the cluster assumption.
arXiv Detail & Related papers (2024-12-23T13:08:23Z)
Can Class-Priors Help Single-Positive Multi-Label Learning? [40.312419865957224]
Single-positive multi-label learning (SPMLL) is a typical weakly supervised multi-label learning problem. A class-priors estimator is introduced whose estimates are theoretically guaranteed to converge to the ground-truth class-priors. Based on the estimated class-priors, an unbiased risk estimator for classification is derived, and the corresponding risk minimizer is guaranteed to approximately converge to the optimal risk minimizer on fully supervised data.
arXiv Detail & Related papers (2023-09-25T05:45:57Z)
A Universal Unbiased Method for Classification from Aggregate Observations [115.20235020903992]
This paper presents a novel universal method for classification from aggregate observations (CFAO) that provides an unbiased estimator of the classification risk for arbitrary losses. The proposed method guarantees risk consistency through the unbiased risk estimator and is compatible with arbitrary losses.
arXiv Detail & Related papers (2023-06-20T07:22:01Z)
Class-Conditional Conformal Prediction with Many Classes [60.8189977620604]
We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores. We find that clustered conformal typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
arXiv Detail & Related papers (2023-06-15T17:59:02Z)
Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples. We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem. We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
When in Doubt: Improving Classification Performance with Alternating Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification. CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution. We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z)
Classifier uncertainty: evidence, potential impact, and probabilistic treatment [0.0]
We present an approach to quantify the uncertainty of classification performance metrics based on a probability model of the confusion matrix. We show that uncertainties can be surprisingly large and limit performance evaluation.
arXiv Detail & Related papers (2020-06-19T12:49:19Z)
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks. We present a unifying view of randomized smoothing over arbitrary functions. We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.