Cautious Active Clustering
- URL: http://arxiv.org/abs/2008.01245v2
- Date: Tue, 8 Dec 2020 03:41:49 GMT
- Title: Cautious Active Clustering
- Authors: Alexander Cloninger, Hrushikesh Mhaskar
- Abstract summary: We consider the problem of classification of points sampled from an unknown probability measure on a Euclidean space.
Our approach is to consider the unknown probability measure as a convex combination of the conditional probabilities for each class.
- Score: 79.23797234241471
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of classification of points sampled from an unknown
probability measure on a Euclidean space. We study the question of querying the
class label at a very small number of judiciously chosen points so as to be
able to attach the appropriate class label to every point in the set. Our
approach is to consider the unknown probability measure as a convex combination
of the conditional probabilities for each class. Our technique involves the use
of a highly localized kernel constructed from Hermite polynomials, in order to
create a hierarchical estimate of the supports of the constituent probability
measures. We do not need to make any assumptions on the nature of any of the
probability measures nor know in advance the number of classes involved. We
give theoretical guarantees measured by the $F$-score for our classification
scheme. Examples include classification in hyper-spectral images and MNIST
classification.
Related papers
- Can Class-Priors Help Single-Positive Multi-Label Learning? [40.312419865957224]
Single-positive multi-label learning (SPMLL) is a typical weakly supervised multi-label learning problem.
Class-priors estimator is introduced, which could estimate the class-priors that are theoretically guaranteed to converge to the ground-truth class-priors.
Based on the estimated class-priors, an unbiased risk estimator for classification is derived, and the corresponding risk minimizer could be guaranteed to approximately converge to the optimal risk minimizer on fully supervised data.
arXiv Detail & Related papers (2023-09-25T05:45:57Z) - A Universal Unbiased Method for Classification from Aggregate
Observations [115.20235020903992]
This paper presents a novel universal method of CFAO, which holds an unbiased estimator of the classification risk for arbitrary losses.
Our proposed method not only guarantees the risk consistency due to the unbiased risk estimator but also can be compatible with arbitrary losses.
arXiv Detail & Related papers (2023-06-20T07:22:01Z) - Class-Conditional Conformal Prediction with Many Classes [60.8189977620604]
We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores.
We find that clustered conformal typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
arXiv Detail & Related papers (2023-06-15T17:59:02Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - When in Doubt: Improving Classification Performance with Alternating
Normalization [57.39356691967766]
We introduce Classification with Alternating Normalization (CAN), a non-parametric post-processing step for classification.
CAN improves classification accuracy for challenging examples by re-adjusting their predicted class probability distribution.
We empirically demonstrate its effectiveness across a diverse set of classification tasks.
arXiv Detail & Related papers (2021-09-28T02:55:42Z) - Classifier uncertainty: evidence, potential impact, and probabilistic
treatment [0.0]
We present an approach to quantify the uncertainty of classification performance metrics based on a probability model of the confusion matrix.
We show that uncertainties can be surprisingly large and limit performance evaluation.
arXiv Detail & Related papers (2020-06-19T12:49:19Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.