Structure of Classifier Boundaries: Case Study for a Naive Bayes
Classifier
- URL: http://arxiv.org/abs/2212.04382v2
- Date: Fri, 9 Feb 2024 16:48:37 GMT
- Title: Structure of Classifier Boundaries: Case Study for a Naive Bayes
Classifier
- Authors: Alan F. Karr, Zac Bowen, Adam A. Porter
- Abstract summary: We show that the boundary is both large and complicated in structure.
We create a new measure of uncertainty, called Neighbor Similarity, that compares the result for a point to the distribution of results for its neighbors.
- Score: 1.1485218363676564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Whether based on models, training data or a combination, classifiers place
(possibly complex) input data into one of a relatively small number of output
categories. In this paper, we study the structure of the boundary--those points
for which a neighbor is classified differently--in the context of an input
space that is a graph, so that there is a concept of neighboring inputs. The
scientific setting is a model-based naive Bayes classifier for DNA reads
produced by Next Generation Sequencers. We show that the boundary is both large
and complicated in structure. We create a new measure of uncertainty, called
Neighbor Similarity, that compares the result for a point to the distribution
of results for its neighbors. This measure not only tracks two inherent
uncertainty measures for the Bayes classifier, but also can be implemented, at
a computational cost, for classifiers without inherent measures of uncertainty.
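The abstract's Neighbor Similarity idea can be illustrated with a minimal sketch. The paper's exact formula is not given here; the version below uses one simple assumed instance of the idea, namely the fraction of a node's neighbors that receive the same classification, so that boundary points are exactly the nodes with similarity below 1.

```python
from collections import Counter

def neighbor_similarity(labels, adjacency):
    """Fraction of each node's neighbors sharing its label.

    labels: dict mapping node -> predicted class
    adjacency: dict mapping node -> list of neighboring nodes

    This agreement fraction is one simple way to compare a point's
    result to the distribution of its neighbors' results; the
    paper's actual measure may be defined differently.
    """
    sim = {}
    for node, nbrs in adjacency.items():
        if not nbrs:
            sim[node] = 1.0  # isolated node: vacuously similar
            continue
        counts = Counter(labels[n] for n in nbrs)
        sim[node] = counts[labels[node]] / len(nbrs)
    return sim

# A point is on the boundary if some neighbor is classified
# differently, i.e. its Neighbor Similarity is strictly below 1.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
lab = {"a": 0, "b": 0, "c": 1}
sim = neighbor_similarity(lab, adj)
boundary = {v for v, s in sim.items() if s < 1.0}
```

Note that this computation needs only the classifier's outputs on the graph, which is why such a measure can be applied, at the cost of classifying every neighbor, to classifiers with no inherent uncertainty measure.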
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z) - DNA: Denoised Neighborhood Aggregation for Fine-grained Category
Discovery [25.836440772705505]
We propose a self-supervised framework that encodes semantic structures of data into the embedding space.
We retrieve k-nearest neighbors of a query as its positive keys to capture semantic similarities between data and then aggregate information from the neighbors to learn compact cluster representations.
Our method can retrieve more accurate neighbors (21.31% accuracy improvement) and outperform state-of-the-art models by a large margin.
arXiv Detail & Related papers (2023-10-16T07:43:30Z) - An Upper Bound for the Distribution Overlap Index and Its Applications [18.481370450591317]
This paper proposes an easy-to-compute upper bound for the overlap index between two probability distributions.
The proposed bound shows its value in one-class classification and domain shift analysis.
Our work shows significant promise toward broadening the applications of overlap-based metrics.
arXiv Detail & Related papers (2022-12-16T20:02:03Z) - Neighbour Consistency Guided Pseudo-Label Refinement for Unsupervised
Person Re-Identification [80.98291772215154]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features for person retrieval without any annotations.
Recent advances accomplish this task by leveraging clustering-based pseudo labels.
We propose a Neighbour Consistency guided Pseudo Label Refinement framework.
arXiv Detail & Related papers (2022-11-30T09:39:57Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - Robust-by-Design Classification via Unitary-Gradient Neural Networks [66.17379946402859]
The use of neural networks in safety-critical systems requires safe and robust models, due to the existence of adversarial attacks.
Knowing the minimal adversarial perturbation of any input x, or, equivalently, the distance of x from the classification boundary, allows evaluating the classification robustness, providing certifiable predictions.
A novel network architecture named Unitary-Gradient Neural Network is presented.
Experimental results show that the proposed architecture approximates a signed distance, hence allowing an online certifiable classification of x at the cost of a single inference.
arXiv Detail & Related papers (2022-09-09T13:34:51Z) - Smoothed Embeddings for Certified Few-Shot Learning [63.68667303948808]
We extend randomized smoothing to few-shot learning models that map inputs to normalized embeddings.
Our results are confirmed by experiments on different datasets.
arXiv Detail & Related papers (2022-02-02T18:19:04Z) - Scalable Optimal Classifiers for Adversarial Settings under Uncertainty [10.90668635921398]
We consider the problem of finding optimal classifiers in an adversarial setting where the class-1 data is generated by an attacker whose objective is not known to the defender.
We show that this low-dimensional characterization enables to develop a training method to compute provably approximately optimal classifiers in a scalable manner.
arXiv Detail & Related papers (2021-06-28T13:33:53Z) - A new class of generative classifiers based on staged tree models [2.66269503676104]
Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule.
Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence.
An applied analysis to predict the fate of the passengers of the Titanic highlights the insights that the new class of generative classifiers can give.
arXiv Detail & Related papers (2020-12-26T19:30:35Z) - Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on
Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences arising from its use.