A Novel Approach to Regularising 1NN classifier for Improved
Generalization
- URL: http://arxiv.org/abs/2402.08405v1
- Date: Tue, 13 Feb 2024 12:09:15 GMT
- Title: A Novel Approach to Regularising 1NN classifier for Improved
Generalization
- Authors: Aditya Challa, Sravan Danda, Laurent Najman
- Abstract summary: We show that watershed classifiers can find arbitrary boundaries on any dense enough dataset, and, at the same time, have very small VC dimension.
We propose a loss function which can learn representations consistent with watershed classifiers, and show that it outperforms the NCA baseline.
- Score: 3.9919322607068293
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we propose a class of non-parametric classifiers that learn arbitrary boundaries and generalize well.
Our approach is based on a novel, greedy way to regularize 1NN classifiers; we refer to the resulting class of classifiers as Watershed Classifiers. 1NN classifiers are known to trivially over-fit and to have a very large VC dimension, and hence do not generalize well. We show that watershed classifiers can find arbitrary boundaries on any dense enough dataset while, at the same time, having a very small VC dimension; hence a watershed classifier leads to good generalization.
Traditional approaches to regularizing 1NN classifiers consider the $K$ nearest neighbours. Neighbourhood component analysis (NCA) proposes a way to learn representations consistent with the ($n-1$) nearest neighbour classifier, where $n$ denotes the size of the dataset. In this article, we propose a loss function which can learn representations consistent with watershed classifiers, and show that it outperforms the NCA baseline.
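The greedy construction is easy to illustrate. The sketch below is not the authors' implementation (the function name, the plain Euclidean metric, and the seed-based setup are assumptions): at each step the unlabelled point closest to the currently labelled set inherits the label of its nearest labelled neighbour.

```python
import numpy as np

def greedy_watershed_labels(X, y, seed_idx):
    """Greedily propagate labels from the seed points: at every step, the
    unlabelled point closest to any already-labelled point receives that
    neighbour's label (a sketch of watershed-style 1NN regularisation;
    the construction in the paper may differ in its details)."""
    n = len(X)
    labels = np.full(n, -1)
    labels[seed_idx] = y[seed_idx]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    while (labels == -1).any():
        lab = np.where(labels != -1)[0]
        unlab = np.where(labels == -1)[0]
        d = dist[np.ix_(unlab, lab)]                    # distances unlabelled -> labelled
        i, j = np.unravel_index(np.argmin(d), d.shape)  # globally closest pair
        labels[unlab[i]] = labels[lab[j]]               # copy nearest labelled neighbour's label
    return labels

# Toy usage: four labelled seeds, the remaining point is labelled greedily.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8], [0.45, 0.4]])
y = np.array([0, 0, 1, 1, -1])            # only entries indexed by seed_idx are used
print(greedy_watershed_labels(X, y, seed_idx=[0, 1, 2, 3]))
```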
Related papers
- Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Nearest Neighbor Zero-Shot Inference [68.56747574377215]
kNN-Prompt is a technique that uses k-nearest neighbor (kNN) retrieval augmentation for zero-shot inference with language models (LMs).
Fuzzy verbalizers leverage the sparse kNN distribution for downstream tasks by automatically associating each classification label with a set of natural language tokens.
Experiments show that kNN-Prompt is effective for domain adaptation with no further training, and that the benefits of retrieval increase with the size of the model used for kNN retrieval.
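A toy sketch of the retrieval-plus-verbalizer idea (not the kNN-Prompt code; the vocabulary, the verbalizer sets, and the interpolation weight `lam` are made-up assumptions): the LM's next-token distribution is interpolated with a distribution derived from retrieved nearest neighbours, and each label is scored by the total probability mass of its verbalizer tokens.

```python
import numpy as np

# toy vocabulary and hypothetical verbalizer sets mapping labels to token ids
VOCAB = ["great", "good", "bad", "terrible", "movie"]
VERBALIZERS = {"positive": [0, 1], "negative": [2, 3]}   # assumed token groupings

def knn_augmented_scores(p_lm, p_knn, lam=0.5):
    """Interpolate the LM next-token distribution with a kNN-derived
    distribution, then sum probability mass over each label's verbalizer
    tokens (a sketch of the fuzzy-verbalizer idea, not the paper's code)."""
    p = lam * p_knn + (1.0 - lam) * p_lm
    return {label: p[tokens].sum() for label, tokens in VERBALIZERS.items()}

# p_lm: what the LM predicts; p_knn: empirical distribution of next tokens
# among retrieved nearest-neighbour contexts (both invented here)
p_lm = np.array([0.30, 0.20, 0.15, 0.05, 0.30])
p_knn = np.array([0.50, 0.25, 0.10, 0.05, 0.10])
print(knn_augmented_scores(p_lm, p_knn))   # {'positive': 0.625, 'negative': 0.175}
```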
arXiv Detail & Related papers (2022-05-27T07:00:59Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in the training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
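As a concrete illustration of this kind of k-NN integration, here is a minimal sketch (mine, not the paper's code) that classifies test embeddings by majority vote over the k most similar training embeddings; cosine similarity and k=5 are assumed choices, and the features are presumed to come from a pre-trained encoder.

```python
import numpy as np

def knn_predict(train_feats, train_labels, test_feats, k=5):
    """Classify each test embedding by majority vote among its k nearest
    training embeddings, using cosine similarity on pre-extracted features."""
    # normalise so that a dot product equals cosine similarity
    tr = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    te = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = te @ tr.T                                   # (n_test, n_train)
    nn = np.argsort(-sims, axis=1)[:, :k]              # indices of the k most similar
    votes = train_labels[nn]                           # (n_test, k) label votes
    return np.array([np.bincount(v).argmax() for v in votes])

# toy usage with made-up 4-dimensional "features"
rng = np.random.default_rng(0)
train_feats = rng.standard_normal((20, 4))
train_labels = rng.integers(0, 3, size=20)
test_feats = rng.standard_normal((5, 4))
print(knn_predict(train_feats, train_labels, test_feats, k=5))
```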
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- CondNet: Conditional Classifier for Scene Segmentation [46.62529212678346]
We present a conditional classifier to replace the traditional global classifier.
It attends to intra-class distinctions, leading to stronger dense recognition capability.
The framework equipped with the conditional classifier (called CondNet) achieves new state-of-the-art performance on two datasets.
arXiv Detail & Related papers (2021-09-21T17:19:09Z)
- Approximation and generalization properties of the random projection classification method [0.4604003661048266]
We study a family of low-complexity classifiers obtained by thresholding a random one-dimensional feature.
For certain classification problems (e.g., those with a large Rashomon ratio), there is a potentially large gain in generalization properties from selecting parameters at random.
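A minimal sketch of such a low-complexity classifier, under the assumption that each candidate is a random direction plus a random threshold and that the best candidate on the training data is kept; this is illustrative rather than the exact construction analysed in the paper.

```python
import numpy as np

def fit_random_projection_classifier(X, y, n_candidates=100, rng=None):
    """Sample random (direction, threshold) pairs, project the data onto each
    direction, and keep the thresholded 1-D classifier with the best training
    accuracy. y is assumed to be a 0/1 label vector."""
    rng = np.random.default_rng(rng)
    best = None
    for _ in range(n_candidates):
        w = rng.standard_normal(X.shape[1])      # random direction
        z = X @ w                                 # random one-dimensional feature
        b = rng.uniform(z.min(), z.max())         # random threshold
        # accept whichever side-assignment agrees better with the labels
        acc = max(np.mean((z > b) == y), np.mean((z <= b) == y))
        if best is None or acc > best[0]:
            best = (acc, w, b)
    return best  # (training accuracy, direction, threshold)
```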
arXiv Detail & Related papers (2021-08-11T23:14:46Z)
- Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
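A sketch of the second stage only, assuming the embeddings have already been produced by a self-supervised encoder (stage one is not shown): fit a Gaussian to the one-class embeddings and score test points by Mahalanobis distance. The detectors used in the paper may differ; this is just one simple one-class classifier on learned representations.

```python
import numpy as np

def fit_one_class_gaussian(embeddings):
    """Fit mean and (regularised) covariance to embeddings of the single
    training class; embeddings are assumed to come from a pre-trained
    self-supervised encoder (stage one, not shown here)."""
    mu = embeddings.mean(axis=0)
    cov = np.cov(embeddings, rowvar=False) + 1e-6 * np.eye(embeddings.shape[1])
    return mu, np.linalg.inv(cov)

def anomaly_score(x, mu, cov_inv):
    """Mahalanobis distance to the one-class distribution; larger = more anomalous."""
    d = x - mu
    return float(d @ cov_inv @ d)
```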
arXiv Detail & Related papers (2020-11-04T23:33:41Z)
- For self-supervised learning, Rationality implies generalization, provably [13.526562756159809]
We prove a new upper bound on the generalization gap of classifiers obtained by first using self-supervision to learn a representation and then fitting a simple classifier on top of it.
We show that our bound is non-vacuous for many popular representation-learning based classifiers on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-10-16T17:07:52Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applicable to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers [54.996358399108566]
We investigate the performance of landmark general-purpose CNN classifiers, which have achieved top results on large-scale classification datasets.
We compare them against state-of-the-art fine-grained classifiers.
We present an extensive evaluation on six datasets to determine whether fine-grained classifiers are able to improve upon the general-purpose baselines.
arXiv Detail & Related papers (2020-03-24T23:49:14Z)
- Intrinsic Dimension Estimation via Nearest Constrained Subspace Classifier [7.028302194243312]
A new subspace-based classifier is proposed for supervised classification or intrinsic dimension estimation.
The distribution of the data in each class is modeled by a union of a finite number of affine subspaces of the feature space.
The proposed method is a generalisation of the classical NN (Nearest Neighbor) and NFL (Nearest Feature Line) methods, and has a close relationship to NS (Nearest Subspace).
The proposed classifier with an accurately estimated dimension parameter generally outperforms its competitors in terms of classification accuracy.
arXiv Detail & Related papers (2020-02-08T20:54:42Z)
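A rough sketch of the nearest-subspace idea from the entry above (not the paper's union-of-subspaces model; a single affine subspace per class and the `dim` parameter are simplifying assumptions): each class is represented by its mean plus a few principal directions, and a point is assigned to the class whose subspace reconstructs it with the smallest residual.

```python
import numpy as np

def fit_affine_subspaces(X, y, dim):
    """For each class, fit an affine subspace: class mean plus the top `dim`
    principal directions of the centred class data."""
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
        models[c] = (mu, Vt[:dim])
    return models

def predict_nearest_subspace(models, X):
    """Assign each point to the class whose subspace reconstructs it best."""
    preds = []
    for x in X:
        residuals = {}
        for c, (mu, V) in models.items():
            proj = mu + (x - mu) @ V.T @ V     # orthogonal projection onto the subspace
            residuals[c] = np.linalg.norm(x - proj)
        preds.append(min(residuals, key=residuals.get))
    return np.array(preds)
```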
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.