Advanced kNN: A Mature Machine Learning Series
- URL: http://arxiv.org/abs/2003.00415v1
- Date: Sun, 1 Mar 2020 06:11:04 GMT
- Title: Advanced kNN: A Mature Machine Learning Series
- Authors: Muhammad Asim and Muaaz Zakria
- Abstract summary: k-nearest neighbour (kNN) is one of the most prominent, simple and basic algorithms used in machine learning and data mining.
The purpose of this paper is to suggest an Advanced kNN (A-kNN) algorithm that is able to classify an instance as unknown after verifying that it does not belong to any of the predefined classes.
- Score: 2.7729939137633877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: k-nearest neighbour (kNN) is one of the most prominent, simple and basic
algorithms used in machine learning and data mining. However, kNN has limited
prediction ability, i.e., kNN cannot predict any instance correctly if it does
not belong to any of the predefined classes in the training data set. The
purpose of this paper is to suggest an Advanced kNN (A-kNN) algorithm that will
be able to classify an instance as unknown, after verifying that it does not
belong to any of the predefined classes. Performance of kNN and A-kNN is
compared on three different data sets namely iris plant data set, BUPA liver
disorder data set, and Alpha Beta detection data set. Results of A-kNN are
significantly accurate for detecting unknown instances.
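The abstract does not spell out how A-kNN verifies that a query belongs to no predefined class, but the core idea can be illustrated with a distance-thresholded kNN that returns "unknown" when even the nearest training points are too far away. In the sketch below, the Euclidean distance and the `max_dist` threshold are assumptions, not the authors' exact rule.

```python
from collections import Counter
import numpy as np

def aknn_predict(X_train, y_train, query, k=5, max_dist=1.0):
    """kNN with an 'unknown' option: reject the query when even its
    nearest neighbours lie farther than max_dist (an assumed rule,
    standing in for the paper's verification step)."""
    dists = np.linalg.norm(X_train - query, axis=1)  # Euclidean distance to every training point
    nearest = np.argsort(dists)[:k]                  # indices of the k closest points
    if dists[nearest].min() > max_dist:              # no predefined class is close enough
        return "unknown"
    votes = Counter(y_train[nearest])                # majority vote among the neighbours
    return votes.most_common(1)[0][0]

# Toy usage: two predefined classes plus a far-away query.
X = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 4.9]])
y = np.array(["A", "A", "B", "B"])
print(aknn_predict(X, y, np.array([0.05, 0.05]), k=2))   # close to class A -> "A"
print(aknn_predict(X, y, np.array([20.0, 20.0]), k=2))   # far from everything -> "unknown"
```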
Related papers
- INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation [57.952478914459164]
kNN-MT has provided an effective paradigm to smooth the prediction based on neighbor representations during inference.
We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.
Experiments on four benchmark datasets show that the method achieves average gains of 1.99 COMET and 1.0 BLEU, outperforming the state-of-the-art kNN-MT system with 0.02x memory space and 1.9x inference speedup.
arXiv Detail & Related papers (2023-06-10T08:39:16Z)
- Less is More: Parameter-Free Text Classification with Gzip [47.63077023698568]
Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy.
We propose a non-parametric alternative to DNNs that is easy, lightweight and universal for text classification.
Our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets.
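A minimal sketch of the compressor-based recipe the summary refers to: measure similarity between two texts with a normalized compression distance computed from gzip output lengths, then classify by a kNN majority vote. The concatenation scheme, the value of `k`, and the toy sentences are illustrative assumptions.

```python
import gzip
from collections import Counter

def clen(s: str) -> int:
    """Length of the gzip-compressed UTF-8 bytes of s."""
    return len(gzip.compress(s.encode("utf-8")))

def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two strings."""
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def gzip_knn_classify(train, query, k=3):
    """train: list of (text, label); classify query by majority vote
    among the k training texts closest to it in NCD."""
    ranked = sorted(train, key=lambda tl: ncd(query, tl[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy usage with made-up sentences.
train = [("the match ended in a draw", "sport"),
         ("the striker scored twice", "sport"),
         ("the court dismissed the appeal", "law"),
         ("the judge issued a new ruling", "law")]
print(gzip_knn_classify(train, "the goalkeeper saved a penalty", k=3))
```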
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
- You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets [105.24703398193843]
Untrained subnetworks in graph neural networks (GNNs) still remain mysterious.
We show that the found untrained subnetworks can substantially mitigate the GNN over-smoothing problem.
We also observe that such sparse untrained subnetworks have appealing performance in out-of-distribution detection and robustness to input perturbations.
arXiv Detail & Related papers (2022-11-28T14:17:36Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between the test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
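A hedged sketch of the two-step recipe: step one (not shown) extracts an embedding for every image with a frozen pre-trained backbone, supervised or self-supervised; step two classifies a query by majority vote over its nearest training embeddings. The cosine similarity and the random toy data below are assumptions, not necessarily the paper's exact setup.

```python
import numpy as np
from collections import Counter

def knn_classify_embeddings(train_emb, train_labels, query_emb, k=20):
    """Classify a query embedding by majority vote over its k nearest
    training embeddings under cosine similarity. Extracting the
    embeddings with a frozen pre-trained backbone is assumed to have
    happened already."""
    # L2-normalise so that a dot product equals cosine similarity.
    train_n = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    query_n = query_emb / np.linalg.norm(query_emb)
    sims = train_n @ query_n                      # similarity to every training image
    top_k = np.argsort(-sims)[:k]                 # indices of the k most similar embeddings
    votes = Counter(train_labels[i] for i in top_k)
    return votes.most_common(1)[0][0]

# Toy usage with random vectors standing in for backbone features.
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(100, 64))
train_labels = np.array([i % 5 for i in range(100)])
query_emb = train_emb[3] + 0.01 * rng.normal(size=64)
print(knn_classify_embeddings(train_emb, train_labels, query_emb, k=5))
```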
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the number of neighbors k for each target token.
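kNN-MT interpolates the translation model's next-token distribution with a distribution built from retrieved neighbours; Adaptive kNN-MT makes the number of neighbours depend on the token. The sketch below keeps the standard interpolation but replaces the paper's learned meta-network with a simple distance-based heuristic for choosing k, purely as an assumed stand-in.

```python
import numpy as np

def knn_distribution(distances, labels, vocab_size, temperature=10.0):
    """Turn retrieved (distance, target-token) pairs into a distribution
    over the vocabulary: softmax of negative distances, as in kNN-MT."""
    weights = np.exp(-np.asarray(distances) / temperature)
    p = np.zeros(vocab_size)
    for w, tok in zip(weights, labels):
        p[tok] += w
    return p / p.sum()

def choose_k(distances, candidates=(1, 2, 4, 8)):
    """Toy stand-in for the learned per-token k selection: use more
    neighbours only while they stay close to the nearest one (assumed rule)."""
    d = np.asarray(distances)
    usable = int((d <= 2.0 * d[0]).sum())
    return max(k for k in candidates if k <= max(usable, 1))

def adaptive_knn_mt_step(p_nmt, distances, labels, lam=0.5):
    """Interpolate the NMT distribution with a kNN distribution built
    from a per-token number of neighbours."""
    k = choose_k(distances)
    p_knn = knn_distribution(distances[:k], labels[:k], len(p_nmt))
    return lam * p_knn + (1.0 - lam) * p_nmt

# Toy usage: a vocabulary of 6 tokens and 8 retrieved neighbours.
p_nmt = np.array([0.05, 0.5, 0.2, 0.1, 0.1, 0.05])
distances = [0.3, 0.4, 0.5, 3.0, 3.5, 4.0, 4.2, 5.0]
labels = [2, 2, 2, 1, 1, 4, 4, 5]
print(adaptive_knn_mt_step(p_nmt, distances, labels))
```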
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Evaluating Deep Neural Network Ensembles by Majority Voting cum Meta-Learning scheme [3.351714665243138]
We propose an ensemble of seven independent Deep Neural Networks (DNNs) whose predictions are combined by majority voting to classify each new data instance.
For each network, one-seventh of the data is deleted and replenished by bootstrap sampling from the remaining samples.
All the algorithms in this paper have been tested on five benchmark datasets.
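A sketch of the resampling and voting just described, with scikit-learn's `MLPClassifier` standing in for the paper's DNNs. The per-member resampling follows the one-seventh-deleted-and-bootstrapped description, while the network size and training settings are placeholder assumptions.

```python
import numpy as np
from collections import Counter
from sklearn.neural_network import MLPClassifier  # stand-in for the paper's DNNs

def make_fold(X, y, fold, n_folds=7, seed=0):
    """Delete one-seventh of the data and replenish it by bootstrap
    sampling from the remaining samples."""
    rng = np.random.default_rng(seed + fold)
    n = len(X)
    drop = np.arange(fold, n, n_folds)                      # indices deleted for this member
    keep = np.setdiff1d(np.arange(n), drop)
    boot = rng.choice(keep, size=len(drop), replace=True)   # bootstrap replacements
    idx = np.concatenate([keep, boot])
    return X[idx], y[idx]

def majority_vote_ensemble(X, y, X_test, n_models=7):
    """Train n_models networks on resampled data and combine them by majority vote."""
    models = []
    for m in range(n_models):
        Xm, ym = make_fold(X, y, m, n_folds=n_models)
        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=m)
        models.append(clf.fit(Xm, ym))
    preds = np.array([clf.predict(X_test) for clf in models])   # (n_models, n_test)
    return np.array([Counter(col).most_common(1)[0][0] for col in preds.T])

# Usage sketch (with any array-shaped X, y, X_test):
# y_pred = majority_vote_ensemble(X, y, X_test)
```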
arXiv Detail & Related papers (2021-05-09T03:10:56Z)
- Ranking and Rejecting of Pre-Trained Deep Neural Networks in Transfer Learning based on Separation Index [0.16058099298620418]
We introduce an algorithm to rank pre-trained Deep Neural Networks (DNNs) by applying a distance-based complexity measure named Separation Index (SI) to the target dataset.
The efficiency of the proposed algorithm is evaluated by using three challenging datasets including Linnaeus 5, Breast Cancer Images, and COVID-CT.
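The abstract does not define the Separation Index; a common distance-based reading, assumed in the sketch below, is the fraction of target samples whose nearest neighbour in the candidate network's feature space carries the same label, so that a higher SI suggests the pre-trained features separate the target classes better.

```python
import numpy as np

def separation_index(features, labels):
    """Assumed definition of SI: fraction of points whose nearest
    neighbour (excluding the point itself) shares its label."""
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)       # never count a point as its own neighbour
    nn = dists.argmin(axis=1)             # index of each point's nearest neighbour
    return float((labels[nn] == labels).mean())

# Ranking sketch: embed the target dataset with each candidate backbone,
# compute this score per backbone, and keep the backbones with the highest SI.
```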
arXiv Detail & Related papers (2020-12-26T11:14:12Z)
- One Versus all for deep Neural Network Incertitude (OVNNI) quantification [12.734278426543332]
We propose a new technique to quantify the epistemic uncertainty of data easily.
This method consists in mixing the predictions of an ensemble of DNNs trained to classify One class vs All the other classes (OVA) with predictions from a standard DNN trained to perform All vs All (AVA) classification.
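A rough sketch of how OVA and AVA outputs could be mixed into an uncertainty score. The element-wise product used below is an assumed combination rule for illustration; the paper's exact mixing formula is not given in this summary.

```python
import numpy as np

def ovnni_scores(p_ava, p_ova):
    """Combine an All-vs-All softmax (p_ava, shape [n_classes]) with
    per-class One-vs-All probabilities (p_ova, shape [n_classes]).
    Multiplying the two is one natural combination (an assumption);
    low scores everywhere signal high epistemic uncertainty."""
    scores = np.asarray(p_ava) * np.asarray(p_ova)
    return scores, 1.0 - scores.max()      # per-class scores and an uncertainty proxy

# Toy usage: confident AVA prediction, but no OVA model recognises the input.
p_ava = np.array([0.7, 0.2, 0.1])          # AVA softmax over three classes
p_ova = np.array([0.05, 0.02, 0.03])       # each OVA model says "not my class"
print(ovnni_scores(p_ava, p_ova))          # low scores -> likely out-of-distribution
```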
arXiv Detail & Related papers (2020-06-01T14:06:12Z)
- Computing the Testing Error without a Testing Set [33.068870286618655]
We derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset.
This allows us to compute the DNN's testing error on unseen samples, even when we do not have access to them.
arXiv Detail & Related papers (2020-05-01T15:35:50Z)
- A new hashing based nearest neighbors selection technique for big datasets [14.962398031252063]
This paper proposes a new technique that enables the selection of nearest neighbors directly in the neighborhood of a given observation.
The proposed approach consists of dividing the data space into subcells of a virtual grid built on top of the data space.
Our algorithm outperforms the original kNN in time efficiency with a prediction quality as good as that of kNN.
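A sketch of the virtual-grid idea: hash every training point into a subcell, then answer a query by searching only the query's cell and its adjacent cells instead of scanning the whole dataset. The cell size and the one-cell-radius search pattern are assumptions made for the sketch.

```python
import numpy as np
from collections import defaultdict
from itertools import product

def build_grid(X, cell_size):
    """Hash every point into the subcell of a virtual grid that contains it."""
    grid = defaultdict(list)
    for i, x in enumerate(X):
        grid[tuple((x // cell_size).astype(int))].append(i)
    return grid

def grid_knn(X, y, grid, query, k, cell_size):
    """Search only the query's cell and its adjacent cells (assumed pattern)."""
    base = tuple((query // cell_size).astype(int))
    candidates = []
    for offset in product((-1, 0, 1), repeat=len(base)):
        candidates.extend(grid.get(tuple(b + o for b, o in zip(base, offset)), []))
    if not candidates:                     # fall back to brute force if the area is empty
        candidates = list(range(len(X)))
    dists = np.linalg.norm(X[candidates] - query, axis=1)
    order = np.argsort(dists)[:k]
    return [y[candidates[i]] for i in order]

# Toy usage in 2-D with a hand-picked cell size.
X = np.random.default_rng(1).uniform(0, 10, size=(1000, 2))
y = (X[:, 0] > 5).astype(int)
grid = build_grid(X, cell_size=1.0)
print(grid_knn(X, y, grid, np.array([2.3, 7.8]), k=5, cell_size=1.0))
```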
arXiv Detail & Related papers (2020-04-05T19:36:00Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.