Information Modified K-Nearest Neighbor
- URL: http://arxiv.org/abs/2312.01991v2
- Date: Tue, 14 May 2024 11:59:30 GMT
- Title: Information Modified K-Nearest Neighbor
- Authors: Mohammad Ali Vahedifar, Azim Akhtarshenas, Maryam Sabbaghian, Mohammad Mohammadi Rafatpanah, Ramin Toosi
- Abstract summary: K-Nearest Neighbors (KNN) classifies a sample by majority vote among its nearest neighbors.
Many KNN methodologies introduce complex algorithms that do not significantly outperform the traditional KNN.
We propose information-modified KNN (IMKNN) to improve the performance of the KNN algorithm.
Experiments on 12 widely used datasets show improvements of 11.05%, 12.42%, and 12.07% in accuracy, precision, and recall, respectively, over traditional KNN.
- Score: 4.916646834691489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fundamental concept underlying K-Nearest Neighbors (KNN) is the classification of samples based on the majority through their nearest neighbors. Although distance and neighbors' labels are critical in KNN, traditional KNN treats all samples equally. However, some KNN variants weigh neighbors differently based on a specific rule, considering each neighbor's distance and label. Many KNN methodologies introduce complex algorithms that do not significantly outperform the traditional KNN, often leading to less satisfactory outcomes. The gap in reliably extracting information for accurately predicting true weights remains an open research challenge. In our proposed method, information-modified KNN (IMKNN), we bridge the gap by presenting a straightforward algorithm that achieves effective results. To this end, we introduce a classification method to improve the performance of the KNN algorithm. By exploiting mutual information (MI) and incorporating ideas from Shapley's values, we improve the traditional KNN performance in accuracy, precision, and recall, offering a more refined and effective solution. To evaluate the effectiveness of our method, it is compared with eight variants of KNN. We conduct experiments on 12 widely-used datasets, achieving 11.05%, 12.42%, and 12.07% in accuracy, precision, and recall performance, respectively, compared to traditional KNN. Additionally, we compared IMKNN with traditional KNN across four large-scale datasets to highlight the distinct advantages of IMKNN in the impact of monotonicity, noise, density, subclusters, and skewed distributions. Our research indicates that IMKNN consistently surpasses other methods in diverse datasets.
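The abstract contrasts traditional KNN, where every neighbor counts equally, with variants that weight neighbors by a rule. The sketch below illustrates only that contrast, using plain majority voting and a generic inverse-distance weighting. It is not the authors' IMKNN: the mutual-information and Shapley-value weighting is not specified in this listing, and the function name `knn_predict`, the `weighted` flag, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


def knn_predict(train_X, train_y, query, k=3, weighted=False):
    """Classify `query` from its k nearest training points.

    weighted=False: plain majority vote, every neighbor counts as 1.
    weighted=True:  each neighbor votes with weight 1 / (distance + eps),
                    a generic inverse-distance rule (not the IMKNN rule).
    """
    # Euclidean distance from the query to every training point.
    dists = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    # Return the label with the largest accumulated vote.
    return max(votes, key=votes.get)


# Toy data (hypothetical): one very close "a" point, two farther "b" points.
X = [(0.1, 0.0), (1.0, 0.0), (0.0, 1.0)]
y = ["a", "b", "b"]
q = (0.0, 0.0)

print(knn_predict(X, y, q, k=3))                 # prints "b"
print(knn_predict(X, y, q, k=3, weighted=True))  # prints "a"
```

With equal votes the two "b" neighbors outnumber the single "a" neighbor, while inverse-distance weighting lets the much closer "a" point dominate. This is the kind of disagreement that motivates learning better neighbor weights.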
Related papers
- A Novel Pseudo Nearest Neighbor Classification Method Using Local Harmonic Mean Distance [0.0]
This article introduces a novel KNN-based classification method called LMPHNN.
LMPHNN improves classification performance based on LMPNN rules and HMD.
It achieves an average precision of 97%, surpassing other methods by 14%.
arXiv Detail & Related papers (2024-05-10T04:13:07Z)
- INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation [57.952478914459164]
kNN-MT has provided an effective paradigm to smooth the prediction based on neighbor representations during inference.
We propose an effective training framework INK to directly smooth the representation space via adjusting representations of kNN neighbors with a small number of new parameters.
Experiments on four benchmark datasets show that the method achieves average gains of 1.99 COMET and 1.0 BLEU, outperforming the state-of-the-art kNN-MT system with 0.02x memory space and 1.9x inference speedup.
arXiv Detail & Related papers (2023-06-10T08:39:16Z)
- Nearest Neighbor Zero-Shot Inference [68.56747574377215]
kNN-Prompt is a technique to use k-nearest neighbor (kNN) retrieval augmentation for zero-shot inference with language models (LMs).
Fuzzy verbalizers leverage the sparse kNN distribution for downstream tasks by automatically associating each classification label with a set of natural language tokens.
Experiments show that kNN-Prompt is effective for domain adaptation with no further training, and that the benefits of retrieval increase with the size of the model used for kNN retrieval.
arXiv Detail & Related papers (2022-05-27T07:00:59Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distance between the test image and top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier [61.063988689601416]
Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss.
These problems can be improved by learning representations that focus on similarities in the same class and contradictions when making predictions.
We introduce the K-Nearest Neighbors classifier into pre-trained model fine-tuning tasks in this paper.
arXiv Detail & Related papers (2021-10-06T06:17:05Z)
- Adaptive Nearest Neighbor Machine Translation [60.97183408140499]
kNN-MT combines pre-trained neural machine translation with token-level k-nearest-neighbor retrieval.
The traditional kNN algorithm simply retrieves the same number of nearest neighbors for each target token.
We propose Adaptive kNN-MT to dynamically determine the value of k for each target token.
arXiv Detail & Related papers (2021-05-27T09:27:42Z)
- Evaluating Deep Neural Network Ensembles by Majority Voting cum Meta-Learning scheme [3.351714665243138]
We propose an ensemble of seven independent Deep Neural Networks (DNNs) for a new data instance.
One-seventh of the data is deleted and replenished by bootstrap sampling from the remaining samples.
All the algorithms in this paper have been tested on five benchmark datasets.
arXiv Detail & Related papers (2021-05-09T03:10:56Z)
- KNN-enhanced Deep Learning Against Noisy Labels [4.765948508271371]
Supervised learning on Deep Neural Networks (DNNs) is data hungry.
In this work, we propose to apply deep KNN for label cleanup.
We iteratively train the neural network and update labels to simultaneously proceed towards higher label recovery rate and better classification performance.
arXiv Detail & Related papers (2020-12-08T05:21:29Z)
- A new hashing based nearest neighbors selection technique for big datasets [14.962398031252063]
This paper proposes a new technique that enables the selection of nearest neighbors directly in the neighborhood of a given observation.
The proposed approach consists of dividing the data space into subcells of a virtual grid built on top of data space.
Our algorithm outperforms the original KNN in time efficiency with a prediction quality as good as that of KNN.
arXiv Detail & Related papers (2020-04-05T19:36:00Z)