Instance-based learning using the Half-Space Proximal Graph
- URL: http://arxiv.org/abs/2102.02755v1
- Date: Thu, 4 Feb 2021 17:36:59 GMT
- Title: Instance-based learning using the Half-Space Proximal Graph
- Authors: Ariana Talamantes and Edgar Chavez
- Abstract summary: This paper presents a parameter-free instance-based learning algorithm using the Half-Space Proximal (HSP) graph.
The resulting classifier outperformed $k$NN for every $k$ on a battery of datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The primary example of instance-based learning is the $k$-nearest neighbor
rule (kNN), praised for its simplicity and the capacity to adapt to new unseen
data and toss away old data. The main disadvantages often mentioned are the
classification complexity, which is $O(n)$, and the estimation of the parameter
$k$, the number of nearest neighbors to be used. The use of indexes at
classification time lifts the former disadvantage, while there is no conclusive
method for the latter.
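For reference, a minimal brute-force $k$NN classifier looks like the sketch below; the $O(n)$ cost per query comes from scanning every training point (the Euclidean metric and the function names are illustrative choices, not taken from the paper).

```python
from collections import Counter
import numpy as np

def knn_classify(X_train, y_train, query, k=5):
    """Brute-force kNN: O(n) distance computations per query."""
    dists = np.linalg.norm(X_train - query, axis=1)  # distance to every training point
    nearest = np.argsort(dists)[:k]                  # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)     # simple majority rule
    return votes.most_common(1)[0][0]
```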
This paper presents a parameter-free instance-based learning algorithm using
the Half-Space Proximal (HSP) graph. The HSP neighbors simultaneously
possess proximity and variety concerning the center node. To classify a given
query, we compute its HSP neighbors and apply a simple majority rule over them.
In our experiments, the resulting classifier outperformed $k$NN for every choice of $k$ on a
battery of datasets. The improvement persists even when weighted majority rules are applied
to both the $k$NN and HSP classifiers.
Surprisingly, when using a probabilistic index to approximate the HSP graph
and consequently speeding up the classification task, our method could improve
its accuracy, in stark contrast with the $k$NN classifier, which worsens
with a probabilistic index.
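The classification rule described above can be sketched as follows, using the usual description of the HSP test: repeatedly take the closest remaining point as a neighbor, then discard every point that lies closer to that neighbor than to the query. This is an illustrative reading of the abstract with Euclidean distance assumed, not the authors' implementation.

```python
from collections import Counter
import numpy as np

def hsp_neighbors(query, X):
    """Half-Space Proximal neighbors of `query` among the rows of X."""
    d_to_q = np.linalg.norm(X - query, axis=1)
    remaining = list(range(len(X)))
    neighbors = []
    while remaining:
        v = min(remaining, key=lambda i: d_to_q[i])   # closest remaining point becomes a neighbor
        neighbors.append(v)
        # keep only the points that stay closer to the query than to the new neighbor
        remaining = [i for i in remaining
                     if i != v and d_to_q[i] < np.linalg.norm(X[i] - X[v])]
    return neighbors

def hsp_classify(X_train, y_train, query):
    """Parameter-free rule from the abstract: simple majority vote over the HSP neighbors."""
    votes = Counter(y_train[i] for i in hsp_neighbors(query, X_train))
    return votes.most_common(1)[0][0]
```

The discarding step is what gives the HSP neighbors both proximity and variety: each new neighbor removes the points lying in its half-space, so later neighbors must come from different directions around the query.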
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that uses the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
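The shape-operator estimate is beyond a short snippet, but a toy stand-in conveys the adaptive-$k$ idea: use a crude local-geometry proxy around each sample and let it shrink or grow the neighborhood. The PCA eigenvalue ratio below is a hypothetical simplification, not the estimator used in the paper.

```python
import numpy as np

def adaptive_k(X_train, query, k_min=3, k_max=25, probe=20):
    """Toy adaptive neighborhood size driven by a crude local-curvature proxy (hypothetical)."""
    dists = np.linalg.norm(X_train - query, axis=1)
    local = X_train[np.argsort(dists)[:probe]]              # small probe neighborhood around the query
    eigvals = np.linalg.eigvalsh(np.cov(local.T))           # local PCA spectrum (ascending)
    curvature = eigvals[0] / (eigvals[-1] + 1e-12)          # ~0 for nearly flat, ~1 for isotropic
    return int(round(k_max - curvature * (k_max - k_min)))  # flatter neighborhood -> larger k
```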
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- Resource saving taxonomy classification with k-mer distributions and machine learning [2.0196229393131726]
We propose to use $k$-mer distributions obtained from DNA as features to classify its taxonomic origin.
We show that our approach improves the classification on the genus level and achieves comparable results for the superkingdom and phylum level.
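A $k$-mer distribution is simply the normalized count of every length-$k$ substring of a sequence; a minimal featurizer for the 4-letter DNA alphabet could look like the sketch below, with the resulting vector fed to any downstream classifier (the helper name is ours, not from the paper).

```python
from itertools import product
import numpy as np

def kmer_distribution(seq, k=3, alphabet="ACGT"):
    """Normalized frequency of every length-k substring of a DNA sequence."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in index:                  # skip k-mers containing ambiguous bases such as 'N'
            counts[index[kmer]] += 1
    return counts / max(counts.sum(), 1)   # feature vector for a downstream classifier
```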
arXiv Detail & Related papers (2023-03-10T08:01:08Z)
- Systematically improving existing k-means initialization algorithms at nearly no cost, by pairwise-nearest-neighbor smoothing [1.2570180539670577]
We present a meta-method for initializing the $k$-means clustering algorithm called PNN-smoothing.
It consists in splitting a given dataset into $J$ random subsets, clustering each of them individually, and merging the resulting clusterings with the pairwise-nearest-neighbor method.
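A rough sketch of that recipe (split, cluster each part, merge the resulting centroids pairwise until $k$ remain) is given below; it merges by nearest weighted centroids as a stand-in for the exact PNN merge criterion, and assumes each random subset holds at least $k$ points.

```python
import numpy as np
from sklearn.cluster import KMeans

def pnn_smoothing_init(X, k, J=4, seed=0):
    """Sketch of PNN-smoothing: cluster J random subsets of X, then merge the
    J*k resulting centroids pairwise until only k starting centers remain."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), J)
    centers = np.vstack([KMeans(n_clusters=k, n_init=1, random_state=seed)
                         .fit(X[idx]).cluster_centers_ for idx in parts])
    weights = np.ones(len(centers))
    while len(centers) > k:                    # greedy pairwise-nearest-neighbor merging
        d = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(np.argmin(d), d.shape)
        merged = (weights[i] * centers[i] + weights[j] * centers[j]) / (weights[i] + weights[j])
        centers = np.vstack([np.delete(centers, [i, j], axis=0), merged])
        weights = np.append(np.delete(weights, [i, j]), weights[i] + weights[j])
    return centers                             # use these as the k-means starting configuration
```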
arXiv Detail & Related papers (2022-02-08T15:56:30Z)
- Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between the test image and its top-$k$ neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
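Conceptually the pipeline is just feature extraction followed by a nearest-neighbor vote; the sketch below assumes some pre-trained embedding function (the `embed` callable is hypothetical) and cosine similarity over the resulting representations.

```python
import numpy as np

def knn_on_features(embed, train_images, train_labels, test_image, k=10):
    """k-NN over pre-trained representations: embed everything, then vote among
    the k most similar training items under cosine similarity."""
    feats = np.stack([embed(x) for x in train_images])
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    q = embed(test_image)
    q = q / np.linalg.norm(q)
    sims = feats @ q                                   # cosine similarity to every training item
    top = np.argsort(-sims)[:k]
    labels, counts = np.unique(np.asarray(train_labels)[top], return_counts=True)
    return labels[np.argmax(counts)]
```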
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
- Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called under-bagging $k$-NN for imbalanced classification problems.
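One plausible reading of that idea in code: every bag under-samples each class down to the minority-class size, a $k$-NN vote is taken per bag, and the bags' predictions are aggregated by majority. This is a sketch under those assumptions, not the paper's exact estimator.

```python
from collections import Counter
import numpy as np

def under_bagging_knn(X, y, query, k=5, n_bags=10, seed=0):
    """Under-bagging k-NN sketch: balanced subsamples plus per-bag k-NN votes."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()                              # minority-class size
    bag_preds = []
    for _ in range(n_bags):
        idx = np.concatenate([rng.choice(np.where(y == c)[0], n_min, replace=False)
                              for c in classes])      # under-sample every class to n_min
        dists = np.linalg.norm(X[idx] - query, axis=1)
        nearest = idx[np.argsort(dists)[:k]]
        bag_preds.append(Counter(y[i] for i in nearest).most_common(1)[0][0])
    return Counter(bag_preds).most_common(1)[0][0]    # aggregate the bags by majority vote
```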
arXiv Detail & Related papers (2021-09-01T14:10:38Z)
- IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Existing methods have to trade off high accuracy against load balance and scalability in distributed settings.
We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data.
We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z)
- Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
- Cluster-and-Conquer: When Randomness Meets Graph Locality [0.0]
Some of the most efficient KNN graph algorithms are incremental and local.
Cluster-and-Conquer boosts the starting configuration of greedy algorithms.
arXiv Detail & Related papers (2020-10-22T07:31:12Z)
- OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax Layer [77.90012156266324]
This paper aims to find a subspace of neural networks that can facilitate a large decision margin.
We propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes.
Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets.
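One plausible way to keep classification weight vectors orthogonal throughout training is simply to fix them to a non-trainable orthogonal matrix, as in the PyTorch sketch below; this illustrates the idea and is not necessarily OSLNet's exact layer.

```python
import torch
import torch.nn as nn

class OrthogonalSoftmaxLayer(nn.Module):
    """Classification layer with fixed, mutually orthogonal weight vectors, so they
    remain orthogonal during training and testing (assumes in_features >= num_classes)."""
    def __init__(self, in_features, num_classes):
        super().__init__()
        # One random orthonormal column per class, registered as a buffer so the
        # optimizer never updates it.
        q = torch.linalg.qr(torch.randn(in_features, num_classes)).Q
        self.register_buffer("weight", q)

    def forward(self, x):
        return x @ self.weight   # logits; combine with softmax / cross-entropy outside
```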
arXiv Detail & Related papers (2020-04-20T02:41:01Z)
- A new hashing based nearest neighbors selection technique for big datasets [14.962398031252063]
This paper proposes a new technique that enables the selection of nearest neighbors directly in the neighborhood of a given observation.
The proposed approach consists of dividing the data space into subcells of a virtual grid built on top of data space.
Our algorithm outperforms the original KNN in time efficiency with a prediction quality as good as that of KNN.
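The virtual-grid idea can be sketched as hashing every point to the subcell that contains it and, at query time, ranking only the candidates found in the query's cell and the adjacent cells; the cell size and helper names below are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product
import numpy as np

def build_grid(X, cell_size):
    """Hash every point to the subcell of a virtual grid that contains it."""
    grid = defaultdict(list)
    for i, x in enumerate(X):
        grid[tuple((x // cell_size).astype(int))].append(i)
    return grid

def grid_neighbors(grid, X, query, cell_size, k=5):
    """Rank only the candidates from the query's cell and its adjacent cells."""
    cell = (query // cell_size).astype(int)
    candidates = []
    for offset in product((-1, 0, 1), repeat=len(cell)):   # query cell plus its neighbors
        candidates.extend(grid.get(tuple(cell + np.array(offset)), []))
    candidates.sort(key=lambda i: np.linalg.norm(X[i] - query))
    return candidates[:k]
```

Scanning the 3^d adjacent cells is only practical in low dimensions; in higher dimensions the grid is typically built on a projected or hashed version of the data.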
arXiv Detail & Related papers (2020-04-05T19:36:00Z)