A k nearest neighbours classifiers ensemble based on extended
neighbourhood rule and features subsets
- URL: http://arxiv.org/abs/2205.15111v1
- Date: Mon, 30 May 2022 13:57:32 GMT
- Title: A k nearest neighbours classifiers ensemble based on extended
neighbourhood rule and features subsets
- Authors: Amjad Ali, Muhammad Hamraz, Naz Gul, Dost Muhammad Khan, Zardad Khan,
Saeed Aldahmani
- Abstract summary: kNN-based ensemble methods minimise the effect of outliers by identifying a set of data points in the given feature space that are nearest to an unseen observation.
This paper proposes a k nearest neighbour ensemble where the neighbours are determined in k steps.
- Score: 0.4709844746265484
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: kNN-based ensemble methods minimise the effect of outliers by identifying a
set of data points in the given feature space that are nearest to an unseen
observation, and predicting its response by majority voting. Ordinary
kNN-based ensembles find the k nearest observations in a region (bounded by a
sphere) for a predefined value of k. This, however, may fail when the test
observation follows the pattern of its closest same-class data points, which
lie along a path not contained in that sphere. This paper proposes a k nearest
neighbour ensemble in which the neighbours are determined in k steps. Starting
from the nearest observation to the test point, the algorithm at each step
identifies the single observation closest to the observation found at the
previous step. In each base learner of the ensemble, this search is carried
out for k steps on a random bootstrap sample with a random subset of features
selected from the feature space. The final predicted class of the test point
is determined by a majority vote over the classes predicted by all base
models. The new ensemble method is applied to 17 benchmark datasets and
compared with classical methods, including kNN-based models, using
classification accuracy, kappa and the Brier score as performance metrics.
Boxplots illustrate the differences between the results of the proposed
method and those of other state-of-the-art methods. The proposed method
outperforms the classical methods in the majority of cases. A detailed
simulation study is also given for further assessment.
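For illustration, here is a minimal Python sketch of the extended neighbourhood rule as the abstract describes it: a k-step chain of single nearest neighbours per base learner, each learner built on a bootstrap sample with a random feature subset, and a final majority vote. All names and defaults (exnrule_predict, n_models=50, Euclidean distance, sqrt(p) features) are our own assumptions, not the paper's code.

```python
import numpy as np
from collections import Counter

def extended_neighbours(X, y, x_test, k):
    """k-step neighbour chain: step 1 finds the point nearest to the test
    observation; each later step finds the point nearest to the neighbour
    found at the previous step (Euclidean distance assumed)."""
    remaining = list(range(len(X)))
    chain, current = [], x_test
    for _ in range(k):
        dists = [np.linalg.norm(X[i] - current) for i in remaining]
        j = remaining.pop(int(np.argmin(dists)))
        chain.append(j)
        current = X[j]  # the chain continues from this neighbour
    return y[chain]

def exnrule_predict(X, y, x_test, k=5, n_models=50, n_feats=None, seed=0):
    """Each base learner runs the chain on a bootstrap sample with a random
    feature subset; the final class is a majority vote over base learners."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_feats = n_feats or max(1, int(np.sqrt(p)))  # subset size: our default
    votes = []
    for _ in range(n_models):
        rows = rng.integers(0, n, size=n)                  # bootstrap rows
        cols = rng.choice(p, size=n_feats, replace=False)  # feature subset
        labels = extended_neighbours(X[np.ix_(rows, cols)], y[rows],
                                     x_test[cols], k)
        # One natural reading: each model votes the majority class of its chain.
        votes.append(Counter(labels).most_common(1)[0][0])
    return Counter(votes).most_common(1)[0][0]  # majority vote over models
```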
Related papers
- Class-Conditional Conformal Prediction with Many Classes [60.8189977620604]
We propose a method called clustered conformal prediction that clusters together classes having "similar" conformal scores.
We find that clustered conformal prediction typically outperforms existing methods in terms of class-conditional coverage and set size metrics.
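A rough sketch of the idea as we read it (not the authors' code): group classes by the empirical quantiles of their calibration scores, then compute one conformal threshold per cluster rather than per class. Here cal_scores holds each calibration point's nonconformity score for its true label, and test_scores[i, c] the score of test point i for integer class c; both conventions are assumptions of this sketch.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustered_conformal_sets(cal_scores, cal_labels, test_scores,
                             n_clusters=3, alpha=0.1):
    classes = np.unique(cal_labels)
    # Summarise each class by quantiles of its calibration score distribution.
    qs = np.array([np.quantile(cal_scores[cal_labels == c], [0.25, 0.5, 0.75])
                   for c in classes])
    cluster_of = dict(zip(classes,
                          KMeans(n_clusters, n_init=10).fit_predict(qs)))
    # One conformal threshold per cluster of classes, at level 1 - alpha.
    thresh = {}
    for g in set(cluster_of.values()):
        pooled = np.concatenate([cal_scores[cal_labels == c]
                                 for c in classes if cluster_of[c] == g])
        r = int(np.ceil((len(pooled) + 1) * (1 - alpha)))
        thresh[g] = np.sort(pooled)[min(r, len(pooled)) - 1]
    # Prediction set: every class whose test score clears its cluster's bar.
    return [[c for c in classes if test_scores[i, c] <= thresh[cluster_of[c]]]
            for i in range(len(test_scores))]
```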
arXiv Detail & Related papers (2023-06-15T17:59:02Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct distance matrix between data points by Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- A Random Projection k Nearest Neighbours Ensemble for Classification via Extended Neighbourhood Rule [0.5052937880533719]
Ensembles based on k nearest neighbours (kNN) combine a large number of base learners.
The RPExNRule ensemble is proposed, in which bootstrap samples drawn from the given training data are randomly projected into lower dimensions.
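A minimal sketch of that projection step (the names here are ours, and a Gaussian projection is an assumption; the paper may use a different random projection):

```python
import numpy as np

def random_projection_bootstrap(X, d, rng):
    """One RPExNRule-style base sample, as we read it: bootstrap the rows,
    then project the bootstrapped data into d dimensions with a random
    Gaussian matrix (scaled to roughly preserve pairwise distances)."""
    n, p = X.shape
    rows = rng.integers(0, n, size=n)           # bootstrap sample
    R = rng.normal(size=(p, d)) / np.sqrt(d)    # random projection matrix
    return X[rows] @ R, rows                    # projected sample + row ids
```

Each base learner would then apply the extended neighbourhood rule in the projected space.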
arXiv Detail & Related papers (2023-03-21T21:58:59Z)
- Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble [1.8843687952462742]
A new optimal extended neighborhood rule based ensemble method is proposed in this paper.
The ensemble is compared with state-of-the-art methods on 17 benchmark datasets using accuracy, Cohen's kappa, and the Brier score (BS).
arXiv Detail & Related papers (2022-11-21T09:13:54Z)
- An enhanced method of initial cluster center selection for K-means algorithm [0.0]
We propose a novel approach to improving initial cluster selection for the K-means algorithm.
The Convex Hull algorithm facilitates computing the first two centroids, and the remaining ones are selected according to their distance from the previously selected centers.
We obtained only 7.33%, 7.90%, and 0% clustering error on the Iris, Letter, and Ruspini data, respectively.
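A sketch under one plausible reading (the exact hull-based rule is not spelled out in the summary; taking the farthest pair of hull vertices as the first two centers is our assumption):

```python
import numpy as np
from scipy.spatial import ConvexHull

def initial_centers(X, k):
    """Seed the first two centers with the two mutually farthest convex
    hull vertices; each later center is the data point farthest (in
    min-distance) from all centers chosen so far."""
    hull = X[ConvexHull(X).vertices]
    d = np.linalg.norm(hull[:, None] - hull[None, :], axis=-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = [hull[i], hull[j]]                # farthest pair of hull points
    while len(centers) < k:
        dmin = np.min(np.linalg.norm(X[:, None] - np.array(centers)[None, :],
                                     axis=-1), axis=1)
        centers.append(X[np.argmax(dmin)])      # farthest from chosen centers
    return np.array(centers)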
arXiv Detail & Related papers (2022-10-18T00:58:50Z)
- Gradient Based Clustering [72.15857783681658]
We propose a general approach for distance based clustering, using the gradient of the cost function that measures clustering quality.
The approach is an iterative two step procedure (alternating between cluster assignment and cluster center updates) and is applicable to a wide range of functions.
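A minimal sketch for the k-means cost specifically, assuming squared Euclidean distance and a plain gradient step on the centers (the step size lr is our choice, not the paper's):

```python
import numpy as np

def gradient_clustering(X, k, lr=0.1, iters=100, seed=0):
    """Alternate (1) nearest-center assignment with (2) a gradient step on
    the cost sum_i ||x_i - c_{a(i)}||^2, instead of Lloyd's exact mean update."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        assign = np.argmin(np.linalg.norm(X[:, None] - C[None], axis=-1),
                           axis=1)                       # step 1: assignment
        for j in range(k):                               # step 2: gradient
            members = X[assign == j]
            if len(members):
                # grad wrt c_j of the cost is 2 * n_j * (c_j - mean(members))
                grad = 2 * len(members) * (C[j] - members.mean(0))
                C[j] -= lr * grad / len(X)               # scaled step
    return C, assign
```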
arXiv Detail & Related papers (2022-02-01T19:31:15Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
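A sketch of the standard randomized recipe this summary points at, assuming A has full column rank (the paper's rank-revealing machinery, which handles arbitrary rank, is omitted here):

```python
import numpy as np

def approx_leverage_scores(A, oversample=2, seed=0):
    """Sketch A with a Gaussian map, QR-factorise the sketch to get R, and
    use the squared row norms of A @ inv(R) as approximate leverage scores."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    s = oversample * n                        # sketch size
    S = rng.normal(size=(s, m)) / np.sqrt(s)  # Gaussian sketching matrix
    _, R = np.linalg.qr(S @ A)                # R approximates A's column scale
    B = np.linalg.solve(R.T, A.T).T           # B = A @ inv(R), no explicit inverse
    return np.einsum('ij,ij->i', B, B)        # squared row norms ~ leverages
```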
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
- K-Nearest Neighbour and Support Vector Machine Hybrid Classification [0.0]
The technique uses K-Nearest Neighbour classification for test samples satisfying a proximity condition.
For every test sample separated out, a Support Vector Machine is trained on the sifted training patterns associated with it, and the test sample is then classified by that SVM.
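A sketch under one reading of that pipeline: unanimity among the k nearest neighbours stands in for the proximity condition, and the "sifted" patterns are taken to be the m nearest training points (both are assumptions of this sketch):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def hybrid_predict(X_tr, y_tr, X_te, k=5, m=50):
    """kNN for 'easy' test points, a locally trained SVM for ambiguous ones."""
    nn = NearestNeighbors(n_neighbors=m).fit(X_tr)
    _, idx = nn.kneighbors(X_te)   # m nearest training rows per test point
    preds = []
    for i, row in enumerate(idx):
        if len(np.unique(y_tr[row[:k]])) == 1:     # proximity condition:
            preds.append(y_tr[row[0]])             # unanimous k neighbours
        else:                                      # ambiguous: local SVM on
            svm = SVC().fit(X_tr[row], y_tr[row])  # the m nearest patterns
            preds.append(svm.predict(X_te[i:i + 1])[0])
    return np.array(preds)
```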
arXiv Detail & Related papers (2020-06-28T15:26:56Z)
- Clustering Binary Data by Application of Combinatorial Optimization Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters.
Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics.
On a set of 16 data tables generated by a quasi-Monte Carlo experiment, one of the aggregation criteria, under L1 dissimilarity, is compared with hierarchical clustering and a version of k-means: partitioning around medoids (PAM).
arXiv Detail & Related papers (2020-01-06T23:33:31Z)