K-Nearest Neighbour and Support Vector Machine Hybrid Classification
- URL: http://arxiv.org/abs/2007.00045v1
- Date: Sun, 28 Jun 2020 15:26:56 GMT
- Title: K-Nearest Neighbour and Support Vector Machine Hybrid Classification
- Authors: A. M. Hafiz
- Abstract summary: The technique uses K-Nearest Neighbour classification for test samples that satisfy a proximity condition.
For every separated test sample, a Support Vector Machine is trained on the sifted training-set patterns associated with it and used to classify that sample.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, a novel K-Nearest Neighbour and Support Vector Machine hybrid
classification technique is proposed that is simple and robust. It is based on
the concept of discriminative nearest-neighbourhood classification. The
technique uses K-Nearest Neighbour classification for test samples that
satisfy a proximity condition; patterns that do not pass the condition are
separated. The training set is then sifted for a fixed number of patterns per
class that are closest, by the Euclidean distance metric, to each separated
test pattern. For every separated test sample, a Support Vector Machine is
trained on the sifted training-set patterns associated with it and used to
classify that sample. The proposed technique has been compared to the state of
the art in this research area on three datasets: the United States Postal
Service (USPS) Handwritten Digit Dataset, the MNIST Dataset, and an Arabic
numeral dataset, the Modified Arabic Digits Database (MADB). The algorithm
generally outperforms the other algorithms with which it has been compared.
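A minimal sketch of the pipeline described above, assuming one plausible proximity condition (all k nearest neighbours of a test sample agreeing on a label), since the abstract does not spell the condition out; the per-class sift size m and the RBF kernel are likewise illustrative choices, not the paper's settings:

```python
# Hedged sketch of the hybrid classifier described in the abstract -- not the
# author's code. Assumptions: the proximity condition (unspecified in the
# abstract) is taken as "all k nearest neighbours share one label"; the
# per-class sift size m and the RBF kernel are illustrative choices.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def hybrid_knn_svm_predict(X_train, y_train, X_test, k=5, m=20):
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors(X_test)        # k nearest training indices per test sample
    preds = np.empty(len(X_test), dtype=y_train.dtype)
    for i, neighbours in enumerate(idx):
        labels = y_train[neighbours]
        if np.all(labels == labels[0]):   # proximity condition met: plain KNN decision
            preds[i] = labels[0]
        else:                             # separated sample: sift m closest patterns per class
            d = np.linalg.norm(X_train - X_test[i], axis=1)   # Euclidean distances
            keep = np.concatenate([
                np.where(y_train == c)[0][np.argsort(d[y_train == c])[:m]]
                for c in np.unique(y_train)
            ])
            svm = SVC(kernel="rbf").fit(X_train[keep], y_train[keep])  # per-sample SVM
            preds[i] = svm.predict(X_test[i:i + 1])[0]
    return preds
```

Only the samples that fail the proximity check pay the cost of a freshly trained SVM; the rest are settled by the plain KNN decision, which keeps the hybrid cheap on easy inputs.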
Related papers
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Large-Scale Open-Set Classification Protocols for ImageNet [0.0]
Open-Set Classification (OSC) intends to adapt closed-set classification models to real-world scenarios.
We propose three open-set protocols that provide rich datasets of natural images with different levels of similarity between known and unknown classes.
We propose a new validation metric that can be employed to assess whether the training of deep learning models addresses both the classification of known samples and the rejection of unknown samples.
arXiv Detail & Related papers (2022-10-13T07:01:34Z)
- A k nearest neighbours classifiers ensemble based on extended neighbourhood rule and features subsets [0.4709844746265484]
kNN based ensemble methods minimise the effect of outliers by identifying a set of data points in the given feature space that are nearest to an unseen observation.
This paper proposes a k nearest neighbour ensemble where the neighbours are determined in k steps (see the sketch after this list).
arXiv Detail & Related papers (2022-05-30T13:57:32Z)
- Learning to Hash Naturally Sorts [84.90210592082829]
We introduce Naturally-Sorted Hashing (NSH) to train a deep hashing model with sorted results end-to-end.
NSH sorts the Hamming distances of samples' hash codes and accordingly gathers their latent representations for self-supervised training.
We describe a novel Sorted Noise-Contrastive Estimation (SortedNCE) loss that selectively picks positive and negative samples for contrastive learning.
arXiv Detail & Related papers (2022-01-31T16:19:02Z)
- Overhead-MNIST: Machine Learning Baselines for Image Classification [0.0]
Twenty-three machine learning algorithms were trained and then scored to establish baseline comparison metrics.
The Overhead-MNIST dataset is a collection of satellite images similar in style to the ubiquitous MNIST hand-written digits.
We present results for the overall best performing algorithm as a baseline for edge deployability and future performance improvement.
arXiv Detail & Related papers (2021-07-01T13:30:39Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- Predictive K-means with local models [0.028675177318965035]
Predictive clustering seeks to obtain the best of both worlds, clustering and prediction.
We present two new algorithms using this technique and show on a variety of data sets that they are competitive for prediction performance.
arXiv Detail & Related papers (2020-12-16T10:49:36Z)
- A Method for Handling Multi-class Imbalanced Data by Geometry based Information Sampling and Class Prioritized Synthetic Data Generation (GICaPS) [15.433936272310952]
This paper looks into the handling of imbalanced data in a multi-label classification problem.
Two novel methods are proposed that exploit the geometric relationship between the feature vectors.
The efficacy of the proposed methods is analyzed by solving a generic multi-class recognition problem.
arXiv Detail & Related papers (2020-10-11T04:04:26Z)
- CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements.
In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data.
The search strategy is learned in a self-supervised manner; we evaluate the proposed algorithm on multi-homography estimation and demonstrate accuracy superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)
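As referenced in the extended-neighbourhood-rule entry above, here is a minimal sketch of "neighbours determined in k steps", under the assumption that each step selects the not-yet-used training point closest to the previously selected one, starting from the test point; the paper's exact rule may differ:

```python
# Hedged sketch of "neighbours determined in k steps" -- an assumed reading,
# not the paper's code: each step picks the unused training point nearest to
# the previously selected one, starting from the test point.
import numpy as np

def k_step_neighbours(X_train, x_test, k=5):
    chain, current = [], x_test
    used = np.zeros(len(X_train), dtype=bool)
    for _ in range(k):
        d = np.linalg.norm(X_train - current, axis=1)
        d[used] = np.inf                  # exclude points already on the chain
        j = int(np.argmin(d))
        chain.append(j)
        used[j] = True
        current = X_train[j]              # next step measures from this neighbour
    return chain                          # e.g. majority-vote over y_train[chain]
```

Per that paper's title, the full method repeats this selection over random feature subsets to form the ensemble; that part is omitted here.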
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.