Social Distancing is Good for Points too!
- URL: http://arxiv.org/abs/2006.15650v1
- Date: Sun, 28 Jun 2020 16:49:59 GMT
- Title: Social Distancing is Good for Points too!
- Authors: Alejandro Flores-Velazco
- Abstract summary: We show that FCNN can behave poorly when points are too close to each other.
We then successfully modify the algorithm to avoid such cases.
This modification is sufficient to prove useful upper-bounds, along with approximation guarantees for the algorithm.
- Score: 91.3755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The nearest-neighbor rule is a well-known classification technique that,
given a training set P of labeled points, classifies any unlabeled query point
with the label of its closest point in P. The nearest-neighbor condensation
problem aims to reduce the training set without harming the accuracy of the
nearest-neighbor rule.
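As a concrete illustration of the rule (a minimal sketch; the function and variable names are ours, not the paper's):

```python
import numpy as np

def nn_classify(P, labels, q):
    """Nearest-neighbor rule: label q with the label of its closest point in P."""
    dists = np.linalg.norm(P - q, axis=1)  # distance from q to every training point
    return labels[np.argmin(dists)]        # label of the nearest neighbor

# Tiny example: two labeled training points and one query.
P = np.array([[0.0, 0.0], [3.0, 3.0]])
labels = np.array(["red", "blue"])
print(nn_classify(P, labels, np.array([0.5, 0.2])))  # -> red
```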
FCNN is the most popular algorithm for condensation. It is heuristic in
nature, and theoretical results for it are scarce. In this paper, we settle the
question of whether reasonable upper-bounds can be proven for the size of the
subset selected by FCNN. First, we show that the algorithm can behave poorly
when points are too close to each other, forcing it to select many more points
than necessary. We then successfully modify the algorithm to avoid such cases,
thus imposing that selected points should "keep some distance". This
modification is sufficient to prove useful upper-bounds, along with
approximation guarantees for the algorithm.
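To make this concrete, here is a simplified sketch of an FCNN-style condensation loop, together with a hypothetical minimum-distance filter in the spirit of the "keep some distance" modification; the parameter `alpha`, the seeding, and the selection details are our assumptions, not the authors' exact algorithm:

```python
import numpy as np

def fcnn_sketch(P, labels, alpha=0.0):
    """Simplified FCNN-style condensation with an optional distance filter.

    Seeds one point per class, then repeatedly adds, for each selected point,
    the nearest wrongly-labeled point among the points it currently "owns"
    (its cell in the selection). With alpha > 0, candidates closer than alpha
    to an already-selected point are skipped ("keep some distance").
    """
    n = len(P)
    S = []  # indices of selected points
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = P[idx].mean(axis=0)
        S.append(idx[np.argmin(np.linalg.norm(P[idx] - centroid, axis=1))])

    while True:
        # Assign every training point to its nearest selected point.
        D = np.linalg.norm(P[:, None, :] - P[S][None, :, :], axis=2)
        owner = np.asarray(S)[np.argmin(D, axis=1)]
        additions = []
        for s in S:
            # Points owned by s whose label disagrees with s's label.
            cell = [i for i in range(n) if owner[i] == s and labels[i] != labels[s]]
            if not cell:
                continue
            cand = min(cell, key=lambda i: np.linalg.norm(P[i] - P[s]))
            # Hypothetical "social distancing" filter (our stand-in).
            if alpha > 0 and min(np.linalg.norm(P[cand] - P[j]) for j in S) < alpha:
                continue
            additions.append(cand)
        additions = [i for i in dict.fromkeys(additions) if i not in S]
        if not additions:
            return np.asarray(S)  # indices of the condensed subset
        S.extend(additions)
```

With `alpha = 0` the loop behaves like plain FCNN-style condensation; a positive `alpha` refuses candidates that land too close to points already selected, which is exactly the clustering behavior the paper identifies as the source of oversized selections.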
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble [1.8843687952462742]
A new optimal extended neighborhood rule based ensemble method is proposed in this paper.
The ensemble is compared with state-of-the-art methods on 17 benchmark datasets using accuracy, Cohen's kappa, and Brier score (BS).
arXiv Detail & Related papers (2022-11-21T09:13:54Z)
- Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-$k$ Curve (AUTKC).
AUTKC has better discrimination ability, and its Bayes optimal score function can give a correct top-$K$ ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z)
- Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
- Improved Search of Relevant Points for Nearest-Neighbor Classification [91.3755431537592]
We say a training point is relevant if its omission from the training set could induce the misclassification of some query point in $\mathbb{R}^d$.
An output-sensitive algorithm was proposed to find the set of border points of $P$ in $O(n^2 + nk^2)$ time, where $k$ is the size of such set.
In this paper, we improve this algorithm to have time complexity equal to $O(nk^2)$, by proving that the first steps of their algorithm, which require $O(n^2)$ time, are unnecessary.
arXiv Detail & Related papers (2022-03-07T18:10:27Z)
- Finding Relevant Points for Nearest-Neighbor Classification [0.456877715768796]
In nearest-neighbor classification problems, a set of $d$-dimensional training points is given, each with a known classification, and is used to infer unknown classifications of other points.
A training point is relevant if its omission from the training set would change the outcome of these inferences; a brute-force check of this definition is sketched after this list.
arXiv Detail & Related papers (2021-10-12T16:58:29Z)
- Resolving the Optimal Metric Distortion Conjecture [27.344649091365067]
We study the following metric distortion problem.
There are two finite sets of points, $V$ and $C$, that lie in the same metric space.
Each point of $V$ ranks the points of $C$ by increasing distance, and we propose algorithms that choose a point in $C$ using only these rankings as input; a small distortion-evaluation sketch follows this list.
arXiv Detail & Related papers (2020-04-16T04:13:06Z)
- Coresets for the Nearest-Neighbor Rule [78.15296214629433]
Nearest-neighbor condensation deals with finding a subset $R \subseteq P$ that preserves the classifications made by the nearest-neighbor rule over $P$.
This paper introduces the concept of coresets for nearest-neighbor classification.
We propose quadratic-time approximation algorithms with provable upper-bounds on the size of their selected subsets.
arXiv Detail & Related papers (2020-02-16T19:00:48Z)
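As referenced in the "Finding Relevant Points" entry above, here is a brute-force sketch of the relevance definition (ours, for illustration; the papers define relevance over all query points in $\mathbb{R}^d$, while this check uses a finite query sample and so can only under-report relevance):

```python
import numpy as np

def nn_label(P, labels, q):
    """1-NN label of query q over training set P."""
    return labels[np.argmin(np.linalg.norm(P - q, axis=1))]

def relevant_points_bruteforce(P, labels, queries):
    """Indices of training points whose removal flips the 1-NN label
    of at least one of the given queries."""
    relevant = []
    for i in range(len(P)):
        P_minus = np.delete(P, i, axis=0)          # training set without point i
        labels_minus = np.delete(labels, i)
        if any(nn_label(P, labels, q) != nn_label(P_minus, labels_minus, q)
               for q in queries):
            relevant.append(i)
    return relevant
```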
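And, as referenced in the metric distortion entry, a small sketch that evaluates the distortion of a ranking-based choice. Here the social cost of a candidate is its total distance to all of $V$, and distortion is the ratio of the chosen candidate's cost to the optimum's; the plurality rule below is a deliberately naive stand-in for the paper's algorithms, which also see only the rankings:

```python
import numpy as np

def distortion(V, C, chosen):
    """Social cost of the chosen candidate divided by the optimal social
    cost. The metric is used only to evaluate the choice; the choosing
    rule itself sees only each voter's distance ranking over C."""
    costs = np.array([np.linalg.norm(V - c, axis=1).sum() for c in C])
    return costs[chosen] / costs.min()

# Example: pick the candidate ranked first by the most voters (plurality).
V = np.array([[0.0], [0.1], [5.0]])   # voters
C = np.array([[0.0], [4.0]])          # candidates
first_choices = [np.argmin(np.linalg.norm(C - v, axis=1)) for v in V]
plurality = max(set(first_choices), key=first_choices.count)
print(distortion(V, C, plurality))    # 1.0 in this toy instance
```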
This list is automatically generated from the titles and abstracts of the papers in this site.