Social Distancing is Good for Points too!
- URL: http://arxiv.org/abs/2006.15650v1
- Date: Sun, 28 Jun 2020 16:49:59 GMT
- Title: Social Distancing is Good for Points too!
- Authors: Alejandro Flores-Velazco
- Abstract summary: We show that FCNN can behave poorly when points are too close to each other.
We then successfully modify the algorithm to avoid such cases.
This modification is sufficient to prove useful upper-bounds, along with approximation guarantees for the algorithm.
- Score: 91.3755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The nearest-neighbor rule is a well-known classification technique that,
given a training set P of labeled points, classifies any unlabeled query point
with the label of its closest point in P. The nearest-neighbor condensation
problem aims to reduce the training set without harming the accuracy of the
nearest-neighbor rule.
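As a concrete illustration of the rule (a minimal sketch; the function and variable names are ours, not the paper's):

```python
import numpy as np

def nn_classify(P, labels, q):
    """Nearest-neighbor rule: label q with the label of its closest point in P."""
    dists = np.linalg.norm(P - q, axis=1)  # distance from q to every training point
    return labels[np.argmin(dists)]        # label of the nearest neighbor

# Tiny example: two labeled training points and one query.
P = np.array([[0.0, 0.0], [3.0, 3.0]])
labels = np.array(["red", "blue"])
print(nn_classify(P, labels, np.array([0.5, 0.2])))  # -> red
```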
FCNN is the most popular algorithm for condensation. It is heuristic in
nature, and theoretical results for it are scarce. In this paper, we settle the
question of whether reasonable upper-bounds can be proven for the size of the
subset selected by FCNN. First, we show that the algorithm can behave poorly
when points are too close to each other, forcing it to select many more points
than necessary. We then successfully modify the algorithm to avoid such cases,
thus imposing that selected points should "keep some distance". This
modification is sufficient to prove useful upper-bounds, along with
approximation guarantees for the algorithm.
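To make this concrete, here is a simplified sketch of an FCNN-style condensation loop, together with a hypothetical minimum-distance filter in the spirit of the "keep some distance" modification; the parameter `alpha`, the seeding, and the selection details are our assumptions, not the authors' exact algorithm:

```python
import numpy as np

def fcnn_sketch(P, labels, alpha=0.0):
    """Simplified FCNN-style condensation with an optional distance filter.

    Seeds one point per class, then repeatedly adds, for each selected point,
    the nearest wrongly-labeled point among the points it currently "owns"
    (its cell in the selection). With alpha > 0, candidates closer than alpha
    to an already-selected point are skipped ("keep some distance").
    """
    n = len(P)
    S = []  # indices of selected points
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centroid = P[idx].mean(axis=0)
        S.append(idx[np.argmin(np.linalg.norm(P[idx] - centroid, axis=1))])

    while True:
        # Assign every training point to its nearest selected point.
        D = np.linalg.norm(P[:, None, :] - P[S][None, :, :], axis=2)
        owner = np.asarray(S)[np.argmin(D, axis=1)]
        additions = []
        for s in S:
            # Points owned by s whose label disagrees with s's label.
            cell = [i for i in range(n) if owner[i] == s and labels[i] != labels[s]]
            if not cell:
                continue
            cand = min(cell, key=lambda i: np.linalg.norm(P[i] - P[s]))
            # Hypothetical "social distancing" filter (our stand-in).
            if alpha > 0 and min(np.linalg.norm(P[cand] - P[j]) for j in S) < alpha:
                continue
            additions.append(cand)
        additions = [i for i in dict.fromkeys(additions) if i not in S]
        if not additions:
            return np.asarray(S)  # indices of the condensed subset
        S.extend(additions)
```

With `alpha = 0` the loop behaves like plain FCNN-style condensation; a positive `alpha` refuses candidates that land too close to points already selected, which is exactly the clustering behavior the paper identifies as the source of oversized selections.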
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- Optimal Extended Neighbourhood Rule $k$ Nearest Neighbours Ensemble [1.8843687952462742]
A new optimal extended neighborhood rule based ensemble method is proposed in this paper.
The ensemble is compared with state-of-the-art methods on 17 benchmark datasets using accuracy, Cohen's kappa, and Brier score (BS).
arXiv Detail & Related papers (2022-11-21T09:13:54Z)
- Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-$k$ Curve (AUTKC).
AUTKC has better discrimination ability, and its Bayes optimal score function can give a correct top-$K$ ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z)
- Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
- Improved Search of Relevant Points for Nearest-Neighbor Classification [91.3755431537592]
We say a training point is relevant if its omission from the training set could induce the misclassification of some query point in $\mathbb{R}^d$.
An output-sensitive algorithm was proposed to find the set of border points of $P$ in $O(n^2 + nk^2)$ time, where $k$ is the size of such set.
In this paper, we improve this algorithm to have time complexity equal to $O(nk^2)$, by proving that the first steps of their algorithm, which require $O(n^2)$ time, are unnecessary.
arXiv Detail & Related papers (2022-03-07T18:10:27Z)
- Finding Relevant Points for Nearest-Neighbor Classification [0.456877715768796]
In nearest-neighbor classification problems, a set of $d$-dimensional training points is given, each with a known classification, and is used to infer unknown classifications of other points.
A training point is relevant if its omission from the training set would change the outcome of these inferences; a brute-force check of this definition is sketched after this list.
arXiv Detail & Related papers (2021-10-12T16:58:29Z)
- Resolving the Optimal Metric Distortion Conjecture [27.344649091365067]
We study the following metric distortion problem.
There are two finite sets of points, $V$ and $C$, that lie in the same metric space.
Each point of $V$ ranks the points of $C$ by increasing distance, and we propose algorithms that choose a point in $C$ using only these rankings as input; a small distortion-evaluation sketch follows this list.
arXiv Detail & Related papers (2020-04-16T04:13:06Z)
- Coresets for the Nearest-Neighbor Rule [78.15296214629433]
Nearest-neighbor condensation deals with finding a subset $R \subseteq P$ that preserves the classifications made by the nearest-neighbor rule over $P$.
This paper introduces the concept of coresets for nearest-neighbor classification.
We propose quadratic-time approximation algorithms with provable upper-bounds on the size of their selected subsets.
arXiv Detail & Related papers (2020-02-16T19:00:48Z)
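As referenced in the "Finding Relevant Points" entry above, here is a brute-force sketch of the relevance definition (ours, for illustration; the papers define relevance over all query points in $\mathbb{R}^d$, while this check uses a finite query sample and so can only under-report relevance):

```python
import numpy as np

def nn_label(P, labels, q):
    """1-NN label of query q over training set P."""
    return labels[np.argmin(np.linalg.norm(P - q, axis=1))]

def relevant_points_bruteforce(P, labels, queries):
    """Indices of training points whose removal flips the 1-NN label
    of at least one of the given queries."""
    relevant = []
    for i in range(len(P)):
        P_minus = np.delete(P, i, axis=0)          # training set without point i
        labels_minus = np.delete(labels, i)
        if any(nn_label(P, labels, q) != nn_label(P_minus, labels_minus, q)
               for q in queries):
            relevant.append(i)
    return relevant
```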
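And, as referenced in the metric distortion entry, a small sketch that evaluates the distortion of a ranking-based choice. Here the social cost of a candidate is its total distance to all of $V$, and distortion is the ratio of the chosen candidate's cost to the optimum's; the plurality rule below is a deliberately naive stand-in for the paper's algorithms, which also see only the rankings:

```python
import numpy as np

def distortion(V, C, chosen):
    """Social cost of the chosen candidate divided by the optimal social
    cost. The metric is used only to evaluate the choice; the choosing
    rule itself sees only each voter's distance ranking over C."""
    costs = np.array([np.linalg.norm(V - c, axis=1).sum() for c in C])
    return costs[chosen] / costs.min()

# Example: pick the candidate ranked first by the most voters (plurality).
V = np.array([[0.0], [0.1], [5.0]])   # voters
C = np.array([[0.0], [4.0]])          # candidates
first_choices = [np.argmin(np.linalg.norm(C - v, axis=1)) for v in V]
plurality = max(set(first_choices), key=first_choices.count)
print(distortion(V, C, plurality))    # 1.0 in this toy instance
```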
This list is automatically generated from the titles and abstracts of the papers in this site.