Extrapolation Towards Imaginary $0$-Nearest Neighbour and Its Improved
Convergence Rate
- URL: http://arxiv.org/abs/2002.03054v2
- Date: Wed, 11 Nov 2020 01:14:14 GMT
- Title: Extrapolation Towards Imaginary $0$-Nearest Neighbour and Its Improved
Convergence Rate
- Authors: Akifumi Okuno, Hidetoshi Shimodaira
- Abstract summary: We propose a novel multiscale $k$-NN (MS-$k$-NN) that extrapolates unweighted $k$-NN estimators from several $k \ge 1$ values to $k=0$.
Our method implicitly computes optimal real-valued weights that are adaptive to the query and its neighbour points.
- Score: 13.985534521589257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: $k$-nearest neighbour ($k$-NN) is one of the simplest and most widely used
methods for supervised classification; it predicts a query's label by taking a
weighted ratio of the observed labels of the $k$ objects nearest to the query. The
weights and the parameter $k \in \mathbb{N}$ regulate its bias-variance
trade-off, and the trade-off implicitly affects the convergence rate of the
excess risk for the $k$-NN classifier; several existing studies considered
selecting the optimal $k$ and weights to obtain a faster convergence rate. Whereas
$k$-NN with non-negative weights has been developed widely, it has also been proved
that negative weights are essential for eradicating the bias terms and
attaining the optimal convergence rate. In this paper, we propose a novel
multiscale $k$-NN (MS-$k$-NN) that extrapolates unweighted $k$-NN estimators
from several $k \ge 1$ values to $k=0$, thus giving an imaginary 0-NN
estimator. Our method implicitly computes optimal real-valued weights that are
adaptive to the query and its neighbour points. We theoretically prove that
MS-$k$-NN attains the improved rate, which coincides with the existing optimal
rate under certain conditions.
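For intuition, here is a minimal sketch of the extrapolation idea: compute unweighted $k$-NN estimates for several values of $k$, regress them on a simple bias proxy in $k$ (here $(k/n)^{2/d}$, an illustrative choice under smoothness assumptions; the paper's exact regressors and weighting are not reproduced), and take the fitted intercept as the imaginary $0$-NN prediction. The function name and hyper-parameters below are hypothetical, not taken from the authors' code.

```python
import numpy as np

def ms_knn_predict(X, y, x_query, ks=(5, 10, 20, 40), degree=2):
    """Sketch of multiscale k-NN: extrapolate unweighted k-NN estimates to k=0.

    For each k in `ks`, compute the unweighted k-NN estimate of P(Y=1 | x_query),
    fit a low-degree polynomial in the bias proxy t_k = (k/n)^(2/d) by least
    squares, and return its intercept, i.e. the value extrapolated to k -> 0.
    """
    n, d = X.shape
    order = np.argsort(np.linalg.norm(X - x_query, axis=1))  # neighbours by distance
    t = np.array([(k / n) ** (2.0 / d) for k in ks])         # bias proxy, grows with k
    p_hat = np.array([y[order[:k]].mean() for k in ks])      # unweighted k-NN estimates
    T = np.vander(t, N=degree + 1, increasing=True)          # columns [1, t, t^2, ...]
    coef, *_ = np.linalg.lstsq(T, p_hat, rcond=None)         # least-squares fit
    return coef[0]                                           # intercept = imaginary 0-NN estimate

# Toy usage with binary labels in {0, 1}; threshold the output at 0.5 to classify.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + 0.3 * rng.normal(size=500) > 0).astype(float)
print(ms_knn_predict(X, y, np.zeros(2)))
```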
Related papers
- Convergence Rate Analysis of LION [54.28350823319057]
LION converges within $K$ iterations at a rate of $\mathcal{O}(\sqrt{d}\,K^{-1/4})$, measured by a gradient Karush-Kuhn-Tucker (KKT) stationarity criterion.
We show that LION can achieve lower loss and higher performance compared to standard SGD.
arXiv Detail & Related papers (2024-11-12T11:30:53Z) - Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z) - Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation.
We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z) - Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method that avoids estimating the importance weights.
The basic idea is to work directly on the unlabeled target data, labeling each target point according to its $k$-nearest neighbors in the source dataset.
Our experiments show that it achieves a drastic reduction in running time while retaining remarkable accuracy.
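A minimal sketch of this idea, assuming a plain $k$-NN vote for pseudo-labelling the target sample and an arbitrary downstream learner (both illustrative choices; the paper's exact procedure and tuning are not reproduced here):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

def nn_covariate_shift_fit(X_src, y_src, X_tgt, k=5):
    # Pseudo-label each unlabelled target point by a k-NN vote in the labelled
    # source data, then fit the final model on the pseudo-labelled target sample.
    # No importance weights (density ratios) are estimated at any point.
    pseudo_labels = KNeighborsClassifier(n_neighbors=k).fit(X_src, y_src).predict(X_tgt)
    return LogisticRegression().fit(X_tgt, pseudo_labels)
```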
arXiv Detail & Related papers (2023-12-15T17:28:09Z) - Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors [13.231906521852718]
We consider a distributed learning scenario in which a massive dataset is split into smaller groups.
We propose optimal rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation.
We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor.
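A simplified sketch of the distributed setting: the data are split into groups, each group returns an unweighted fixed-$k$-NN estimate for the query, and the per-group estimates are combined. Plain averaging is used below, whereas the paper derives refined, provably optimal aggregation rules.

```python
import numpy as np

def distributed_fixed_knn(X, y, x_query, n_groups=50, k=3, seed=0):
    # Split the data into groups, compute a local unweighted k-NN estimate of
    # P(Y=1 | x_query) in each group, and aggregate by simple averaging.
    rng = np.random.default_rng(seed)
    groups = np.array_split(rng.permutation(len(X)), n_groups)
    votes = []
    for g in groups:
        d = np.linalg.norm(X[g] - x_query, axis=1)
        votes.append(y[g[np.argsort(d)[:k]]].mean())  # local fixed-k estimate
    return float(np.mean(votes) > 0.5)                # aggregated class prediction
```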
arXiv Detail & Related papers (2022-02-05T01:59:09Z) - Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called under-bagging $k$-NN for imbalanced classification problems.
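As a rough illustration of the general under-bagging idea only (not the paper's exact construction or its theoretically tuned number of bags and choice of $k$): down-sample the majority class into several balanced bags, fit an unweighted $k$-NN on each, and average the bags' probability estimates.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def under_bagging_knn_predict(X, y, X_query, n_bags=20, k=5, seed=0):
    # y is assumed to be a binary 0/1 integer label vector (illustrative assumption).
    rng = np.random.default_rng(seed)
    minority = int(np.bincount(y).argmin())
    idx_min = np.flatnonzero(y == minority)
    idx_maj = np.flatnonzero(y != minority)
    probs = np.zeros(len(X_query))
    for _ in range(n_bags):
        # Balanced bag: all minority points plus an equal-sized majority subsample.
        bag = np.concatenate([idx_min, rng.choice(idx_maj, size=len(idx_min), replace=False)])
        clf = KNeighborsClassifier(n_neighbors=k).fit(X[bag], y[bag])
        probs += clf.predict_proba(X_query)[:, 1]     # P(Y = 1) from this bag
    return (probs / n_bags > 0.5).astype(int)         # averaged vote
```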
arXiv Detail & Related papers (2021-09-01T14:10:38Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study stochastic optimization of the area under the precision-recall curve (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with an improved iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed
Rewards [24.983866845065926]
We consider multi-armed bandits with heavy-tailed rewards, whose $p$-th moment is bounded by a constant $\nu_p$ for $1 < p \leq 2$.
We propose a novel robust estimator that does not require $\nu_p$ as prior information.
We show that the error probability of the proposed estimator decays exponentially fast.
arXiv Detail & Related papers (2020-10-24T10:44:02Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result giving optimal minimax guarantees for the excess risk of adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)