Benefit of Interpolation in Nearest Neighbor Algorithms
- URL: http://arxiv.org/abs/2202.11817v1
- Date: Wed, 23 Feb 2022 22:47:18 GMT
- Title: Benefit of Interpolation in Nearest Neighbor Algorithms
- Authors: Yue Xing, Qifan Song, Guang Cheng
- Abstract summary: In some studies, it is observed that over-parametrized deep neural networks achieve a small testing error even when the training error is almost zero.
We turn to another way of enforcing zero training error (without over-parametrization) through a data interpolation mechanism.
- Score: 21.79888306754263
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In some studies \citep[e.g.,][]{zhang2016understanding} of deep learning, it
is observed that over-parametrized deep neural networks achieve a small testing
error even when the training error is almost zero. Despite numerous works
towards understanding this so-called "double descent" phenomenon
\citep[e.g.,][]{belkin2018reconciling,belkin2019two}, in this paper, we turn
to another way of enforcing zero training error (without over-parametrization)
through a data interpolation mechanism. Specifically, we consider a class of
interpolated weighting schemes in the nearest neighbors (NN) algorithms. By
carefully characterizing the multiplicative constant in the statistical risk,
we reveal a U-shaped performance curve for the level of data interpolation in
both classification and regression setups. This sharpens the existing result
\citep{belkin2018does} that zero training error does not necessarily jeopardize
predictive performance, and establishes the counter-intuitive result that a mild
degree of data interpolation actually {\em strictly} improves the prediction
performance and statistical stability over those of the (un-interpolated)
$k$-NN algorithm. Finally, we also discuss the universality of our results,
e.g., under a change of distance measure and under corrupted testing data.
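As a concrete illustration of the data interpolation mechanism described above, the sketch below implements a weighted nearest-neighbor regressor in Python whose weights are proportional to the inverse distance raised to a power gamma (the interpolation level; gamma = 0 recovers plain unweighted $k$-NN). The inverse-distance weighting, the tie-handling rule, and all names here are illustrative assumptions for this summary, not the paper's exact scheme.

```python
import numpy as np

def interpolated_knn_predict(X_train, y_train, x_query, k=10, gamma=1.0, eps=1e-12):
    """Weighted k-NN regression with interpolation level gamma.

    Weights are w_i proportional to 1 / dist_i**gamma over the k nearest
    neighbors: gamma = 0 gives plain k-NN averaging, while gamma > 0
    up-weights closer neighbors. If the query coincides with a training
    point, its label is returned exactly, so the fitted function
    interpolates the training data (zero training error).
    NOTE: an illustrative sketch, not the paper's exact weighting scheme.
    """
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn_idx = np.argsort(dists)[:k]          # indices of the k nearest neighbors
    nn_dists = dists[nn_idx]

    # Exact match with a training point: return its label (interpolation).
    if nn_dists[0] < eps:
        return float(y_train[nn_idx[0]])

    weights = nn_dists ** (-gamma)          # inverse-distance weights
    weights /= weights.sum()
    return float(np.dot(weights, y_train[nn_idx]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(500, 2))
    y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=500)

    # A training point is reproduced exactly (zero training error) ...
    print(interpolated_knn_predict(X, y, X[0], k=10, gamma=2.0), y[0])
    # ... while a new query is a smoothly weighted local average.
    print(interpolated_knn_predict(X, y, np.array([0.2, -0.3]), k=10, gamma=2.0))
```

Under this kind of weighting the estimator interpolates the training labels, and the paper's risk analysis indicates that a mild interpolation level can strictly improve on unweighted $k$-NN, while too much interpolation degrades performance again (the U-shaped curve).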
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent [57.64826450787237]
We show how to analyze the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions.
We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm.
Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
arXiv Detail & Related papers (2024-07-19T08:29:12Z)
- Deep learning from strongly mixing observations: Sparse-penalized regularization and minimax optimality [0.0]
We consider sparse-penalized regularization for a deep neural network predictor.
We deal with the squared loss as well as a broad class of other loss functions.
arXiv Detail & Related papers (2024-06-12T15:21:51Z)
- A new approach to generalisation error of machine learning algorithms: Estimates and convergence [0.0]
We introduce a new approach to the estimation of the (generalisation) error and to convergence.
Our results include estimates of the error without any structural assumption on the neural networks.
arXiv Detail & Related papers (2023-06-23T20:57:31Z)
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers [59.06169363181417]
Predict then Interpolate (PI) is an algorithm for learning correlations that are stable across environments.
We prove that by interpolating the distributions of the correct predictions and the wrong predictions, we can uncover an oracle distribution where the unstable correlation vanishes.
arXiv Detail & Related papers (2021-05-26T15:37:48Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.