A PAC-Bayesian Analysis of Distance-Based Classifiers: Why
Nearest-Neighbour works!
- URL: http://arxiv.org/abs/2109.13889v1
- Date: Tue, 28 Sep 2021 17:35:57 GMT
- Title: A PAC-Bayesian Analysis of Distance-Based Classifiers: Why
Nearest-Neighbour works!
- Authors: Thore Graepel and Ralf Herbrich
- Abstract summary: We present PAC-Bayesian bounds for the generalisation error of the K-nearest-neighbour classifier (K-NN).
We establish a relation between prior measures over the coefficients in the kernel expansion and the induced measure on the weight vectors in kernel space.
- Score: 12.317405551932195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present PAC-Bayesian bounds for the generalisation error of the
K-nearest-neighbour classifier (K-NN). This is achieved by casting the K-NN
classifier into a kernel space framework in the limit of vanishing kernel
bandwidth. We establish a relation between prior measures over the coefficients
in the kernel expansion and the induced measure on the weight vectors in kernel
space. Defining a sparse prior over the coefficients allows the application of
a PAC-Bayesian folk theorem that leads to a generalisation bound that is a
function of the number of redundant training examples: those that can be left
out without changing the solution. The presented bound requires quantifying a
prior belief in the sparseness of the solution and is evaluated after learning,
when the actual redundancy level is known. Even for small sample sizes (m ~ 100),
the bound gives non-trivial results when both the expected sparseness and the
actual redundancy are high.
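To make the role of redundancy concrete, below is a minimal Python sketch (not taken from the paper): it greedily counts training examples that a 1-NN classifier can drop without changing the labels it assigns to the training set, and plugs that count into a generic sample-compression-style bound of the folk-theorem type. The function names, the greedy redundancy check, and the exact form of the bound are illustrative assumptions; the paper's PAC-Bayesian bound additionally encodes the prior belief in sparseness and has different constants.

```python
# Illustrative sketch only: a crude greedy redundancy count for 1-NN and a
# generic sample-compression-style bound used as a stand-in for the paper's
# PAC-Bayesian folk-theorem bound.
import numpy as np


def nn_predict(X_train, y_train, X_query):
    """1-NN prediction by brute-force Euclidean distance."""
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[d2.argmin(axis=1)]


def count_redundant(X, y):
    """Greedy proxy for redundancy: drop example i if the 1-NN rule on the
    remaining examples still assigns every original training point its own
    label, i.e. the solution on the training set is unchanged."""
    m = len(y)
    keep = np.ones(m, dtype=bool)
    for i in range(m):
        keep[i] = False
        if not np.array_equal(nn_predict(X[keep], y[keep], X), y):
            keep[i] = True  # removing i changes the solution, so keep it
    return m - keep.sum()   # number of redundant examples


def folk_theorem_bound(m, n_redundant, delta=0.05):
    """Generic compression-style bound: error is controlled by the number of
    essential examples d = m - n_redundant.  Stand-in formula only."""
    d = m - n_redundant
    if n_redundant == 0:
        return 1.0  # trivial bound when nothing can be left out
    return min(1.0, (d * np.log(m) + np.log(1.0 / delta)) / n_redundant)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = 100  # small sample size, as in the abstract
    X = np.vstack([rng.normal(-2, 1, (m // 2, 2)),
                   rng.normal(+2, 1, (m // 2, 2))])
    y = np.array([0] * (m // 2) + [1] * (m // 2))
    r = count_redundant(X, y)
    print(f"redundant examples: {r}/{m}")
    print(f"compression-style bound: {folk_theorem_bound(m, r):.3f}")
```

Consistent with the abstract, the stand-in bound is only informative when the counted redundancy is high, i.e. when most of the m examples can be left out without changing the solution.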
Related papers
- Effective regions and kernels in continuous sparse regularisation, with application to sketched mixtures [12.242935230563834]
This paper advances the theory of continuous sparse regularisation on measures with the Beurling-LASSO (BLASSO). We show that the BLASSO localisation error around the true support decreases with the noise level, leading to effective near regions.
arXiv Detail & Related papers (2025-07-11T09:35:37Z) - Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem [7.443139252028032]
We consider average-reward Reinforcement Learning with unbounded state space and reward function.
Recent works studied this problem in the actor-critic framework.
We study Temporal Difference (TD) learning with linear function approximation.
arXiv Detail & Related papers (2024-10-29T03:40:53Z) - Understanding the Generalization Ability of Deep Learning Algorithms: A
Kernelized Renyi's Entropy Perspective [11.255943520955764]
We propose a novel information-theoretic measure: kernelized Renyi's entropy.
We establish the generalization error bounds for gradient/Langevin descent (SGD/SGLD) learning algorithms under kernelized Renyi's entropy.
We show that our information-theoretical bounds depend on the statistics of the gradients, and are rigorously tighter than the current state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-05-02T01:17:15Z) - A New Family of Generalization Bounds Using Samplewise Evaluated CMI [14.147617330278662]
We present a new family of information-theoretic generalization bounds, in which the training loss and the population loss are compared through a jointly convex function.
We demonstrate the generality of this framework by extending previously known information-theoretic bounds.
In some scenarios, this novel bound results in a tighter characterization of the population loss of deep neural networks than previous bounds.
arXiv Detail & Related papers (2022-10-12T17:15:44Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends this framework to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Oversampling Divide-and-conquer for Response-skewed Kernel Ridge
Regression [20.00435452480056]
We develop a novel response-adaptive partition strategy to overcome the limitation of the divide-and-conquer method.
We show the proposed estimate has a smaller asymptotic mean squared error (AMSE) than that of the classical dacKRR estimate under mild conditions.
arXiv Detail & Related papers (2021-07-13T04:01:04Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \rightarrow \infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z) - Robust subgaussian estimation with VC-dimension [0.0]
This work proposes a new general way to bound the excess risk for MOM estimators.
The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity.
arXiv Detail & Related papers (2020-04-24T13:21:09Z) - Kernel-Based Reinforcement Learning: A Finite-Time Analysis [53.47210316424326]
We introduce Kernel-UCBVI, a model-based optimistic algorithm that leverages the smoothness of the MDP and a non-parametric kernel estimator of the rewards.
We empirically validate our approach in continuous MDPs with sparse rewards.
arXiv Detail & Related papers (2020-04-12T12:23:46Z) - Best Arm Identification for Cascading Bandits in the Fixed Confidence
Setting [81.70513857417106]
We design and analyze CascadeBAI, an algorithm for finding the best set of $K$ items.
An upper bound on the time complexity of CascadeBAI is derived by overcoming a crucial analytical challenge.
We show, through the derivation of a lower bound on the time complexity, that the performance of CascadeBAI is optimal in some practical regimes.
arXiv Detail & Related papers (2020-01-23T16:47:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.