Interpolation and Learning with Scale Dependent Kernels
- URL: http://arxiv.org/abs/2006.09984v3
- Date: Wed, 10 Nov 2021 10:48:53 GMT
- Title: Interpolation and Learning with Scale Dependent Kernels
- Authors: Nicolò Pagliana, Alessandro Rudi, Ernesto De Vito, Lorenzo Rosasco
- Abstract summary: We study the learning properties of nonparametric ridge-less least squares.
We consider the common case of estimators defined by scale-dependent kernels.
- Score: 91.41836461193488
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the learning properties of nonparametric ridge-less least squares.
In particular, we consider the common case of estimators defined by
scale-dependent kernels, and focus on the role of the scale. These
estimators interpolate the data, and the scale can be shown to control
their stability through the condition number. Our analysis shows that
there are different regimes depending on the interplay between the sample
size, the data dimension, and the smoothness of the problem. Indeed, when
the sample size is less than exponential in the data dimension, the scale
can be chosen so that the learning error decreases. As the sample size
becomes larger, the overall error stops decreasing, but interestingly the
scale can be chosen in such a way that the variance due to noise remains
bounded. Our analysis combines probabilistic results with a number of
analytic techniques from interpolation theory.
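A minimal sketch of the idea, assuming a Gaussian kernel; the function
names and data below are illustrative, not from the paper. It fits the
ridgeless (minimum-norm) interpolant f(x) = k(x, X) K^+ y and reports how
the condition number of the Gram matrix K, which the abstract points to as
the stability measure, changes with the scale sigma:

    import numpy as np

    def gaussian_gram(X, Z, sigma):
        # Pairwise squared distances, then the Gaussian kernel at scale sigma.
        d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    def ridgeless_fit(X, y, sigma):
        # Minimum-norm interpolant: coefficients alpha with f(x) = k(x, X) @ alpha.
        K = gaussian_gram(X, X, sigma)
        alpha = np.linalg.pinv(K) @ y      # pseudo-inverse handles near-singular K
        return alpha, np.linalg.cond(K)    # condition number tracks stability

    rng = np.random.default_rng(0)
    n, d = 50, 3
    X = rng.standard_normal((n, d))
    y = np.sin(X.sum(axis=1)) + 0.1 * rng.standard_normal(n)

    for sigma in (0.1, 1.0, 10.0):
        alpha, cond = ridgeless_fit(X, y, sigma)
        resid = gaussian_gram(X, X, sigma) @ alpha - y
        print(f"sigma={sigma:5.1f}  cond(K)={cond:.2e}  "
              f"max train residual={np.abs(resid).max():.2e}")

For very small sigma the Gram matrix is close to the identity (well
conditioned, but the estimator essentially memorizes the points), while for
very large sigma the rows of K become nearly identical and K nearly
singular; the scale mediates this stability trade-off.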
Related papers
- Learning causal graphs using variable grouping according to ancestral relationship [7.126300090990439]
When the sample size is small relative to the number of variables, the accuracy of estimating causal graphs using existing methods decreases.
Some methods are not feasible when the sample size is smaller than the number of variables.
To circumvent these problems, some researchers proposed causal structure learning algorithms using divide-and-conquer approaches.
arXiv Detail & Related papers (2024-03-21T04:42:04Z)
- Estimation of mutual information via quantum kernel method [0.0]
Estimating mutual information (MI) plays a critical role in investigating the relationship among multiple random variables with nonlinear correlations.
We propose a method for estimating mutual information using the quantum kernel.
arXiv Detail & Related papers (2023-10-19T00:53:16Z)
- Gradient-Based Feature Learning under Structured Data [57.76552698981579]
In the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction.
We show that appropriate weight normalization that is reminiscent of batch normalization can alleviate this issue.
In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent.
arXiv Detail & Related papers (2023-09-07T16:55:50Z)
- Benign Overfitting in Time Series Linear Model with Over-Parameterization [5.68558935178946]
We develop a theory for excess risk of the estimator under multiple dependence types.
We show that the convergence rate of risks with short-memory processes is identical to that of cases with independent data.
arXiv Detail & Related papers (2022-04-18T15:26:58Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with a simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Multinomial Sampling for Hierarchical Change-Point Detection [0.0]
We propose a multinomial sampling methodology that improves the detection rate and reduces the delay.
Our experiments show results that outperform the baseline method, and we also provide an example oriented toward a human behavior study.
arXiv Detail & Related papers (2020-07-24T09:18:17Z)
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)
- Compressing Large Sample Data for Discriminant Analysis [78.12073412066698]
We consider the computational issues due to large sample size within the discriminant analysis framework.
We propose a new compression approach for reducing the number of training samples for linear and quadratic discriminant analysis.
arXiv Detail & Related papers (2020-05-08T05:09:08Z)