Random feature approximation for general spectral methods
- URL: http://arxiv.org/abs/2308.15434v1
- Date: Tue, 29 Aug 2023 16:56:03 GMT
- Title: Random feature approximation for general spectral methods
- Authors: Mike Nguyen and Nicole Mücke
- Abstract summary: We analyze generalization properties for a large class of spectral regularization methods combined with random features.
For our estimators we obtain optimal learning rates over general regularity classes defined through appropriate source conditions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random feature approximation is arguably one of the most popular techniques
to speed up kernel methods in large scale algorithms and provides a theoretical
approach to the analysis of deep neural networks. We analyze generalization
properties for a large class of spectral regularization methods combined with
random features, containing kernel methods with implicit regularization such as
gradient descent or explicit methods like Tikhonov regularization. For our
estimators we obtain optimal learning rates over regularity classes (even for
classes that are not included in the reproducing kernel Hilbert space), which
are defined through appropriate source conditions. This improves or completes
previous results obtained in related settings for specific kernel algorithms.
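As a minimal illustration of this setting (not the authors' exact construction), the sketch below pairs random Fourier features for a Gaussian kernel with Tikhonov (ridge) regularization, one of the explicit spectral methods named above. The helper names (`make_rff_map`, `tikhonov_fit`), the toy data, and the regularization value are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def make_rff_map(dim, n_features=300, lengthscale=1.0, seed=0):
    """Random Fourier features approximating a Gaussian kernel:
    k(x, y) ~= phi(x) @ phi(y), with frequencies drawn from the kernel's
    spectral measure (Rahimi-Recht construction)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / lengthscale, size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return lambda X: np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def tikhonov_fit(Phi, y, reg=1e-3):
    """Ridge (Tikhonov) regression in random-feature space, one explicit
    member of the spectral-regularization family discussed above."""
    m = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + reg * len(y) * np.eye(m), Phi.T @ y)

# Toy usage: noisy samples of a smooth target function.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=500)

phi = make_rff_map(dim=1, n_features=300, seed=1)
w = tikhonov_fit(phi(X), y, reg=1e-3)

X_test = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
pred = phi(X_test) @ w  # approximates kernel ridge regression predictions
```

Swapping the closed-form solve for a fixed number of gradient-descent steps on the same least-squares objective gives the implicitly regularized variant mentioned in the abstract.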
Related papers
- Kernel Descent -- a Novel Optimizer for Variational Quantum Algorithms [0.0]
We introduce kernel descent, a novel algorithm for minimizing the functions underlying variational quantum algorithms.
In particular, we showcase scenarios in which kernel descent outperforms gradient descent and quantum analytic descent.
Kernel descent sets itself apart through its use of reproducing kernel Hilbert space techniques in constructing the local approximations.
arXiv Detail & Related papers (2024-09-16T13:10:26Z) - Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay [13.803850290216257]
We rigorously provide a full characterization of the generalization error curves of the kernel gradient descent method.
Thanks to neural tangent kernel theory, these results greatly improve our understanding of the generalization behavior of training wide neural networks.
arXiv Detail & Related papers (2024-01-03T08:00:50Z) - On the Sublinear Regret of GP-UCB [58.25014663727544]
We show that the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm enjoys nearly optimal regret rates.
Our improvements rely on a key technical contribution -- regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel.
arXiv Detail & Related papers (2023-07-14T13:56:11Z) - The Galerkin method beats Graph-Based Approaches for Spectral Algorithms [3.5897534810405403]
- The Galerkin method beats Graph-Based Approaches for Spectral Algorithms [3.5897534810405403]
Breaking with common practice in the machine learning community, we prove the statistical and computational superiority of the Galerkin method over graph-based approaches.
We introduce implementation tricks to deal with differential operators in large dimensions with structured kernels.
We extend the core principles of our approach to non-linear spaces of functions, such as those parameterized by deep neural networks.
arXiv Detail & Related papers (2023-06-01T14:38:54Z) - Approximation by non-symmetric networks for cross-domain learning [0.0]
We study the approximation capabilities of kernel based networks using non-symmetric kernels.
We obtain estimates on the accuracy of uniform approximation of functions in a Sobolev class by ReLU$^r$ networks when $r$ is not necessarily an integer.
arXiv Detail & Related papers (2023-05-06T01:33:26Z) - Hybrid Random Features [60.116392415715275]
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs).
HRFs automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest.
arXiv Detail & Related papers (2021-10-08T20:22:59Z) - Spectrum Gaussian Processes Based On Tunable Basis Functions [15.088239458693003]
We introduce a novel basis function, which is tunable, local and bounded, to approximate the kernel function in the Gaussian process.
We conduct extensive experiments on open-source datasets to verify its performance.
arXiv Detail & Related papers (2021-07-14T03:51:24Z) - Scalable Variational Gaussian Processes via Harmonic Kernel
Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Disentangling the Gauss-Newton Method and Approximate Inference for
- Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
arXiv Detail & Related papers (2020-07-21T17:42:58Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)