Related papers: Random feature approximation for general spectral methods

Random feature approximation for general spectral methods

URL: http://arxiv.org/abs/2308.15434v1
Date: Tue, 29 Aug 2023 16:56:03 GMT
Title: Random feature approximation for general spectral methods
Authors: Mike Nguyen and Nicole M\"ucke
Abstract summary: We analyze generalization properties for a large class of spectral regularization methods combined with random features. For our estimators we obtain optimal learning rates over gradient regularity classes.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Random feature approximation is arguably one of the most popular techniques to speed up kernel methods in large scale algorithms and provides a theoretical approach to the analysis of deep neural networks. We analyze generalization properties for a large class of spectral regularization methods combined with random features, containing kernel methods with implicit regularization such as gradient descent or explicit methods like Tikhonov regularization. For our estimators we obtain optimal learning rates over regularity classes (even for classes that are not included in the reproducing kernel Hilbert space), which are defined through appropriate source conditions. This improves or completes previous results obtained in related settings for specific kernel algorithms.

Related papers

Random feature approximation for general spectral methods [2.9388890036358104]
This work extends previous results for Tikhonov regularization to a broad class of spectral regularization techniques.<n>We enable a theoretical analysis of neural networks and neural operators through the lens of the Neural Tangent Kernel (NTK) approach.
arXiv Detail & Related papers (2025-06-19T13:00:17Z)
Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel [55.82768375605861]
We establish a generalization bound for gradient flow that aligns with the classical Rademacher complexity for kernel methods.<n>Unlike static kernels such as NTK, the LPK captures the entire training trajectory, adapting to both data and optimization dynamics.
arXiv Detail & Related papers (2025-06-12T23:17:09Z)
Kernel Descent -- a Novel Optimizer for Variational Quantum Algorithms [0.0]
We introduce kernel descent, a novel algorithm for minimizing the functions underlying variational quantum algorithms. In particular, we showcase scenarios in which kernel descent outperforms gradient descent and quantum analytic descent. Kernel descent sets itself apart with its employment of reproducing kernel Hilbert space techniques in the construction of the local approximations.
arXiv Detail & Related papers (2024-09-16T13:10:26Z)
Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay [13.803850290216257]
We rigorously provide a full characterization of the generalization error curves of the kernel gradient descent method. Thanks to the neural tangent kernel theory, these results greatly improve our understanding of the generalization behavior of training the wide neural networks.
arXiv Detail & Related papers (2024-01-03T08:00:50Z)
On the Sublinear Regret of GP-UCB [58.25014663727544]
We show that the Gaussian Process Upper Confidence Bound (GP-UCB) algorithm enjoys nearly optimal regret rates. Our improvements rely on a key technical contribution -- regularizing kernel ridge estimators in proportion to the smoothness of the underlying kernel.
arXiv Detail & Related papers (2023-07-14T13:56:11Z)
The Galerkin method beats Graph-Based Approaches for Spectral Algorithms [3.5897534810405403]
We break with the machine learning community and prove the statistical and computational superiority of the Galerkin method. We introduce implementation tricks to deal with differential operators in large dimensions with structured kernels. We extend on the core principles beyond our approach to apply them to non-linear spaces of functions, such as the ones parameterized by deep neural networks.
arXiv Detail & Related papers (2023-06-01T14:38:54Z)
Approximation by non-symmetric networks for cross-domain learning [0.0]
We study the approximation capabilities of kernel based networks using non-symmetric kernels. We obtain estimates on the accuracy of uniform approximation of functions in a Sobolev class by ReLU$r$ networks when $r$ is not necessarily an integer.
arXiv Detail & Related papers (2023-05-06T01:33:26Z)
Hybrid Random Features [60.116392415715275]
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) HRFs automatically adapt the quality of kernel estimation to provide most accurate approximation in the defined regions of interest.
arXiv Detail & Related papers (2021-10-08T20:22:59Z)
Spectrum Gaussian Processes Based On Tunable Basis Functions [15.088239458693003]
We introduce a novel basis function, which is tunable, local and bounded, to approximate the kernel function in the Gaussian process. We conduct extensive experiments on open-source datasets to testify its performance.
arXiv Detail & Related papers (2021-07-14T03:51:24Z)
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability. We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections. Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning. We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly. The connection to Gaussian processes enables new function-space inference algorithms.
arXiv Detail & Related papers (2020-07-21T17:42:58Z)
Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems. We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms. Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.