Error Bounds for Learning with Vector-Valued Random Features
- URL: http://arxiv.org/abs/2305.17170v2
- Date: Wed, 15 Nov 2023 00:52:03 GMT
- Title: Error Bounds for Learning with Vector-Valued Random Features
- Authors: Samuel Lanthaler, Nicholas H. Nelsen
- Abstract summary: This paper provides a comprehensive error analysis of learning with vector-valued random features (RF).
The theory is developed for RF ridge regression in a fully general infinite-dimensional input-output setting.
- Score: 2.375038919274297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides a comprehensive error analysis of learning with
vector-valued random features (RF). The theory is developed for RF ridge
regression in a fully general infinite-dimensional input-output setting, but
nonetheless applies to and improves existing finite-dimensional analyses. In
contrast to comparable work in the literature, the approach proposed here
relies on a direct analysis of the underlying risk functional and completely
avoids the explicit RF ridge regression solution formula in terms of random
matrices. This removes the need for concentration results in random matrix
theory or their generalizations to random operators. The main results
established in this paper include strong consistency of vector-valued RF
estimators under model misspecification and minimax optimal convergence rates
in the well-specified setting. The parameter complexity (number of random
features) and sample complexity (number of labeled data) required to achieve
such rates are comparable with Monte Carlo intuition and free from logarithmic
factors.
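As a concrete illustration of the estimator the paper analyzes, here is a minimal finite-dimensional sketch of vector-valued RF ridge regression. The feature map, dimensions, data, and regularization strength are assumptions for the example only, not the paper's infinite-dimensional setting, and the explicit linear solve below is precisely the solution formula that the paper's analysis avoids.

```python
# Minimal finite-dimensional sketch of vector-valued random feature (RF)
# ridge regression. The Fourier-type feature map, dimensions, and the
# synthetic target are illustrative assumptions, not the paper's setting.
import numpy as np

rng = np.random.default_rng(0)

d_in = 5                  # input dimension (assumed); outputs are 3-dimensional
n, m = 200, 400           # labeled samples / random features (assumed)
lam = 1e-3                # ridge regularization strength (assumed)

# Random feature parameters are drawn once and then frozen.
W = rng.standard_normal((m, d_in))
b = rng.uniform(0.0, 2.0 * np.pi, size=m)

def features(X):
    """Random Fourier-type feature map; returns an array of shape (len(X), m)."""
    return np.sqrt(2.0 / m) * np.cos(X @ W.T + b)

def target(X):
    """Smooth vector-valued ground truth used to generate labels."""
    return np.stack([np.sin(X[:, 0]), np.cos(X[:, 1]), np.tanh(X[:, 2])], axis=1)

# Training data and RF ridge regression: one coefficient column per output.
X_train = rng.standard_normal((n, d_in))
Y_train = target(X_train)
Phi = features(X_train)                                   # (n, m)
C = np.linalg.solve(Phi.T @ Phi + n * lam * np.eye(m),    # (m, 3) coefficients
                    Phi.T @ Y_train)

# Evaluate the squared error on fresh test points.
X_test = rng.standard_normal((1000, d_in))
test_mse = np.mean((features(X_test) @ C - target(X_test)) ** 2)
print("test MSE:", test_mse)
```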
Related papers
- Dimension-free deterministic equivalents and scaling laws for random feature regression [11.607594737176973]
We show that the test error is well approximated by a closed-form expression that only depends on the feature map eigenvalues.
Notably, our approximation guarantee is non-asymptotic, multiplicative, and independent of the feature map dimension.
arXiv Detail & Related papers (2024-05-24T16:43:26Z)
- A general error analysis for randomized low-rank approximation with application to data assimilation [42.57210316104905]
We propose a framework for the analysis of the low-rank approximation error in Frobenius norm for centered and non-standard matrices.
Under minimal assumptions, we derive accurate bounds in expectation and probability.
Our bounds have clear interpretations that enable us to derive properties and motivate practical choices.
arXiv Detail & Related papers (2024-05-08T04:51:56Z)
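As context for the randomized low-rank approximation entry above, the following is a minimal sketch of a Gaussian range-finder scheme of the kind whose Frobenius-norm error such bounds control; the matrix, target rank, and oversampling are assumptions for illustration.

```python
# Minimal sketch of randomized low-rank approximation via a Gaussian
# range finder; the test matrix, rank, and oversampling are assumptions.
import numpy as np

rng = np.random.default_rng(1)

# Approximately rank-25 matrix with small additive noise (assumed).
A = rng.standard_normal((500, 25)) @ rng.standard_normal((25, 300))
A += 0.01 * rng.standard_normal(A.shape)

rank, oversample = 25, 10
Omega = rng.standard_normal((A.shape[1], rank + oversample))  # Gaussian sketch
Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis for the sampled range
A_low = Q @ (Q.T @ A)               # low-rank approximation Q Q^T A

rel_err = np.linalg.norm(A - A_low, "fro") / np.linalg.norm(A, "fro")
print("relative Frobenius-norm error:", rel_err)
```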
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Convex Parameter Estimation of Perturbed Multivariate Generalized Gaussian Distributions [18.95928707619676]
We propose a convex formulation with well-established properties for MGGD parameters.
The proposed framework is flexible as it combines a variety of regularizations for the precision matrix, the mean and perturbations.
Experiments show more accurate precision and covariance matrix estimation, with similar performance for the mean vector parameter.
arXiv Detail & Related papers (2023-12-12T18:08:04Z)
- Simplex Random Features [53.97976744884616]
We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels.
We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels.
We show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.
arXiv Detail & Related papers (2023-01-31T18:53:39Z)
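As a baseline for the Simplex Random Features entry above, the sketch below uses classical random Fourier features for unbiased Gaussian-kernel estimation; it does not reproduce the SimRF coupling of projection directions, and all dimensions are assumed.

```python
# Baseline sketch: unbiased Gaussian-kernel estimation with classical
# random Fourier features (RFFs). SimRFs modify how the projection
# directions are chosen; that construction is not reproduced here.
import numpy as np

rng = np.random.default_rng(2)
d, m = 8, 256                     # data dimension, number of features (assumed)

x = rng.standard_normal(d)
y = rng.standard_normal(d)
exact = np.exp(-0.5 * np.sum((x - y) ** 2))     # Gaussian kernel k(x, y)

W = rng.standard_normal((m, d))                 # i.i.d. Gaussian directions
def phi(z):
    return np.concatenate([np.cos(W @ z), np.sin(W @ z)]) / np.sqrt(m)

approx = phi(x) @ phi(y)                        # unbiased estimate of k(x, y)
print(f"exact {exact:.4f}  RFF estimate {approx:.4f}")
```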
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
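For the double descent entry above, here is a minimal sketch of random features regression trained with constant-step-size SGD; the ReLU feature map, data, and hyperparameters are illustrative assumptions, and no double descent curve is reproduced.

```python
# Minimal sketch of random features (RF) regression trained with
# constant-step-size SGD; data, feature map, and hyperparameters are assumed.
import numpy as np

rng = np.random.default_rng(3)
n, d, m = 300, 10, 600            # samples, input dim, random features (assumed)
step, epochs = 0.05, 50           # constant step size, SGD passes

X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)   # noisy scalar target

W = rng.standard_normal((m, d)) / np.sqrt(d)          # frozen random weights
Phi = np.sqrt(2.0 / m) * np.maximum(X @ W.T, 0.0)     # ReLU random features

theta = np.zeros(m)
for _ in range(epochs):
    for i in rng.permutation(n):                      # one SGD sweep per epoch
        resid = Phi[i] @ theta - y[i]
        theta -= step * resid * Phi[i]                # squared-loss gradient step

print("train MSE:", np.mean((Phi @ theta - y) ** 2))
```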
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)