Parameter Estimation for Generalized Low-Rank Matrix Sensing by Learning on Riemannian Manifolds
- URL: http://arxiv.org/abs/2407.10238v1
- Date: Sun, 14 Jul 2024 15:11:13 GMT
- Title: Parameter Estimation for Generalized Low-Rank Matrix Sensing by Learning on Riemannian Manifolds
- Authors: Osbert Bastani
- Abstract summary: We prove convergence guarantees for generalized low-rank matrix sensing.
We focus on local convergence of the optimal estimator, ignoring questions of optimization.
Our analysis relies on tools from Riemannian geometry to handle the rotational symmetry in the parameter space.
- Score: 37.53442095760427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove convergence guarantees for generalized low-rank matrix sensing -- i.e., matrix sensing where the observations may be passed through some nonlinear link function. We focus on local convergence of the optimal estimator, ignoring questions of optimization. In particular, assuming the minimizer of the empirical loss $\theta^0$ lies in a constant-size ball around the true parameters $\theta^*$, we prove that $d(\theta^0,\theta^*)=\tilde{O}(\sqrt{dk^2/n})$. Our analysis relies on tools from Riemannian geometry to handle the rotational symmetry in the parameter space.
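To make the setup concrete, the following is a minimal numerical sketch of this observation model and of the rotation-invariant parameter distance; the link function, the Gaussian sensing matrices, and the squared loss are illustrative assumptions rather than the paper's exact formulation.
```python
# Hypothetical sketch of generalized low-rank matrix sensing:
# observations y_i = g(<A_i, U* U*^T>) + noise, parameters U in R^{d x k},
# and a distance on parameters that quotients out rotational symmetry.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
d, k, n = 20, 3, 500

U_star = rng.normal(size=(d, k))                 # true factor, Theta* = U* U*^T
g = np.tanh                                      # illustrative nonlinear link function

A = rng.normal(size=(n, d, d))                   # sensing matrices A_i
y = g(np.einsum('nij,ij->n', A, U_star @ U_star.T)) + 0.01 * rng.normal(size=n)

def empirical_loss(U):
    """Squared loss between observations and g(<A_i, U U^T>)."""
    preds = g(np.einsum('nij,ij->n', A, U @ U.T))
    return np.mean((y - preds) ** 2)

def dist(U, V):
    """d(U, V) = min over orthogonal R of ||U - V R||_F (Procrustes alignment)."""
    R, _ = orthogonal_procrustes(V, U)
    return np.linalg.norm(U - V @ R)

# Any rotation of U* yields the same loss, so distances must be taken up to rotation.
Q = np.linalg.qr(rng.normal(size=(k, k)))[0]     # random orthogonal k x k matrix
print(empirical_loss(U_star @ Q), dist(U_star @ Q, U_star))  # ~noise level, ~0
```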
Related papers
- Convergence of Gradient Descent with Small Initialization for Unregularized Matrix Completion [21.846732043706318]
We show that vanilla gradient descent provably converges to the ground truth $\mathrm{X}^\star$ without requiring any explicit regularization.
Surprisingly, neither the convergence rate nor the final accuracy depends on the over-parameterized search rank $r'$; they are governed only by the true rank $r$.
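Below is a minimal sketch of this setting, assuming the symmetric objective $f(X) = \frac{1}{2}\|P_\Omega(XX^T - M^\star)\|_F^2$ with over-parameterized search rank $r' > r$; the step size and sampling pattern are illustrative.
```python
# Vanilla gradient descent from small random initialization on unregularized,
# over-parameterized matrix completion (illustrative parameters).
import numpy as np

rng = np.random.default_rng(1)
d, r, r_prime = 30, 2, 5                      # true rank r, search rank r' > r

U = rng.normal(size=(d, r)) / np.sqrt(d)
M_star = U @ U.T                              # low-rank ground truth
mask = rng.random((d, d)) < 0.5
mask = mask | mask.T                          # symmetric observation pattern Omega

X = 1e-3 * rng.normal(size=(d, r_prime))      # small random initialization
eta = 0.05
for _ in range(3000):
    E = mask * (X @ X.T - M_star)             # residual on observed entries
    X -= eta * (E + E.T) @ X                  # gradient step on f(X)
print(np.linalg.norm(mask * (X @ X.T - M_star)))  # residual should be near zero
```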
arXiv Detail & Related papers (2024-02-09T19:39:23Z)
- The Sample Complexity Of ERMs In Stochastic Convex Optimization [13.896417716930687]
We show that in fact $\tilde{O}(\frac{d}{\epsilon}+\frac{1}{\epsilon^2})$ data points are also sufficient.
We further generalize the result and show that a similar upper bound holds for all convex bodies.
arXiv Detail & Related papers (2023-11-09T14:29:25Z)
- How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization [46.55524654398093]
We show how over-parameterization changes the convergence behavior of gradient descent.
The goal is to recover an unknown low-rank ground-truth matrix from near-isotropic linear measurements.
We propose a novel method that modifies only one step of GD and obtains a convergence rate independent of $\alpha$.
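A minimal sketch of the matrix sensing setting with initialization scale $\alpha$ follows; it runs plain gradient descent and does not reproduce the paper's one-step modification.
```python
# Over-parameterized matrix sensing: recover a PSD low-rank ground truth from
# near-isotropic linear measurements via GD with small initialization alpha.
import numpy as np

rng = np.random.default_rng(2)
d, r, r_prime, n, alpha = 15, 2, 4, 600, 1e-3

U = rng.normal(size=(d, r)) / np.sqrt(d)
M_star = U @ U.T                              # PSD low-rank ground truth
A = rng.normal(size=(n, d, d)) / np.sqrt(n)   # near-isotropic Gaussian measurements
y = np.einsum('nij,ij->n', A, M_star)         # y_i = <A_i, M*>

X = alpha * rng.normal(size=(d, r_prime))     # initialization of scale alpha
eta = 0.1
for _ in range(3000):
    res = np.einsum('nij,ij->n', A, X @ X.T) - y
    G = np.einsum('n,nij->ij', res, A)        # gradient of 0.5*||A(XX^T)-y||^2 w.r.t. XX^T
    X -= eta * (G + G.T) @ X
print(np.linalg.norm(X @ X.T - M_star))       # error shrinks as GD fits the measurements
```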
arXiv Detail & Related papers (2023-10-03T03:34:22Z)
- Mirror Natural Evolution Strategies [10.495496415022064]
We focus on the theory of zeroth-order optimization, which approximates both first-order and second-order information using only zeroth-order queries.
We show that the estimated covariance matrix of \texttt{MiNES} converges to the inverse of the Hessian matrix of the objective function.
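For intuition, here is a minimal sketch of a generic zeroth-order update that recovers first-order information from function queries alone, via a Gaussian-smoothing gradient estimate; it is not the exact MiNES algorithm.
```python
# Generic NES-style zeroth-order optimization: estimate the gradient from
# function evaluations sampled around the current iterate (illustrative).
import numpy as np

rng = np.random.default_rng(3)

def f(x):
    """Toy quadratic objective with an ill-conditioned Hessian."""
    return 0.5 * x @ np.diag([1.0, 10.0]) @ x

x = np.array([3.0, -2.0])
sigma, eta, pop = 0.1, 0.05, 64
for _ in range(500):
    eps = rng.normal(size=(pop, 2))                    # zeroth-order query directions
    vals = np.array([f(x + sigma * e) for e in eps])   # function evaluations only
    grad_est = ((vals - f(x))[:, None] * eps).mean(0) / sigma  # smoothed gradient estimate
    x -= eta * grad_est
print(x)                                               # approaches the minimizer at 0
```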
arXiv Detail & Related papers (2023-08-01T11:45:24Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
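For intuition, the box-counting construction behind Minkowski dimension can be sketched numerically; the data here, a 1-D curve embedded in $\mathbb{R}^3$, and the chosen scales are illustrative.
```python
# Box-counting sketch: count occupied cells N(eps) at scale eps; the slope of
# log N(eps) versus log(1/eps) estimates the intrinsic dimension of the set.
import numpy as np

rng = np.random.default_rng(7)
t = rng.random(20000)
S = np.stack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), t], 1)  # 1-D curve in R^3

for eps in [0.1, 0.05, 0.025]:
    cells = np.unique(np.floor(S / eps), axis=0)       # occupied eps-cells
    print(eps, len(cells))        # N(eps) roughly doubles per halving: dimension ~1, not 3
```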
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- Random matrices in service of ML footprint: ternary random features with no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication, and storage takes a factor of $b$ fewer bits compared to classical random features.
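A minimal sketch of a ternary random feature map follows; the sparsity level and the cosine nonlinearity are illustrative assumptions, not necessarily the paper's exact choices.
```python
# Ternary random features: projection entries lie in {-1, 0, +1}, so applying
# the projection needs no floating-point multiplications, and each entry can
# be stored in two bits (illustrative sparsity and nonlinearity).
import numpy as np

rng = np.random.default_rng(4)
d, D = 50, 1024                               # input dim, number of random features

W = rng.choice([-1, 0, 1], size=(D, d), p=[0.25, 0.5, 0.25])  # ternary weights

def trf(x):
    """Nonlinearity applied to a multiplication-free random projection."""
    return np.cos(W @ x)                      # W @ x reduces to additions/subtractions

x = rng.normal(size=d)
phi = trf(x)                                  # feature vector; <phi, phi'>/D approximates a kernel K
```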
arXiv Detail & Related papers (2021-10-05T09:33:49Z)
- Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\operatorname{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$.
Here, we bound $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$, where $\tilde{R}(z)$ denotes a deterministic equivalent of the resolvent.
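A minimal numerical sketch of the objects involved, the resolvent $R(z)$ and the functional $\operatorname{tr}(AR(z))$, with illustrative i.i.d. Gaussian columns:
```python
# Resolvent R(z) = (X X^T / n - z I_p)^{-1} of a sample covariance matrix and
# the linear statistic tr(A R(z)) (illustrative column distribution).
import numpy as np

rng = np.random.default_rng(5)
p, n, z = 40, 200, -1.0                       # z chosen off the spectrum

X = rng.normal(size=(p, n))                   # independent (here i.i.d.) columns
R = np.linalg.inv(X @ X.T / n - z * np.eye(p))   # resolvent R(z)
A = rng.normal(size=(p, p))                   # deterministic test matrix in M_p
print(np.trace(A @ R))                        # the functional tr(A R(z))
```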
arXiv Detail & Related papers (2021-09-06T14:21:43Z)
- Optimal Measurement of Field Properties with Quantum Sensor Networks [0.0]
We consider a quantum sensor network of qubit sensors coupled to a field $f(\vec{x};\vec{\theta})$ analytically parameterized by the vector of parameters $\vec{\theta}$.
We derive saturable bounds on the precision of measuring an arbitrary analytic function $q(\vec{\theta})$ of these parameters.
arXiv Detail & Related papers (2020-11-02T19:02:28Z)
- Support recovery and sup-norm convergence rates for sparse pivotal estimation [79.13844065776928]
In high-dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multi-task square-root Lasso-type estimators.
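A minimal sketch of the (non-smoothed, single-task) square-root Lasso, whose objective rescales with the noise level so the regularization parameter can be chosen independently of it; the solver and constants are illustrative.
```python
# Square-root Lasso: minimize ||y - X b||_2 / sqrt(n) + lam * ||b||_1.
# Because the data-fit term scales with the noise, lam is pivotal
# (noise-level independent). Solved here with cvxpy for illustration.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
n, d, sigma = 100, 20, 0.5
X = rng.normal(size=(n, d))
beta_star = np.zeros(d)
beta_star[:3] = 1.0                                  # sparse ground truth
y = X @ beta_star + sigma * rng.normal(size=n)

b = cp.Variable(d)
lam = 1.5 * np.sqrt(np.log(d) / n)                   # choice independent of sigma
obj = cp.norm(y - X @ b, 2) / np.sqrt(n) + lam * cp.norm1(b)
cp.Problem(cp.Minimize(obj)).solve()
print(np.round(b.value, 2))                          # estimate concentrates on the true support
```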
arXiv Detail & Related papers (2020-01-15T16:11:04Z)