Gradient flow on extensive-rank positive semi-definite matrix denoising
- URL: http://arxiv.org/abs/2303.09474v1
- Date: Thu, 16 Mar 2023 16:50:46 GMT
- Title: Gradient flow on extensive-rank positive semi-definite matrix denoising
- Authors: Antoine Bodin and Nicolas Macris
- Abstract summary: We present a new approach to analyze the gradient flow for a positive semi-definite matrix denoising problem in an extensive-rank and high-dimensional regime.
We derive fixed point equations which track the complete time evolution of the matrix-mean-square-error of the problem.
The predictions of the resulting fixed point equations are validated by numerical experiments.
- Score: 15.720219436063797
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present a new approach to analyze the gradient flow for a
positive semi-definite matrix denoising problem in an extensive-rank and
high-dimensional regime. We use recent linear pencil techniques of random
matrix theory to derive fixed point equations which track the complete time
evolution of the matrix-mean-square-error of the problem. The predictions of
the resulting fixed point equations are validated by numerical experiments. In
this short note we briefly illustrate a few predictions of our formalism by way
of examples, and in particular we uncover continuous phase transitions in the
extensive-rank and high-dimensional regime, which connect to the classical
phase transitions of the low-rank problem in the appropriate limit. The
formalism has much wider applicability than shown in this communication.
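As a rough numerical illustration of the setting (not the authors' linear-pencil formalism), here is a minimal sketch: it denoises a noisy extensive-rank PSD matrix by discretized gradient flow on a factorized estimate X = W W^T and prints the matrix-MSE along the trajectory. The loss, noise scaling, rank ratio, step size and initialization are illustrative assumptions rather than the paper's exact setup.
```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 200, 80                                 # dimension and rank (extensive regime: r/n fixed)
snr = 2.0                                      # assumed signal-to-noise ratio

# Ground-truth extensive-rank PSD signal and a symmetric noisy observation
U = rng.normal(size=(n, r))
S = U @ U.T / n
Z = rng.normal(size=(n, n))
Y = S + (Z + Z.T) / np.sqrt(2.0 * n * snr)

# Gradient flow on L(W) = ||Y - W W^T||_F^2 / 4, discretized with small Euler steps,
# starting from a small random initialization
W = 0.1 * rng.normal(size=(n, r)) / np.sqrt(n)
dt, steps = 0.05, 4000
for t in range(steps):
    grad = (W @ W.T - Y) @ W                   # dL/dW for symmetric Y
    W -= dt * grad
    if t % 500 == 0:
        mse = np.sum((W @ W.T - S) ** 2) / n   # matrix mean-square error
        print(f"time {t * dt:6.1f}   matrix-MSE = {mse:.4f}")
```
In the paper, the analogous matrix-MSE curve is obtained analytically from the fixed point equations rather than by simulation of this kind.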
Related papers
- Quantum Algorithms for Nonlinear Dynamics: Revisiting Carleman Linearization with No Dissipative Conditions [0.7373617024876725]
We explore the embedding of nonlinear dynamical systems into linear ordinary differential equations (ODEs) via the Carleman linearization method.
Our analysis extends earlier findings by exploring error bounds beyond the traditional dissipative condition.
We prove how a resonance condition leads to linear convergence with respect to the truncation level $N$ in Carleman linearization.
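For readers unfamiliar with the construction, here is a minimal sketch of Carleman linearization for a scalar ODE; the toy dynamics, coefficients and truncation level N are arbitrary illustrative choices, not the paper's setting.
```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Toy nonlinear ODE: dx/dt = a*x + b*x^2  (illustrative coefficients)
a, b, x0, T, N = -1.0, 0.4, 0.8, 2.0, 6

# Carleman embedding: with y_k = x^k, dy_k/dt = k*a*y_k + k*b*y_{k+1};
# truncating at level N (setting y_{N+1} ~ 0) gives a linear system dy/dt = C y.
C = np.zeros((N, N))
for k in range(1, N + 1):
    C[k - 1, k - 1] = k * a
    if k < N:
        C[k - 1, k] = k * b

y0 = np.array([x0 ** k for k in range(1, N + 1)])
x_carleman = (expm(C * T) @ y0)[0]          # first component approximates x(T)

# Reference: integrate the nonlinear ODE directly
x_true = solve_ivp(lambda t, x: a * x + b * x ** 2, (0, T), [x0],
                   rtol=1e-10, atol=1e-12).y[0, -1]
print(f"truncation N={N}:  Carleman {x_carleman:.6f}  vs  true {x_true:.6f}")
```
Increasing N tightens the approximation; the paper's contribution concerns how fast this happens outside the dissipative regime.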
arXiv Detail & Related papers (2024-05-21T12:09:34Z) - Solving quantum impurity problems on the L-shaped Kadanoff-Baym contour [0.0]
We extend the recently developed Grassmann time-evolving matrix product operator (GTEMPO) method to solve quantum impurity problems directly on the Kadanoff-Baym contour.
The accuracy of this method is numerically demonstrated against exact solutions in the noninteracting case, and against existing calculations on the real- and imaginary-time axes.
arXiv Detail & Related papers (2024-04-08T11:21:06Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
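The paper's stochastic dual descent algorithm involves specific design choices beyond what the summary conveys; the sketch below only shows the underlying idea of descending the dual (kernel ridge) objective, whose minimizer gives the GP posterior-mean coefficients. The data, kernel hyperparameters, batch size and step size are assumptions for illustration, not the paper's recommendations.
```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D GP regression data
n = 200
X = np.sort(rng.uniform(0.0, 3.0, size=n))
y = np.sin(2.0 * X) + 0.1 * rng.normal(size=n)
noise_var, lengthscale = 0.5, 0.3               # assumed hyperparameters

K = np.exp(-0.5 * (X[:, None] - X[None, :]) ** 2 / lengthscale ** 2)
A = K + noise_var * np.eye(n)                   # (K + sigma^2 I)

# Dual objective: L(alpha) = 0.5 * alpha^T A alpha - alpha^T y,
# minimized at alpha* = A^{-1} y (the GP posterior-mean coefficients).
alpha_exact = np.linalg.solve(A, y)

# Stochastic (block-coordinate) descent on the dual: update a random
# subset of coordinates using the corresponding rows of the gradient.
alpha = np.zeros(n)
batch, iters = 20, 4000
lr = 1.0 / (batch + noise_var)                  # conservative step (K has unit diagonal)
for it in range(iters):
    B = rng.choice(n, size=batch, replace=False)
    grad_B = A[B] @ alpha - y[B]                # rows B of the dual gradient A alpha - y
    alpha[B] -= lr * grad_B
    if (it + 1) % 1000 == 0:
        err = np.linalg.norm(alpha - alpha_exact) / np.linalg.norm(alpha_exact)
        print(f"iter {it + 1:5d}   relative error vs exact solve: {err:.3e}")
```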
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - The Inductive Bias of Flatness Regularization for Deep Matrix
Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of the minimum trace of the Hessian solutions in deep linear networks.
We show that for all depths greater than one, under the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix.
arXiv Detail & Related papers (2023-06-22T23:14:57Z) - Implicit Balancing and Regularization: Generalization and Convergence
Guarantees for Overparameterized Asymmetric Matrix Sensing [28.77440901439686]
A series of recent papers has begun to generalize this role for non-random Positive Semi-Definite (PSD) matrix sensing problems.
In this paper, we show that the trajectory of gradient descent from a small random initialization moves towards solutions that both converge globally and generalize well.
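A minimal sketch of that setup (not the paper's analysis or its exact assumptions): gradient descent from a small random initialization on an overparameterized asymmetric matrix sensing loss. Dimensions, measurement count, step size and initialization scale are illustrative choices.
```python
import numpy as np

rng = np.random.default_rng(2)

# Ground-truth low-rank matrix and random Gaussian linear measurements
n1, n2, r_true, k, m = 20, 20, 2, 5, 800        # k > r_true: overparameterized factors
Us, _ = np.linalg.qr(rng.normal(size=(n1, r_true)))
Vs, _ = np.linalg.qr(rng.normal(size=(n2, r_true)))
M_star = Us @ np.diag([1.5, 1.0]) @ Vs.T
A = rng.normal(size=(m, n1, n2)) / np.sqrt(m)
y = np.einsum('mij,ij->m', A, M_star)

# Gradient descent on f(U, V) = 0.5 * sum_i (<A_i, U V^T> - y_i)^2
# starting from a *small* random initialization.
scale, lr, T = 1e-3, 0.1, 800
U = scale * rng.normal(size=(n1, k))
V = scale * rng.normal(size=(n2, k))
for t in range(T):
    R = np.einsum('mij,ij->m', A, U @ V.T) - y  # residuals
    G = np.einsum('m,mij->ij', R, A)            # sum_i R_i * A_i
    U, V = U - lr * G @ V, V - lr * G.T @ U     # simultaneous factor updates
    if (t + 1) % 100 == 0:
        rel = np.linalg.norm(U @ V.T - M_star) / np.linalg.norm(M_star)
        print(f"step {t + 1:4d}   relative recovery error: {rel:.3e}")
```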
arXiv Detail & Related papers (2023-03-24T19:05:52Z) - Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Understanding Incremental Learning of Gradient Descent: A Fine-grained
Analysis of Matrix Sensing [74.2952487120137]
It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in machine learning models.
This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem.
arXiv Detail & Related papers (2023-01-27T02:30:51Z) - The Neural Network shifted-Proper Orthogonal Decomposition: a Machine
Learning Approach for Non-linear Reduction of Hyperbolic Equations [0.0]
In this work we approach the problem of automatically detecting the correct pre-processing transformation in a statistical learning framework.
The purely data-driven method allowed us to generalise the existing approaches of linear subspace manipulation to non-linear hyperbolic problems with unknown advection fields.
The proposed algorithm has been validated against simple test cases to benchmark its performance, and later successfully applied to a multiphase simulation.
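To see why detecting the right pre-processing transformation matters, the sketch below builds snapshots of a traveling pulse and compares the singular-value decay of the raw snapshot matrix with that of snapshots recentred by the advection, which is assumed known here; in the paper this shift is learned from data rather than prescribed. The pulse shape, speed and grid are illustrative assumptions.
```python
import numpy as np

# Snapshots of a traveling Gaussian pulse u(x, t) = exp(-(x - c t)^2 / w^2)
nx, nt, c, w = 256, 60, 1.0, 0.05
x = np.linspace(0.0, 2.0, nx)
t = np.linspace(0.0, 1.0, nt)
snapshots = np.exp(-((x[None, :] - c * t[:, None]) ** 2) / w ** 2)   # shape (nt, nx)

# Shifted snapshots: recentre each one by the (known) advection c*t,
# i.e. the kind of transformation the shifted-POD approach seeks to detect.
shift_idx = np.round(c * t / (x[1] - x[0])).astype(int)
shifted = np.stack([np.roll(s, -k) for s, k in zip(snapshots, shift_idx)])

# POD = SVD of the snapshot matrix; compare how many modes carry the energy
for name, S in [("raw", snapshots), ("shifted", shifted)]:
    sv = np.linalg.svd(S, compute_uv=False)
    energy = np.cumsum(sv ** 2) / np.sum(sv ** 2)
    n_modes = int(np.searchsorted(energy, 0.99) + 1)
    print(f"{name:7s}: modes needed for 99% energy = {n_modes}")
```
The shifted family typically needs far fewer modes, which is what makes the learned transformation worthwhile for hyperbolic problems.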
arXiv Detail & Related papers (2021-08-14T15:13:35Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
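A minimal sketch of that comparison: tail-averaged constant-stepsize SGD versus the minimum-norm ordinary least squares solution on synthetic overparameterized data. Here SGD makes several passes over a fixed sample, whereas the paper analyzes a streaming setting, and the dimensions, noise level and step size are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(3)

# Overparameterized linear regression: dimension exceeds sample size
d, n, n_test, noise = 400, 200, 2000, 0.1
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d)) / np.sqrt(d)            # E||x||^2 = 1
y = X @ w_star + noise * rng.normal(size=n)
X_test = rng.normal(size=(n_test, d)) / np.sqrt(d)
y_test = X_test @ w_star + noise * rng.normal(size=n_test)

# Ordinary least squares (minimum-norm interpolant, since d > n)
w_ols = np.linalg.pinv(X) @ y

# Constant-stepsize SGD, one sample per step, with tail averaging
gamma, passes = 0.5, 20
w = np.zeros(d)
w_bar = np.zeros(d)
count = 0
total = passes * n
for step in range(total):
    i = rng.integers(n)
    w -= gamma * (X[i] @ w - y[i]) * X[i]           # single-sample gradient step
    if step >= total // 2:                          # average the tail iterates
        w_bar += w
        count += 1
w_sgd = w_bar / count

for name, w_hat in [("OLS (min-norm)", w_ols), ("SGD (tail-averaged)", w_sgd)]:
    test_mse = np.mean((X_test @ w_hat - y_test) ** 2)
    print(f"{name:20s} test MSE = {test_mse:.4f}")
```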
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z) - A spectral algorithm for robust regression with subgaussian rates [0.0]
We study a new linear up to quadratic time algorithm for linear regression in the absence of strong assumptions on the underlying distributions of samples.
The goal is to design a procedure which attains the optimal sub-gaussian error bound even though the data have only finite moments.
arXiv Detail & Related papers (2020-07-12T19:33:50Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
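The sketch below shows only the classic deterministic conditional gradient (Frank-Wolfe) step on a simple feasible set, the building block that the paper's stochastic-constraint variants extend with variance reduction and smoothing; the quadratic objective and the probability-simplex constraint are illustrative choices, not the paper's setting.
```python
import numpy as np

rng = np.random.default_rng(4)

# Minimize f(x) = 0.5 * ||Ax - b||^2 over the probability simplex
# with the classic conditional gradient / Frank-Wolfe iteration.
m, n = 30, 50
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

x = np.ones(n) / n                          # feasible starting point
for k in range(200):
    grad = A.T @ (A @ x - b)
    # Linear minimization oracle over the simplex: a single vertex e_i
    s = np.zeros(n)
    s[np.argmin(grad)] = 1.0
    fw_gap = grad @ (x - s)                 # Frank-Wolfe duality gap at x (>= 0)
    if k % 50 == 0:
        obj = 0.5 * np.sum((A @ x - b) ** 2)
        print(f"iter {k:3d}   f(x) = {obj:.4f}   FW gap = {fw_gap:.4f}")
    gamma = 2.0 / (k + 2)                   # standard step-size schedule
    x = (1 - gamma) * x + gamma * s
```
Only a linear minimization oracle is needed per step, which is what makes processing a subset of constraints per iteration attractive in the stochastically constrained setting.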
arXiv Detail & Related papers (2020-07-07T21:26:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.