Simplifying Momentum-based Positive-definite Submanifold Optimization
with Applications to Deep Learning
- URL: http://arxiv.org/abs/2302.09738v9
- Date: Sat, 16 Dec 2023 00:32:20 GMT
- Title: Simplifying Momentum-based Positive-definite Submanifold Optimization
with Applications to Deep Learning
- Authors: Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad
Emtiyaz Khan, Mark Schmidt
- Abstract summary: Riemannian submanifold optimization with momentum requires solving difficult differential equations to keep iterates on the submanifold.
We do so by proposing a generalized version of the Riemannian normal coordinates.
We use our approach to simplify existing approaches for structured covariances and develop matrix-inverse-free $2^\text{nd}$-order optimizers for deep learning with low precision by using only matrix multiplications.
- Score: 24.97120654216651
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Riemannian submanifold optimization with momentum is computationally
challenging because, to ensure that the iterates remain on the submanifold, we
often need to solve difficult differential equations. Here, we simplify such
difficulties for a class of sparse or structured symmetric positive-definite
matrices with the affine-invariant metric. We do so by proposing a generalized
version of the Riemannian normal coordinates that dynamically orthonormalizes
the metric and locally converts the problem into an unconstrained problem in
the Euclidean space. We use our approach to simplify existing approaches for
structured covariances and develop matrix-inverse-free $2^\text{nd}$-order
optimizers for deep learning with low precision by using only matrix
multiplications. Code: https://github.com/yorkerlin/StructuredNGD-DL
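As a loose illustration of the kind of update the abstract describes (a NumPy sketch under stated assumptions, not the authors' implementation; see the linked repository for that), an SPD iterate S = B B^T can be moved by updating its factor with a truncated matrix exponential, so the step uses only matrix multiplications:

    import numpy as np

    def inverse_free_spd_step(B, M):
        # Truncated Taylor series of expm(M / 2); keeping only matmuls makes
        # the step inversion-free and friendly to low-precision arithmetic.
        E = np.eye(B.shape[0]) + 0.5 * M + 0.125 * (M @ M)
        return B @ E  # new iterate (B E)(B E)^T is SPD while E is invertible

    B = np.linalg.cholesky(np.array([[2.0, 0.3], [0.3, 1.0]]))  # factor of an SPD matrix
    M = np.array([[0.0, 0.1], [0.1, -0.2]])                     # small symmetric step direction
    B_new = inverse_free_spd_step(B, M)
    print(np.linalg.eigvalsh(B_new @ B_new.T))                  # eigenvalues remain positive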
Related papers
- Structured Regularization for Constrained Optimization on the SPD Manifold [1.1126342180866644]
We introduce a class of structured regularizers, based on symmetric gauge functions, which allow for solving constrained optimization on the SPD manifold with faster unconstrained methods.
We show that our structured regularizers can be chosen to preserve or induce desirable structure, in particular convexity and "difference of convex" structure.
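As a toy illustration only (the helper name and the choice of gauge function are assumptions, not the paper's construction), a symmetric gauge function such as the l1 norm can be applied to the log-eigenvalues of an SPD iterate to penalize ill-conditioning:

    import numpy as np

    def gauge_regularizer(X, phi=lambda s: np.sum(np.abs(s))):
        # phi is a symmetric gauge function (permutation- and sign-invariant
        # norm); applied to log-eigenvalues it vanishes only at the identity.
        return phi(np.log(np.linalg.eigvalsh(X)))

    print(gauge_regularizer(np.eye(3)))                  # 0.0: perfectly conditioned
    print(gauge_regularizer(np.diag([1e-3, 1.0, 1e3])))  # large: ill-conditioned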
arXiv Detail & Related papers (2024-10-12T22:11:22Z)
- Low-complexity subspace-descent over symmetric positive definite manifold [9.346050098365648]
We develop low-complexity algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold.
The proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix.
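A minimal sketch of that update (a single-entry sparse factor chosen for illustration, not the paper's subspace construction):

    import numpy as np

    def sparse_factor_update(L, i, j, t):
        # Right-multiply the Cholesky factor L by the sparse matrix
        # S = I + t * e_i e_j^T (i != j): only column j of L changes, an O(n)
        # update, and the new iterate (L S)(L S)^T stays SPD since det(S) = 1.
        L = L.copy()
        L[:, j] += t * L[:, i]
        return L

    L = np.linalg.cholesky(np.array([[4.0, 1.0], [1.0, 3.0]]))
    L_new = sparse_factor_update(L, 0, 1, 0.1)
    print(np.linalg.eigvalsh(L_new @ L_new.T))  # still positive definite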
arXiv Detail & Related papers (2023-05-03T11:11:46Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth parts.
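A classical instance of linearizing only the smooth part is the conditional-gradient (Frank-Wolfe) step; the sketch below is a generic textbook version over the probability simplex, not the paper's algorithm:

    import numpy as np

    def frank_wolfe_step(x, grad_f, k):
        # Minimize the linearization <grad_f(x), s> over the simplex: the
        # solution is a vertex, so the subproblem is trivial despite the
        # constraint set.
        s = np.zeros_like(x)
        s[np.argmin(grad_f(x))] = 1.0
        gamma = 2.0 / (k + 2.0)  # standard step-size schedule
        return (1.0 - gamma) * x + gamma * s

    c = np.array([0.1, 0.7, 0.2])  # minimize f(x) = ||x - c||^2 over the simplex
    x = np.ones(3) / 3.0
    for k in range(200):
        x = frank_wolfe_step(x, lambda y: 2.0 * (y - c), k)
    print(x)                       # approaches c, which lies in the simplex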
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Splitting numerical integration for matrix completion [0.0]
We propose a new algorithm for low rank matrix approximation.
The algorithm is an adaptation of classical gradient descent within a splitting numerical-integration framework.
Experimental results show that our approach has good scalability for large-scale problems.
arXiv Detail & Related papers (2022-02-14T04:45:20Z)
- Learning a Compressive Sensing Matrix with Structural Constraints via Maximum Mean Discrepancy Optimization [17.104994036477308]
We introduce a learning-based algorithm to obtain a measurement matrix for compressive-sensing-related recovery problems.
The recent success of such metrics (here, the maximum mean discrepancy) in neural-network-related topics motivates a machine-learning-based solution to the problem.
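For concreteness, a hedged sketch of the metric in question, the (biased) squared maximum mean discrepancy with a Gaussian kernel; names and parameters are illustrative, and the learned measurement matrix itself is not shown:

    import numpy as np

    def mmd2(X, Y, sigma=1.0):
        # Biased estimate of MMD^2 between sample sets X (n, d) and Y (m, d);
        # differentiable, so it can serve as a training loss.
        def gram(A, B):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
            return np.exp(-d2 / (2.0 * sigma ** 2))
        return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

    rng = np.random.default_rng(0)
    samples_a = rng.normal(size=(100, 5))
    samples_b = rng.normal(size=(100, 5))
    print(mmd2(samples_a, samples_b))  # near zero for matching distributions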
arXiv Detail & Related papers (2021-10-14T08:35:54Z)
- Nonlinear matrix recovery using optimization on the Grassmann manifold [18.655422834567577]
We investigate the problem of recovering a partially observed high-rank clustering matrix whose columns obey a nonlinear structure such as a union of subspaces.
We show that the alternating minimization iterates converge to a unique limit point using the Kurdyka-Łojasiewicz property.
arXiv Detail & Related papers (2021-09-13T16:13:13Z)
- Optimization on manifolds: A symplectic approach [127.54402681305629]
We propose a dissipative extension of Dirac's theory of constrained Hamiltonian systems as a general framework for solving optimization problems.
Our class of (accelerated) algorithms is not only simple and efficient but also applicable to a broad range of contexts.
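A loose Euclidean illustration of the underlying picture (a generic discretization of dissipative Hamiltonian dynamics q' = p, p' = -grad f(q) - gamma p, not the paper's constrained scheme):

    import numpy as np

    def dissipative_step(q, p, grad_f, h=0.1, gamma=1.0):
        # Semi-implicit Euler with friction: the momentum update uses the
        # current position, and the position update uses the fresh momentum.
        p = p - h * (grad_f(q) + gamma * p)
        q = q + h * p
        return q, p

    q, p = np.array([2.0, -1.5]), np.zeros(2)
    for _ in range(200):
        q, p = dissipative_step(q, p, lambda x: 2.0 * x)  # f(x) = ||x||^2
    print(q)                                              # decays toward the minimizer 0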
arXiv Detail & Related papers (2021-07-23T13:43:34Z)
- Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
- Automatic differentiation for Riemannian optimization on low-rank matrix and tensor-train manifolds [71.94111815357064]
In scientific computing and machine learning applications, matrices and more general multidimensional arrays (tensors) can often be approximated with the help of low-rank decompositions.
One of the popular tools for finding low-rank approximations is Riemannian optimization.
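A minimal stand-in for that workflow (a projected-gradient toy with a truncated-SVD retraction onto the rank-r matrices, not the paper's automatic-differentiation machinery):

    import numpy as np

    def low_rank_step(X, grad, r, lr=0.1):
        # Euclidean gradient step followed by a retraction to the rank-r
        # manifold, implemented as the best rank-r approximation via SVD.
        U, s, Vt = np.linalg.svd(X - lr * grad, full_matrices=False)
        return (U[:, :r] * s[:r]) @ Vt[:r, :]

    rng = np.random.default_rng(0)
    target = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 20))  # a rank-2 matrix
    X = np.zeros((20, 20))
    for _ in range(100):
        X = low_rank_step(X, X - target, r=2)  # gradient of 0.5 * ||X - target||_F^2
    print(np.linalg.norm(X - target))          # residual shrinks toward zero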
arXiv Detail & Related papers (2021-03-27T19:56:00Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
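A hedged sketch-and-solve baseline for this setting with a Gaussian embedding (dimensions and names are illustrative; the paper's adaptive, effective-dimension-based choice of sketch size is not shown):

    import numpy as np

    def sketched_ridge(A, b, lam, m, rng):
        # Compress the rows of A with a Gaussian sketch S, then solve the
        # small L2-regularized normal equations of the sketched problem.
        S = rng.normal(size=(m, A.shape[0])) / np.sqrt(m)
        SA, Sb = S @ A, S @ b
        return np.linalg.solve(SA.T @ SA + lam * np.eye(A.shape[1]), SA.T @ Sb)

    rng = np.random.default_rng(0)
    A = rng.normal(size=(5000, 50))
    x_true = rng.normal(size=50)
    b = A @ x_true + 0.01 * rng.normal(size=5000)
    x_hat = sketched_ridge(A, b, lam=1e-2, m=400, rng=rng)
    print(np.linalg.norm(x_hat - x_true))  # small error from a much smaller solve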
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.