Improved Convergence Speed of Fully Symmetric Learning Rules for
Principal Component Analysis
- URL: http://arxiv.org/abs/2007.09426v1
- Date: Sat, 18 Jul 2020 13:41:35 GMT
- Title: Improved Convergence Speed of Fully Symmetric Learning Rules for
Principal Component Analysis
- Authors: Ralf Möller
- Abstract summary: We describe a modified objective function with an additional term which mitigates this convergence problem.
We show that the learning rule derived from the modified objective function inherits all fixed points from the original learning rule.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully symmetric learning rules for principal component analysis can be
derived from a novel objective function suggested in our previous work. We
observed that these learning rules suffer from slow convergence for covariance
matrices where some principal eigenvalues are close to each other. Here we
describe a modified objective function with an additional term which mitigates
this convergence problem. We show that the learning rule derived from the
modified objective function inherits all fixed points from the original
learning rule (but may introduce additional ones). Also the stability of the
inherited fixed points remains unchanged. Only the steepness of the objective
function is increased in some directions. Simulations confirm that the
convergence speed can be noticeably improved, depending on the weight factor of
the additional term.
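The abstract does not spell out the objective function or the form of the additional term, so the following NumPy sketch only illustrates the general mechanism under stated assumptions: a base symmetric subspace rule (Oja's subspace rule, used here as a stand-in for the paper's fully symmetric rules) plus a hypothetical extra term, weighted by a factor `alpha`, whose gradient vanishes at the base rule's fixed points, so those fixed points are inherited while the landscape becomes steeper away from them.

```python
import numpy as np

def symmetric_pca_step(W, C, eta=0.01, alpha=1.0):
    """One gradient step of an illustrative symmetric subspace rule.

    W     : (n, m) weight matrix (m extracted directions)
    C     : (n, n) data covariance matrix
    eta   : learning rate
    alpha : weight factor of the hypothetical additional term

    The base update is Oja's symmetric subspace rule. The extra term
    penalizes deviations from W^T W = I; its gradient vanishes at the
    base rule's fixed points, so those fixed points are inherited while
    the objective becomes steeper away from them. It is a placeholder
    for the paper's (unspecified) additional term, not the actual one.
    """
    base = C @ W - W @ (W.T @ C @ W)                      # subspace gradient
    extra = -alpha * W @ (W.T @ W - np.eye(W.shape[1]))   # hypothetical term
    return W + eta * (base + extra)

# Toy usage: covariance with two close principal eigenvalues.
rng = np.random.default_rng(0)
C = np.diag([3.0, 2.95, 1.0, 0.5])
W = 0.1 * rng.standard_normal((4, 2))
for _ in range(5000):
    W = symmetric_pca_step(W, C)
print(np.round(W.T @ W, 3))   # ~ identity: columns are orthonormal and span the top-2 subspace
```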
Related papers
- Is All Learning (Natural) Gradient Descent? [1.3654846342364308]
We show that a class of effective learning rules can be expressed as natural gradient descent with respect to a suitably defined loss function and metric.
We also demonstrate that these metrics have a canonical form and identify several optimal ones, including the metric that achieves the minimum possible condition number.
arXiv Detail & Related papers (2024-09-24T19:41:08Z) - Good regularity creates large learning rate implicit biases: edge of
stability, balancing, and catapult [49.8719617899285]
Large learning rates, when applied to gradient descent for nonconvex optimization, yield various implicit biases, including the edge of stability, balancing, and catapult.
This paper provides an initial step toward a unified analysis and shows that these implicit biases are in fact various tips of the same iceberg.
arXiv Detail & Related papers (2023-10-26T01:11:17Z) - Over-Parameterization Exponentially Slows Down Gradient Descent for
Learning a Single Neuron [49.45105570960104]
We prove the global convergence of randomly initialized gradient descent with a $O\left(T^{-3}\right)$ rate.
These two bounds jointly give an exact characterization of the convergence rate.
We show this potential function converges slowly, which implies the slow convergence rate of the loss function.
arXiv Detail & Related papers (2023-02-20T15:33:26Z) - The Geometry of Neural Nets' Parameter Spaces Under Reparametrization [35.5848464226014]
We study the invariance of neural nets under reparametrization from the perspective of Riemannian geometry.
We discuss implications for measuring the flatness of minima, optimization, and probability-density maximization.
arXiv Detail & Related papers (2023-02-14T22:48:24Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Matrix Completion via Non-Convex Relaxation and Adaptive Correlation
Learning [90.8576971748142]
We develop a novel surrogate that can be optimized by closed-form solutions.
We exploit correlation structure for completion, and thus propose an adaptive correlation learning model.
arXiv Detail & Related papers (2022-03-04T08:50:50Z) - Proximal Gradient Descent-Ascent: Variable Convergence under K{\L}
Geometry [49.65455534654459]
Gradient descent-ascent (GDA) has been widely applied to solve minimax optimization problems.
This paper fills such a gap by studying the convergence of proximal GDA under the Kurdyka-Łojasiewicz (KŁ) geometry.
arXiv Detail & Related papers (2021-02-09T05:35:53Z) - A Study of Condition Numbers for First-Order Optimization [12.072067586666382]
We introduce a class of perturbations quantified via a new norm, called *-norm.
We show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations.
We propose a notion of continuity of the metrics, which is essential for a robust tuning strategy.
arXiv Detail & Related papers (2020-12-10T16:17:48Z) - Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
arXiv Detail & Related papers (2020-06-16T20:53:04Z) - Derivation of Symmetric PCA Learning Rules from a Novel Objective
Function [0.0]
Neural learning rules for principal component / subspace analysis can be derived by maximizing an objective function.
For a subspace with a single axis, the optimization produces the principal eigenvector of the data covariance matrix.
For a subspace with multiple axes, the optimization leads to PSA learning rules which only converge to axes spanning the principal subspace but not to the principal eigenvectors.
arXiv Detail & Related papers (2020-05-24T08:57:54Z)