Practical and Fast Momentum-Based Power Methods
- URL: http://arxiv.org/abs/2108.09264v1
- Date: Fri, 20 Aug 2021 16:54:12 GMT
- Title: Practical and Fast Momentum-Based Power Methods
- Authors: Tahseen Rabbani, Apollo Jain, Arjun Rajkumar, Furong Huang
- Abstract summary: The power method is a classical algorithm with broad applications in machine learning tasks, including streaming PCA, spectral clustering, and low-rank matrix approximation.
A momentum-based scheme can be used to accelerate the power method, but achieving an optimal convergence rate with existing algorithms critically relies on additional spectral information that is unavailable at run-time.
We provide a pair of novel momentum-based power methods, which we call the delayed momentum power method (DMPower) and a streaming variant, the delayed momentum streaming method (DMStream).
Our methods leverage inexact deflation and are capable of achieving near-optimal convergence with far less restrictive hyperparameter requirements.
- Score: 7.223413715110717
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The power method is a classical algorithm with broad applications in machine
learning tasks, including streaming PCA, spectral clustering, and low-rank
matrix approximation. The distilled purpose of the vanilla power method is to
determine the largest eigenvalue (in absolute value) and its eigenvector of a
matrix. A momentum-based scheme can be used to accelerate the power method, but
achieving an optimal convergence rate with existing algorithms critically
relies on additional spectral information that is unavailable at run-time, and
sub-optimal initializations can result in divergence. In this paper, we provide
a pair of novel momentum-based power methods, which we call the delayed
momentum power method (DMPower) and a streaming variant, the delayed momentum
streaming method (DMStream). Our methods leverage inexact deflation and are
capable of achieving near-optimal convergence with far less restrictive
hyperparameter requirements. We provide convergence analyses for both
algorithms through the lens of perturbation theory. Further, we experimentally
demonstrate that DMPower routinely outperforms the vanilla power method and
that both algorithms match the convergence speed of an oracle running existing
accelerated methods with perfect spectral knowledge.
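To make the schemes the abstract contrasts concrete, the sketch below illustrates vanilla power iteration, the existing momentum (heavy-ball) acceleration x_{t+1} = A x_t - beta * x_{t-1} with its optimal but normally unknowable setting beta = lambda_2^2 / 4, and a crude deflation-based estimate of lambda_2. This is a minimal illustration assuming a symmetric input matrix, not the paper's DMPower or DMStream; all function names are ours.

```python
import numpy as np

def power_method(A, num_iters=100, tol=1e-10):
    """Vanilla power iteration: estimate the dominant eigenpair of A."""
    x = np.random.default_rng(0).standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iters):
        x_new = A @ x
        x_new /= np.linalg.norm(x_new)
        # sign-insensitive convergence check
        if min(np.linalg.norm(x_new - x), np.linalg.norm(x_new + x)) < tol:
            x = x_new
            break
        x = x_new
    return x @ A @ x, x  # Rayleigh quotient and unit eigenvector estimate

def momentum_power_method(A, beta, num_iters=100):
    """Heavy-ball power iteration x_{t+1} = A x_t - beta * x_{t-1}.
    The optimal beta = lambda_2^2 / 4 accelerates convergence markedly,
    but lambda_2 is normally unknown at run-time -- the gap DMPower
    closes by estimating it on the fly via inexact deflation."""
    rng = np.random.default_rng(0)
    x_prev = rng.standard_normal(A.shape[0])
    x_prev /= np.linalg.norm(x_prev)
    x = A @ x_prev
    for _ in range(num_iters):
        x_next = A @ x - beta * x_prev
        scale = np.linalg.norm(x_next)
        # rescale both iterates by the same factor so the three-term
        # recurrence stays consistent
        x_prev, x = x / scale, x_next / scale
    return x @ A @ x, x

def estimate_lambda2(A, lam1, v1, num_iters=50):
    """Hotelling deflation: power iteration on A - lam1 * v1 v1^T targets
    the second eigenpair. A rough stand-in for the pre-momentum
    estimation phase described in the abstract."""
    lam2, _ = power_method(A - lam1 * np.outer(v1, v1), num_iters)
    return lam2

# Toy usage on a symmetric matrix with a small spectral gap.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
A = Q @ np.diag(np.linspace(0.5, 1.0, 200)) @ Q.T   # lambda_1 = 1.0

lam1, v1 = power_method(A, num_iters=20)            # coarse estimates
lam2_hat = estimate_lambda2(A, lam1, v1)
lam1_fast, _ = momentum_power_method(A, beta=lam2_hat**2 / 4)
print(lam1_fast)  # approx 1.0, reached faster than the vanilla method
```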
Related papers
- Robust Quantum Control for Bragg Pulse Design in Atom Interferometry [0.0]
We formulate a robust optimal control algorithm to synthesize minimum energy pulses that can transfer a cold atom system into various momentum states.
The algorithm is applied to optimize the atomic beam operation in ultra-cold atom interferometry.
arXiv Detail & Related papers (2025-02-07T02:29:27Z) - A Bias-Correction Decentralized Stochastic Gradient Algorithm with Momentum Acceleration [19.83835152405735]
We propose a momentum-accelerated decentralized stochastic gradient algorithm, termed Exact-Diffusion with Momentum (EDM).
EDM mitigates the bias from data heterogeneity and incorporates momentum techniques commonly used in deep learning.
Our theoretical analysis demonstrates that the EDM algorithm converges sublinearly to a neighborhood of the optimal solution.
arXiv Detail & Related papers (2025-01-31T12:15:58Z) - Difference vs. Quotient: A Novel Algorithm for Dominant Eigenvalue Problem [6.938695246282628]
This paper introduces a novel perspective by reformulating the eigenvalue problem using an unconstrained Difference formulation.
Within this family, we propose the Split-Merge algorithm, which achieves maximal acceleration without spectral prior knowledge.
arXiv Detail & Related papers (2025-01-25T08:37:17Z) - Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment [81.84950252537618]
This paper reveals a unified game-theoretic connection between iterative BOND and self-play alignment.
We establish a novel framework, WIN rate Dominance (WIND), with a series of efficient algorithms for regularized win rate dominance optimization.
arXiv Detail & Related papers (2024-10-28T04:47:39Z) - The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only achieves faster progress per iteration but also exhibits enhanced convergence compared to conventional SFO techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
arXiv Detail & Related papers (2024-07-30T17:03:19Z) - Tensor networks based quantum optimization algorithm [0.0]
In optimization, one of the well-known classical algorithms is power iteration.
We propose a quantum realization to circumvent this pitfall.
Our methodology becomes instance agnostic and thus allows one to address black-box optimization within the framework of quantum computing.
arXiv Detail & Related papers (2024-04-23T13:49:11Z) - Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED).
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs).
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z) - An Adaptive Gradient Method with Energy and Momentum [0.0]
We introduce a novel algorithm for gradient-based optimization of objective functions.
The method is simple to implement, computationally efficient, and well suited for large-scale machine learning problems.
arXiv Detail & Related papers (2022-03-23T04:48:38Z) - Adiabatic Spectroscopy and a Variational Quantum Adiabatic Algorithm [0.7734726150561088]
We propose a method to obtain information about the spectral profile of the adiabatic evolution.
We present the concept of a variational quantum adiabatic algorithm (VQAA) for optimized adiabatic paths.
arXiv Detail & Related papers (2021-03-01T19:00:00Z) - SMG: A Shuffling Gradient-Based Method with Momentum [25.389545522794172]
We combine two advanced ideas widely used in optimization for machine learning.
We develop a novel shuffling-based momentum technique.
Our tests have shown encouraging performance of the new algorithms.
arXiv Detail & Related papers (2020-11-24T04:12:35Z) - Adaptive pruning-based optimization of parameterized quantum circuits [62.997667081978825]
Variational hybrid quantum-classical algorithms are powerful tools to maximize the use of Noisy Intermediate-Scale Quantum devices.
We propose a strategy for such ansatze used in variational quantum algorithms, which we call "Parameter-Efficient Circuit Training" (PECT).
Instead of optimizing all of the ansatz parameters at once, PECT launches a sequence of variational algorithms.
arXiv Detail & Related papers (2020-10-01T18:14:11Z) - Learning Centric Power Allocation for Edge Intelligence [84.16832516799289]
Edge intelligence, which collects distributed data and performs machine learning at the edge, has been proposed.
This paper proposes a learning centric power allocation (LCPA) method, which allocates radio resources based on an empirical classification error model.
Experimental results show that the proposed LCPA algorithm significantly outperforms other power allocation algorithms.
arXiv Detail & Related papers (2020-07-21T07:02:07Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, collectively termed Sparse Principal Component Analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem (a minimal illustrative sketch appears after this list).
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z)
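As referenced in the Sparse PCA entry above, a minimal sketch of the thresholding idea: compute the dense leading eigenvector, keep only the k largest-magnitude loadings, and renormalize. This is an illustrative reading of the approach with a hypothetical function name, not that paper's exact algorithm or its accuracy guarantees.

```python
import numpy as np

def thresholded_spca(A, k):
    """Illustrative SPCA thresholding (hypothetical helper, not the cited
    paper's exact algorithm): keep the k largest-magnitude loadings of the
    dense leading eigenvector of the symmetric matrix A, then renormalize."""
    _, vecs = np.linalg.eigh(A)            # eigenvalues in ascending order
    v = vecs[:, -1]                        # dense leading eigenvector
    support = np.argsort(np.abs(v))[-k:]   # indices of the k largest loadings
    v_sparse = np.zeros_like(v)
    v_sparse[support] = v[support]
    return v_sparse / np.linalg.norm(v_sparse)
```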
This list is automatically generated from the titles and abstracts of the papers on this site.