Manifold Proximal Point Algorithms for Dual Principal Component Pursuit
and Orthogonal Dictionary Learning
- URL: http://arxiv.org/abs/2005.02356v2
- Date: Wed, 21 Jul 2021 13:40:06 GMT
- Title: Manifold Proximal Point Algorithms for Dual Principal Component Pursuit
and Orthogonal Dictionary Learning
- Authors: Shixiang Chen, Zengde Deng, Shiqian Ma, Anthony Man-Cho So
- Abstract summary: We consider the problem of maximizing a linear map over the sphere, which arises in various machine learning applications.
We propose a new approach to a problem in which the sphere is replaced by the Stiefel matrix.
- Score: 32.87704663543739
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of maximizing the $\ell_1$ norm of a linear map over
the sphere, which arises in various machine learning applications such as
orthogonal dictionary learning (ODL) and robust subspace recovery (RSR). The
problem is numerically challenging due to its nonsmooth objective and nonconvex
constraint, and its algorithmic aspects have not been well explored. In this
paper, we show how the manifold structure of the sphere can be exploited to
design fast algorithms for tackling this problem. Specifically, our
contribution is threefold. First, we present a manifold proximal point
algorithm (ManPPA) for the problem and show that it converges at a sublinear
rate. Furthermore, we show that ManPPA can achieve a quadratic convergence rate
when applied to the ODL and RSR problems. Second, we propose a stochastic
variant of ManPPA called StManPPA, which is well suited for large-scale
computation, and establish its sublinear convergence rate. Both ManPPA and
StManPPA have provably faster convergence rates than existing subgradient-type
methods. Third, using ManPPA as a building block, we propose a new approach to
solving a matrix analog of the problem, in which the sphere is replaced by the
Stiefel manifold. The results from our extensive numerical experiments on the
ODL and RSR problems demonstrate the efficiency and efficacy of our proposed
methods.
Related papers
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - A Deep Learning algorithm to accelerate Algebraic Multigrid methods in
Finite Element solvers of 3D elliptic PDEs [0.0]
We introduce a novel Deep Learning algorithm that minimizes the computational cost of the Algebraic multigrid method when used as a finite element solver.
We experimentally prove that the pooling successfully reduces the computational cost of processing a large sparse matrix and preserves the features needed for the regression task at hand.
arXiv Detail & Related papers (2023-04-21T09:18:56Z) - Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints [9.301728976515255]
This article provides new practical and theoretical developments for the landing algorithm.
First, the method is extended to the Stiefel manifold.
We also consider variance reduction algorithms when the cost function is an average of many functions.
arXiv Detail & Related papers (2023-03-29T07:36:54Z) - Decentralized Riemannian natural gradient methods with Kronecker-product
approximations [11.263837420265594]
We present an efficient decentralized natural gradient descent (DRNGD) method for solving decentralized manifold optimization problems.
By performing the communications over the Kronecker factors, a high-quality approximation of the RFIM can be obtained in a low cost.
arXiv Detail & Related papers (2023-03-16T19:36:31Z) - Two-Stage Stochastic Optimization via Primal-Dual Decomposition and Deep
Unrolling [86.85697555068168]
Two-stage algorithmic optimization plays a critical role in various engineering and scientific applications.
There still lack efficient algorithms, especially when the long-term and short-term variables are coupled in the constraints.
We show that PDD-SSCA can achieve superior performance over existing solutions.
arXiv Detail & Related papers (2021-05-05T03:36:00Z) - Converting ADMM to a Proximal Gradient for Convex Optimization Problems [4.56877715768796]
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
arXiv Detail & Related papers (2021-04-22T07:41:12Z) - Hybrid Trilinear and Bilinear Programming for Aligning Partially
Overlapping Point Sets [85.71360365315128]
In many applications, we need algorithms which can align partially overlapping point sets are invariant to the corresponding corresponding RPM algorithm.
We first show that the objective is a cubic bound function. We then utilize the convex envelopes of trilinear and bilinear monomial transformations to derive its lower bound.
We next develop a branch-and-bound (BnB) algorithm which only branches over the transformation variables and runs efficiently.
arXiv Detail & Related papers (2021-01-19T04:24:23Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - An Online Method for A Class of Distributionally Robust Optimization
with Non-Convex Objectives [54.29001037565384]
We propose a practical online method for solving a class of online distributionally robust optimization (DRO) problems.
Our studies demonstrate important applications in machine learning for improving the robustness of networks.
arXiv Detail & Related papers (2020-06-17T20:19:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.