Converting ADMM to a Proximal Gradient for Convex Optimization Problems
- URL: http://arxiv.org/abs/2104.10911v1
- Date: Thu, 22 Apr 2021 07:41:12 GMT
- Title: Converting ADMM to a Proximal Gradient for Convex Optimization Problems
- Authors: Ryosuke Shimmura and Joe Suzuki
- Abstract summary: In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
- Score: 4.56877715768796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In machine learning and data science, we often consider efficiency for
solving problems. In sparse estimation, such as fused lasso and convex
clustering, we apply either the proximal gradient method or the alternating
direction method of multipliers (ADMM) to solve the problem. Matrix division
(inversion) is time-consuming in the former case, while an efficient method such as
FISTA (the fast iterative shrinkage-thresholding algorithm) has been developed for
the latter. This paper proposes a general method for converting the ADMM
solution to the proximal gradient method, assuming that the constraints and
objectives are strongly convex. Then, we apply it to sparse estimation
problems, such as sparse convex clustering and trend filtering, and we show by
numerical experiments that we can obtain a significant improvement in terms of
efficiency.
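As background for the abstract above (an illustration only, not the paper's conversion procedure), the following is a minimal proximal gradient/FISTA sketch in JAX for the plain lasso, where the proximal operator of the nonsmooth term is closed-form soft-thresholding; the function names and problem sizes are assumptions.

```python
import jax.numpy as jnp

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (elementwise soft-thresholding).
    return jnp.sign(v) * jnp.maximum(jnp.abs(v) - tau, 0.0)

def fista_lasso(A, b, lam, n_iter=300):
    """FISTA for the composite problem min_x 0.5*||Ax - b||^2 + lam*||x||_1:
    gradient steps on the smooth part, a closed-form prox on the nonsmooth
    part, and Nesterov momentum for acceleration."""
    L = jnp.linalg.norm(A, ord=2) ** 2        # Lipschitz constant of the gradient
    x = z = jnp.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - b)              # gradient of the smooth term at z
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = 0.5 * (1.0 + jnp.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x
```

In fused lasso, trend filtering, or convex clustering, the nonsmooth penalty composes with a structure matrix, so its proximal operator is no longer elementwise; that composition is where ADMM is typically brought in, and it is the setting the paper's conversion is aimed at.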
Related papers
- Regularized Projection Matrix Approximation with Applications to Community Detection [1.3761665705201904]
This paper introduces a regularized projection matrix approximation framework designed to recover cluster information from the affinity matrix.
We investigate three distinct penalty functions, each specifically tailored to address bounded, positive, and sparse scenarios.
Numerical experiments conducted on both synthetic and real-world datasets reveal that our regularized projection matrix approximation approach significantly outperforms state-of-the-art methods in clustering performance.
arXiv Detail & Related papers (2024-05-26T15:18:22Z)
- Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
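For reference, the Moreau envelope named in this title is the standard smoothing of a function f, defined together with the proximal operator as follows (a textbook definition, not a detail specific to this paper):

```latex
M_{\lambda f}(x) \;=\; \min_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\lVert x - y\rVert_2^2 \Big\},
\qquad
\operatorname{prox}_{\lambda f}(x) \;=\; \operatorname*{arg\,min}_{y}\Big\{ f(y) + \tfrac{1}{2\lambda}\lVert x - y\rVert_2^2 \Big\}.
```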
arXiv Detail & Related papers (2023-08-31T14:16:30Z)
- Neural incomplete factorization: learning preconditioners for the conjugate gradient method [2.899792823251184]
We develop a data-driven approach to accelerate the generation of effective preconditioners.
We replace the typically hand-engineered preconditioners by the output of graph neural networks.
Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF).
arXiv Detail & Related papers (2023-05-25T11:45:46Z)
- Nystrom Method for Accurate and Scalable Implicit Differentiation [25.29277451838466]
We show that the Nystrom method consistently achieves comparable or even superior performance to other approaches.
The proposed method avoids numerical instability and can be efficiently computed in matrix operations without iterations.
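To make the "matrix operations without iterations" point concrete, here is a rough JAX sketch of one standard way a Nystrom low-rank approximation of a positive semidefinite matrix H, combined with the Woodbury identity, can apply (H + lam*I)^{-1} to a vector without an iterative solver; the function name, column-selection scheme, and details are assumptions for illustration, not the paper's exact algorithm.

```python
import jax.numpy as jnp

def nystrom_inverse_vector_product(H, v, idx, lam=1e-3):
    """Approximate (H + lam*I)^{-1} v with a Nystrom approximation of the PSD
    matrix H built from the columns in `idx` (core block assumed invertible):
    H ~= C W^{-1} C^T with C = H[:, idx] and W = H[idx][:, idx]."""
    C = H[:, idx]                       # n x k column sketch
    W = H[idx][:, idx]                  # k x k core block
    # Woodbury identity: (lam*I + C W^{-1} C^T)^{-1} v
    #   = (v - C (lam*W + C^T C)^{-1} C^T v) / lam
    y = jnp.linalg.solve(lam * W + C.T @ C, C.T @ v)
    return (v - C @ y) / lam
```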
arXiv Detail & Related papers (2023-02-20T02:37:26Z)
- Computational Complexity of Sub-linear Convergent Algorithms [0.0]
We explore starting with a small sample, geometrically increasing its size, and using the solution from the previous sample to warm-start the computation of the new ERM.
This solves problems with first-order optimization algorithms of sublinear convergence, but with lower computational complexity.
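A minimal JAX sketch of this growing-sample, warm-started scheme is shown below; the ridge objective, stage counts, and step-size rule are illustrative assumptions rather than the paper's algorithm.

```python
import jax
import jax.numpy as jnp

def ridge_loss(w, X, y, lam=1e-2):
    # Smooth ERM objective used purely for illustration: ridge regression.
    return jnp.mean((X @ w - y) ** 2) + lam * jnp.sum(w ** 2)

def geometric_warm_start(X, y, n0=64, n_stages=5, steps_per_stage=50, lam=1e-2):
    """Start from a small sample, double it geometrically, and warm-start each
    stage's first-order solve from the previous stage's solution."""
    grad_fn = jax.grad(ridge_loss)
    w = jnp.zeros(X.shape[1])
    n = n0
    for _ in range(n_stages):
        Xs, ys = X[:n], y[:n]                              # current sample
        lr = 1.0 / (2.0 * jnp.linalg.norm(Xs, ord=2) ** 2 / n + 2.0 * lam)
        for _ in range(steps_per_stage):
            w = w - lr * grad_fn(w, Xs, ys, lam)           # warm-started GD steps
        n = min(2 * n, X.shape[0])                         # geometric growth
    return w
```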
arXiv Detail & Related papers (2022-09-29T05:38:06Z)
- The Curse of Unrolling: Rate of Differentiating Through Optimization [35.327233435055305]
Unrolled differentiation approximates the solution using an iterative solver and differentiates it through the computational path.
We show that we can either 1) choose a large learning rate leading to a fast convergence but accept that the algorithm may have an arbitrarily long burn-in phase or 2) choose a smaller learning rate leading to an immediate but slower convergence.
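As context for this summary (not code from the paper), a minimal JAX sketch of unrolled differentiation: an inner ridge problem is approximately solved by a fixed number of gradient-descent steps, and jax.grad then differentiates a validation loss through that computational path with respect to the (log) regularization parameter. The toy problem, sizes, and names are assumptions.

```python
import jax
import jax.numpy as jnp

def unrolled_ridge_solution(log_lam, A, b, n_steps=200):
    """Approximate argmin_x ||Ax - b||^2 + exp(log_lam)*||x||^2 by unrolling
    plain gradient descent; every step stays on the autodiff tape."""
    lam = jnp.exp(log_lam)
    lr = 1.0 / (2.0 * jnp.linalg.norm(A, ord=2) ** 2 + 2.0 * lam)  # stable step
    x = jnp.zeros(A.shape[1])
    for _ in range(n_steps):
        x = x - lr * (2.0 * A.T @ (A @ x - b) + 2.0 * lam * x)
    return x

def outer_objective(log_lam, A, b, A_val, b_val):
    # Validation loss of the unrolled (approximate) inner solution.
    x_hat = unrolled_ridge_solution(log_lam, A, b)
    return jnp.mean((A_val @ x_hat - b_val) ** 2)

# jax.grad differentiates through every unrolled solver iteration,
# i.e. through the whole computational path of the inner optimizer.
key = jax.random.PRNGKey(0)
k1, k2, k3, k4 = jax.random.split(key, 4)
A, b = jax.random.normal(k1, (50, 10)), jax.random.normal(k2, (50,))
A_val, b_val = jax.random.normal(k3, (20, 10)), jax.random.normal(k4, (20,))
hypergrad = jax.grad(outer_objective)(0.0, A, b, A_val, b_val)
```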
arXiv Detail & Related papers (2022-09-27T09:27:29Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work is on zeroth-order (ZO) optimization, which does not require first-order information.
We show that, with a graceful design of coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
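As a point of reference for these embeddings (a generic sketch-and-solve illustration with a Gaussian embedding, not the adaptive method of this paper), the following JAX sketch compresses an L2-regularized least-squares problem before solving it; the sizes and names are assumptions.

```python
import jax
import jax.numpy as jnp

def gaussian_sketch_ridge(A, b, lam, k, key):
    """Sketch-and-solve for min_x ||Ax - b||^2 + lam*||x||^2: compress the
    n x d matrix A with a k x n Gaussian embedding S and solve the smaller
    sketched problem via its normal equations."""
    n, d = A.shape
    S = jax.random.normal(key, (k, n)) / jnp.sqrt(k)   # Gaussian embedding
    SA, Sb = S @ A, S @ b
    return jnp.linalg.solve(SA.T @ SA + lam * jnp.eye(d), SA.T @ Sb)

# Example: a 10000 x 100 problem compressed to a 400-row sketch.
key = jax.random.PRNGKey(0)
kA, kb, kS = jax.random.split(key, 3)
A = jax.random.normal(kA, (10000, 100))
b = jax.random.normal(kb, (10000,))
x_hat = gaussian_sketch_ridge(A, b, lam=1.0, k=400, key=kS)
```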
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.