Faster Stochastic Alternating Direction Method of Multipliers for
Nonconvex Optimization
- URL: http://arxiv.org/abs/2008.01296v3
- Date: Mon, 10 Aug 2020 03:20:17 GMT
- Title: Faster Stochastic Alternating Direction Method of Multipliers for
Nonconvex Optimization
- Authors: Feihu Huang, Songcan Chen, Heng Huang
- Abstract summary: In this paper, we propose a faster stochastic alternating direction method of multipliers (ADMM) for nonconvex optimization by using a new stochastic path-integrated differential estimator (SPIDER), called SPIDER-ADMM.
We prove that SPIDER-ADMM achieves a record-breaking incremental first-order oracle (IFO) complexity for finding an $\epsilon$-approximate stationary point.
Our theoretical analysis shows that the online SPIDER-ADMM has an IFO complexity of $\mathcal{O}(\epsilon^{-3/2})$, improving the existing best results by a factor of $\mathcal{O}(\epsilon^{-1/2})$.
- Score: 110.52708815647613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a faster stochastic alternating direction method of
multipliers (ADMM) for nonconvex optimization by using a new stochastic
path-integrated differential estimator (SPIDER), called SPIDER-ADMM.
Moreover, we prove that the SPIDER-ADMM achieves a record-breaking incremental
first-order oracle (IFO) complexity of $\mathcal{O}(n+n^{1/2}\epsilon^{-1})$
for finding an $\epsilon$-approximate stationary point, which improves the
deterministic ADMM by a factor $\mathcal{O}(n^{1/2})$, where $n$ denotes the
sample size. As one of the major contributions of this paper, we provide a new
theoretical analysis framework for nonconvex stochastic ADMM methods that
yields the optimal IFO complexity. Based on this new analysis framework, we
study the previously open question of the optimal IFO complexity of the
existing nonconvex SVRG-ADMM and SAGA-ADMM methods, and prove that they achieve the optimal IFO complexity of
$\mathcal{O}(n+n^{2/3}\epsilon^{-1})$. Thus, the SPIDER-ADMM improves the
existing stochastic ADMM methods by a factor of $\mathcal{O}(n^{1/6})$.
Moreover, we extend SPIDER-ADMM to the online setting, and propose a faster
online SPIDER-ADMM. Our theoretical analysis shows that the online SPIDER-ADMM
has the IFO complexity of $\mathcal{O}(\epsilon^{-\frac{3}{2}})$, which
improves the existing best results by a factor of
$\mathcal{O}(\epsilon^{-\frac{1}{2}})$. Finally, the experimental results on
benchmark datasets validate that the proposed algorithms have a faster
convergence rate than the existing ADMM algorithms for nonconvex optimization.
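The core device in SPIDER-ADMM is the SPIDER estimator, which tracks the gradient recursively on small minibatches and refreshes it with a full pass every $q$ iterations, which is what lowers the IFO complexity from the $\mathcal{O}(n+n^{2/3}\epsilon^{-1})$ of SVRG/SAGA-style estimators to $\mathcal{O}(n+n^{1/2}\epsilon^{-1})$. Below is a minimal NumPy sketch of the estimator alone; the names (`spider_step`, `grad_fn`, `q`, `b`) and the plain gradient update in the toy loop are illustrative assumptions, since the paper applies the estimator inside the ADMM primal step rather than bare gradient descent.

```python
import numpy as np

def spider_step(grad_fn, n, x, x_prev, v_prev, t, q, b, rng):
    """One step of the SPIDER variance-reduced gradient estimator.

    grad_fn(x, idx) -> average gradient of the components f_i, i in idx,
    at x. Every q-th step recomputes the full gradient; in between, the
    estimator is updated recursively on a minibatch of size b.
    (Illustrative sketch only -- not the authors' implementation.)
    """
    if t % q == 0:
        return grad_fn(x, np.arange(n))      # periodic full-gradient refresh
    idx = rng.integers(0, n, size=b)         # small minibatch
    return grad_fn(x, idx) - grad_fn(x_prev, idx) + v_prev

# Toy usage: least squares, f_i(w) = 0.5 * (a_i @ w - y_i)^2
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
y = rng.normal(size=100)
grad_fn = lambda w, idx: A[idx].T @ (A[idx] @ w - y[idx]) / len(idx)

x = np.zeros(5)
x_prev = x.copy()
v = grad_fn(x, np.arange(100))               # t = 0: full gradient
for t in range(1, 50):
    v = spider_step(grad_fn, 100, x, x_prev, v, t, q=10, b=8, rng=rng)
    x_prev, x = x, x - 0.1 * v               # plain gradient step, for illustration
```

The periodic full-gradient refresh bounds the accumulated variance of the recursive updates, which is what the IFO analysis exploits.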
Related papers
- Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $\mathcal{O}(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Convergent ADMM Framework for Efficient Neural Network Training [17.764095204676973]
Alternating Direction Method of Multipliers (ADMM) has achieved tremendous success in many classification and regression applications.
We propose dlADMM, a novel framework that trains general neural networks via ADMM with a convergence guarantee.
Experiments on seven benchmark datasets demonstrate the convergence, efficiency, and effectiveness of our proposed dlADMM algorithm.
arXiv Detail & Related papers (2021-12-22T01:55:24Z) - BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful convergence analysis framework for both the constrained and unconstrained bilevel optimization settings.
arXiv Detail & Related papers (2021-06-21T20:16:40Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used stochastic estimator based on moving averages (SEMA) for non-convex optimization.
Using this SEMA-based analysis, we recover state-of-the-art results for a range of non-convex optimization problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Geom-SPIDER-EM: Faster Variance Reduced Stochastic Expectation
Maximization for Nonconvex Finite-Sum Optimization [21.81837334970773]
We propose an extension of the Stochastic Path-Integrated Differential EstimatoR (SPIDER) to the Expectation Maximization (EM) algorithm.
We show it supports the same state-of-the-art complexity bounds as SPIDER-EM, and our results further establish a geometric rate of convergence.
arXiv Detail & Related papers (2020-11-24T21:20:53Z) - Differentially Private ADMM Algorithms for Machine Learning [38.648113004535155]
We study efficient differentially private alternating direction methods of multipliers (ADMM) via gradient perturbation.
We propose the first differentially private ADMM (DP-ADMM) algorithm with a performance guarantee of $(\epsilon,\delta)$-differential privacy.
arXiv Detail & Related papers (2020-10-31T01:37:24Z) - Convergence of Meta-Learning with Task-Specific Adaptation over Partial
Parameters [152.03852111442114]
Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational complexity.
Our paper shows that such complexity can significantly affect the overall convergence performance of ANIL (Almost No Inner Loop), a variant that adapts only a subset of the model parameters.
arXiv Detail & Related papers (2020-06-16T19:57:48Z) - Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query
Complexity [109.54166127479093]
Zeroth-order (a.k.a, derivative-free) methods are a class of effective optimization methods for solving machine learning problems.
In this paper, we propose a class of faster zeroth-order stochastic alternating direction method of multipliers (ADMM) methods to solve nonconvex finite-sum problems.
We show that these methods achieve a lower function query complexity than existing zeroth-order ADMM methods for finding an $\epsilon$-stationary point.
At the same time, we propose a class of faster zeroth-order online ADMM methods.
arXiv Detail & Related papers (2019-07-30T02:21:43Z)
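For context on the zeroth-order setting above: the gradient is replaced by a finite-difference estimate built purely from function evaluations. Below is a minimal NumPy sketch of the standard two-point Gaussian-smoothing estimator; the function `zo_gradient` and smoothing radius `mu` are illustrative assumptions, and the paper's ADMM updates add variance reduction on top of such estimates.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point zeroth-order gradient estimate of f at x.

    Uses a random Gaussian direction u and the finite difference
    (f(x + mu*u) - f(x - mu*u)) / (2*mu); this is the standard
    smoothing estimator, not necessarily the paper's exact one.
    """
    if rng is None:
        rng = np.random.default_rng()
    u = rng.normal(size=x.shape)
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

# Example: f(x) = ||x||^2 has gradient 2x; averaging many estimates
# concentrates around 2x.
x = np.array([1.0, -2.0, 0.5])
samples = [zo_gradient(lambda z: z @ z, x) for _ in range(2000)]
est = np.mean(samples, axis=0)
```

Each estimate costs two function queries, which is why the complexity of zeroth-order methods is measured in function queries rather than IFO calls.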