Stochastic Modified Equations for Continuous Limit of Stochastic ADMM
- URL: http://arxiv.org/abs/2003.03532v1
- Date: Sat, 7 Mar 2020 08:01:50 GMT
- Title: Stochastic Modified Equations for Continuous Limit of Stochastic ADMM
- Authors: Xiang Zhou, Huizhuo Yuan, Chris Junchi Li, Qingyun Sun
- Abstract summary: We put different variants of ADMM into a unified form, which includes standard, linearized and gradient-based ADMM with relaxation, and study their dynamics via a continuous-time model approach.
We show that the dynamics of ADMM is approximated by a class of differential equations with small noise parameters in the sense of weak approximation.
- Score: 13.694172299830315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic version of alternating direction method of multiplier (ADMM) and
its variants (linearized ADMM, gradient-based ADMM) plays a key role for modern
large scale machine learning problems. One example is the regularized empirical
risk minimization problem. In this work, we put different variants of
stochastic ADMM into a unified form, which includes standard, linearized and
gradient-based ADMM with relaxation, and study their dynamics via a
continuous-time model approach. We adapt the mathematical framework of
stochastic modified equation (SME), and show that the dynamics of stochastic
ADMM is approximated by a class of stochastic differential equations with small
noise parameters in the sense of weak approximation. The continuous-time
analysis would uncover important analytical insights into the behaviors of the
discrete-time algorithm, which are non-trivial to gain otherwise. For example,
we could characterize the fluctuation of the solution paths precisely, and
decide optimal stopping time to minimize the variance of solution paths.
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO)
We introduce a novel perspective by turning a given BLO problem into a ii optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - A General Continuous-Time Formulation of Stochastic ADMM and Its Variants [5.269633789700637]
We introduce a unified algorithmic framework called generalized ADMM.
Our continuous-time analysis provides us with new insights into ADMM and variants.
We prove that under some proper scaling, the trajectory of ADMM weakly converges to the solution of a differential equation with small noise.
arXiv Detail & Related papers (2024-04-22T17:12:58Z) - Optimizing ADMM and Over-Relaxed ADMM Parameters for Linear Quadratic
Problems [32.04687753889809]
Alternating Direction Method of Multipliers (ADMM) has gained significant attention across a broad spectrum of machine learning applications.
We propose a general approach to optimize the value of penalty parameter, followed by a novel closed-form formula to compute the optimal relaxation parameter.
We then experimentally validate our parameter selection methods through random instantiations and diverse imaging applications.
arXiv Detail & Related papers (2024-01-01T04:01:40Z) - Robust scalable initialization for Bayesian variational inference with
multi-modal Laplace approximations [0.0]
Variational mixtures with full-covariance structures suffer from a quadratic growth due to variational parameters with the number of parameters.
We propose a method for constructing an initial Gaussian model approximation that can be used to warm-start variational inference.
arXiv Detail & Related papers (2023-07-12T19:30:04Z) - Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient
Methods [73.35353358543507]
Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequalities problems (VIP)
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA)
arXiv Detail & Related papers (2022-02-15T09:17:39Z) - Time varying regression with hidden linear dynamics [74.9914602730208]
We revisit a model for time-varying linear regression that assumes the unknown parameters evolve according to a linear dynamical system.
Counterintuitively, we show that when the underlying dynamics are stable the parameters of this model can be estimated from data by combining just two ordinary least squares estimates.
arXiv Detail & Related papers (2021-12-29T23:37:06Z) - Moment evolution equations and moment matching for stochastic image
EPDiff [68.97335984455059]
Models of image deformation allow study of time-continuous effects transforming images by deforming the image domain.
Applications include medical image analysis with both population trends and random subject specific variation.
We use moment approximations of the corresponding Ito diffusion to construct estimators for statistical inference in the parameters full model.
arXiv Detail & Related papers (2021-10-07T11:08:11Z) - Variational Inference for Continuous-Time Switching Dynamical Systems [29.984955043675157]
We present a model based on an Markov jump process modulating a subordinated diffusion process.
We develop a new continuous-time variational inference algorithm.
We extensively evaluate our algorithm under the model assumption and for real-world examples.
arXiv Detail & Related papers (2021-09-29T15:19:51Z) - A Framework of Inertial Alternating Direction Method of Multipliers for
Non-Convex Non-Smooth Optimization [17.553531291690025]
We propose an algorithmic framework dubbed alternating methods of multipliers (iADMM) for solving a class of non nonsmooth multiblock composite problems.
Our framework employs the general-major surrogateization (MM) principle to update each block of variables to unify the convergence analysis of previous ADMM schemes.
arXiv Detail & Related papers (2021-02-10T13:55:28Z) - The Variational Method of Moments [65.91730154730905]
conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables.
Motivated by a variational minimax reformulation of OWGMM, we define a very general class of estimators for the conditional moment problem.
We provide algorithms for valid statistical inference based on the same kind of variational reformulations.
arXiv Detail & Related papers (2020-12-17T07:21:06Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.