Distributed Stochastic Nonconvex Optimization and Learning based on
Successive Convex Approximation
- URL: http://arxiv.org/abs/2004.14882v2
- Date: Tue, 12 May 2020 08:08:03 GMT
- Title: Distributed Stochastic Nonconvex Optimization and Learning based on
Successive Convex Approximation
- Authors: Paolo Di Lorenzo, Simone Scardapane
- Abstract summary: We introduce a novel framework for the distributed algorithmic minimization of the sum of the sum of the agents in a network.
We show that the proposed method can be applied to distributed neural networks.
- Score: 26.11677569331688
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study distributed stochastic nonconvex optimization in multi-agent
networks. We introduce a novel algorithmic framework for the distributed
minimization of the sum of the expected value of a smooth (possibly nonconvex)
function (the agents' sum-utility) plus a convex (possibly nonsmooth)
regularizer. The proposed method hinges on successive convex approximation
(SCA) techniques, leveraging dynamic consensus as a mechanism to track the
average gradient among the agents, and recursive averaging to recover the
expected gradient of the sum-utility function. Almost sure convergence to
(stationary) solutions of the nonconvex problem is established. Finally, the
method is applied to distributed stochastic training of neural networks.
Numerical results confirm the theoretical claims, and illustrate the advantages
of the proposed method with respect to other methods available in the
literature.
Related papers
- A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the proximal point method (SPPM)
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - Variance Reduction and Low Sample Complexity in Stochastic Optimization
via Proximal Point Method [5.025654873456757]
The paper establishes a low sample complexity to obtain a high probability guarantee on the convergence of the proposed method.
A subroutine is developed to solve the proximal subproblem, which also serves as a novel technique for variance reduction.
arXiv Detail & Related papers (2024-02-14T07:34:22Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Robust empirical risk minimization via Newton's method [9.797319790710711]
A new variant of Newton's method for empirical risk minimization is studied.
The gradient and Hessian of the objective function are replaced by robust estimators.
An algorithm for obtaining robust Newton directions based on the conjugate gradient method is also proposed.
arXiv Detail & Related papers (2023-01-30T18:54:54Z) - A Unified Convergence Theorem for Stochastic Optimization Methods [4.94128206910124]
We provide a fundamental unified convergence theorem used for deriving convergence results for a series of unified optimization methods.
As a direct application, we recover almost sure convergence results under general settings.
arXiv Detail & Related papers (2022-06-08T14:01:42Z) - A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction [2.048226951354646]
We develop an implementable proximal point (SPP) method for a class of weakly convex, composite optimization problems.
The proposed algorithm incorporates a variance reduction mechanism and the resulting updates are solved using an inexact semismooth Newton framework.
arXiv Detail & Related papers (2022-04-01T13:08:49Z) - A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number, and frequency, of required communication rounds compared to existing methods without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - On Stochastic Moving-Average Estimators for Non-Convex Optimization [105.22760323075008]
In this paper, we demonstrate the power of a widely used estimator based on moving average (SEMA) problems.
For all these-the-art results, we also present the results for all these-the-art problems.
arXiv Detail & Related papers (2021-04-30T08:50:24Z) - Mean-Field Approximation to Gaussian-Softmax Integral with Application
to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.