Related papers: Iterative Pre-Conditioning to Expedite the Gradient-Descent Method

Iterative Pre-Conditioning to Expedite the Gradient-Descent Method

URL: http://arxiv.org/abs/2003.07180v2
Date: Sun, 29 Mar 2020 04:04:42 GMT
Title: Iterative Pre-Conditioning to Expedite the Gradient-Descent Method
Authors: Kushal Chakrabarti, Nirupam Gupta and Nikhil Chopra
Abstract summary: This paper considers the problem of multi-agent distributed optimization. We propose an iterative pre-conditioning approach that can significantly attenuate the influence of the problem's conditioning on the convergence-speed of the gradient-descent method.
Score: 0.966840768820136
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper considers the problem of multi-agent distributed optimization. In this problem, there are multiple agents in the system, and each agent only knows its local cost function. The objective for the agents is to collectively compute a common minimum of the aggregate of all their local cost functions. In principle, this problem is solvable using a distributed variant of the traditional gradient-descent method, which is an iterative method. However, the speed of convergence of the traditional gradient-descent method is highly influenced by the conditioning of the optimization problem being solved. Specifically, the method requires a large number of iterations to converge to a solution if the optimization problem is ill-conditioned. In this paper, we propose an iterative pre-conditioning approach that can significantly attenuate the influence of the problem's conditioning on the convergence-speed of the gradient-descent method. The proposed pre-conditioning approach can be easily implemented in distributed systems and has minimal computation and communication overhead. For now, we only consider a specific distributed optimization problem wherein the individual local cost functions of the agents are quadratic. Besides the theoretical guarantees, the improved convergence speed of our approach is demonstrated through experiments on a real data-set.

Related papers

Over-the-Air Computation Aided Federated Learning with the Aggregation of Normalized Gradient [12.692064367193934]
Over-the-air computation is a communication-efficient solution for federated learning (FL) In such a system, iterative procedure of private loss function is updated, and then transmitted by every mobile device. To circumvent this problem, we propose to turn local gradient to be normalized one before amplifying it.
arXiv Detail & Related papers (2023-08-17T16:15:47Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem. A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network. The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
The Curse of Unrolling: Rate of Differentiating Through Optimization [35.327233435055305]
Un differentiation approximates the solution using an iterative solver and differentiates it through the computational path. We show that we can either 1) choose a large learning rate leading to a fast convergence but accept that the algorithm may have an arbitrarily long burn-in phase or 2) choose a smaller learning rate leading to an immediate but slower convergence.
arXiv Detail & Related papers (2022-09-27T09:27:29Z)
A framework for bilevel optimization that enables stochastic and global variance reduction algorithms [17.12280360174073]
Bilevel optimization is a problem of minimizing a value function which involves the arg-minimum of another function. We introduce a novel framework, in which the solution of the inner problem, the solution of the linear system, and the main variable evolve at the same time. We demonstrate that SABA, an adaptation of the celebrated SAGA algorithm in our framework, has $O(frac1T)$ convergence rate, and that it achieves linear convergence under Polyak-Lojasciewicz assumption.
arXiv Detail & Related papers (2022-01-31T18:17:25Z)
On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem. We show that the proposed algorithm converges linearly with an improved rate of convergence than the traditional and adaptive gradient-descent methods. We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z)
Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in analysis of optimization algorithms for non- optimization in deep networks and other problems in data, we give an overview of recent results of theoretical optimization algorithms for non- optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z)
Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning ofproblem coefficients with a optimization prob-lem that uses the predicted coefficients. We show how to directlyexpress the loss of the optimization problem in terms of thepredicted coefficients as a piece-wise linear function. We propose a novel divide and algorithm to tackle optimization problems without this restriction and predict itscoefficients using the optimization loss.
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem [0.966840768820136]
This paper considers the multi-agent linear least-squares problem in a server-agent network. The goal for the agents is to compute a linear model that optimally fits the collective data points held by all the agents, without sharing their individual local data points. We propose an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method.
arXiv Detail & Related papers (2020-08-06T20:01:18Z)
A Multi-Agent Primal-Dual Strategy for Composite Optimization over Distributed Features [52.856801164425086]
We study multi-agent sharing optimization problems with the objective function being the sum of smooth local functions plus a convex (possibly non-smooth) coupling function.
arXiv Detail & Related papers (2020-06-15T19:40:24Z)
IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex. Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method. When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.