Iterative Pre-Conditioning for Expediting the Gradient-Descent Method:
The Distributed Linear Least-Squares Problem
- URL: http://arxiv.org/abs/2008.02856v2
- Date: Fri, 6 Aug 2021 21:42:01 GMT
- Title: Iterative Pre-Conditioning for Expediting the Gradient-Descent Method:
The Distributed Linear Least-Squares Problem
- Authors: Kushal Chakrabarti, Nirupam Gupta, Nikhil Chopra
- Abstract summary: This paper considers the multi-agent linear least-squares problem in a server-agent network.
The goal for the agents is to compute a linear model that optimally fits the collective data points held by all the agents, without sharing their individual local data points.
We propose an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method.
- Score: 0.966840768820136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the multi-agent linear least-squares problem in a
server-agent network. In this problem, the system comprises multiple agents,
each having a set of local data points, that are connected to a server. The
goal for the agents is to compute a linear mathematical model that optimally
fits the collective data points held by all the agents, without sharing their
individual local data points. This goal can be achieved, in principle, using
the server-agent variant of the traditional iterative gradient-descent method.
The gradient-descent method converges linearly to a solution, and its rate of
convergence is lower bounded by the conditioning of the agents' collective data
points. If the data points are ill-conditioned, the gradient-descent method may
require a large number of iterations to converge.
We propose an iterative pre-conditioning technique that mitigates the
deleterious effect of the conditioning of data points on the rate of
convergence of the gradient-descent method. We rigorously show that the
resulting pre-conditioned gradient-descent method, with the proposed iterative
pre-conditioning, achieves superlinear convergence when the least-squares
problem has a unique solution. In general, the convergence is linear with
improved rate of convergence in comparison to the traditional gradient-descent
method and the state-of-the-art accelerated gradient-descent methods. We
further illustrate the improved rate of convergence of our proposed algorithm
through experiments on different real-world least-squares problems in both
noise-free and noisy computation environments.
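For concreteness, the following is a minimal single-machine simulation of the server-agent gradient-descent baseline that the abstract describes; the data partitioning, function names, and step-size rule are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Hypothetical setup: three agents each hold a local block (A_i, b_i) of the
# collective least-squares problem  min_x ||A x - b||^2.
rng = np.random.default_rng(0)
A_blocks = [rng.standard_normal((20, 5)) for _ in range(3)]
b_blocks = [rng.standard_normal(20) for _ in range(3)]

def server_agent_gd(A_blocks, b_blocks, steps=500):
    d = A_blocks[0].shape[1]
    x = np.zeros(d)
    # Step size from lambda_max(A^T A); forming H centrally is a simulation
    # shortcut (the constant factor 2 of the gradient is absorbed into eta).
    H = sum(A.T @ A for A in A_blocks)
    eta = 1.0 / np.linalg.eigvalsh(H)[-1]
    for _ in range(steps):
        # Each agent reports only its local gradient A_i^T (A_i x - b_i),
        # never its raw data points; the server sums them and updates x.
        g = sum(A.T @ (A @ x - b) for A, b in zip(A_blocks, b_blocks))
        x = x - eta * g
    return x

x_gd = server_agent_gd(A_blocks, b_blocks)
```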
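And a sketch of the iterative pre-conditioning idea itself: alongside the estimate x, the server maintains a pre-conditioner matrix K that is refined by its own gradient-type iteration toward the inverse of the regularized Hessian, and the aggregated gradient is pre-multiplied by K. The parameter choices and the distributed, column-wise computation of K in the paper differ, so treat this as a sketch of the scheme, not the authors' algorithm.

```python
import numpy as np

def ipg_sketch(A_blocks, b_blocks, steps=500, beta=1e-3):
    d = A_blocks[0].shape[1]
    x = np.zeros(d)
    K = np.zeros((d, d))                # pre-conditioner, refined on the fly
    H = sum(A.T @ A for A in A_blocks)  # formed centrally only for brevity;
                                        # the paper updates K column-wise
                                        # from the agents' messages
    I = np.eye(d)
    alpha = 1.0 / (np.linalg.eigvalsh(H)[-1] + beta)
    for _ in range(steps):
        # Gradient-type iteration driving K toward (H + beta*I)^{-1} ...
        K = K - alpha * ((H + beta * I) @ K - I)
        # ... so the pre-conditioned update approaches a Newton step.
        g = sum(A.T @ (A @ x - b) for A, b in zip(A_blocks, b_blocks))
        x = x - K @ g
    return x
```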
Related papers
- Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates [50.36933471975506]
We consider the finite-sum optimization problem, where each component function is strongly convex and has Lipschitz continuous gradient and Hessian.
The recently proposed incremental quasi-Newton method is based on the BFGS update and achieves a local superlinear convergence rate.
This paper proposes a more efficient quasi-Newton method by incorporating the symmetric rank-1 update into the incremental framework.
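For reference, the symmetric rank-1 (SR1) secant update named here has a standard closed form; the sketch below shows that single update with the usual safeguard, not the paper's incremental scheme.

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """One symmetric rank-1 (SR1) update of the Hessian estimate B.

    s = x_new - x_old, y = grad(x_new) - grad(x_old); the rank-1 correction
    enforces the secant condition B_new @ s = y.
    """
    r = y - B @ s
    denom = r @ s
    # Standard safeguard: skip the update when the denominator is tiny.
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B
    return B + np.outer(r, r) / denom
```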
arXiv Detail & Related papers (2024-02-04T05:54:51Z)
- Over-the-Air Computation Aided Federated Learning with the Aggregation of Normalized Gradient [12.692064367193934]
Over-the-air computation is a communication-efficient solution for federated learning (FL).
In such a system, every mobile device iteratively updates the gradient of its private loss function and then amplifies and transmits it.
To circumvent the problems this amplification causes, we propose to normalize each local gradient before amplifying it.
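A toy sketch of the normalization step this summary refers to, under assumed names; the over-the-air channel is modeled as a noisy sum, which is only a crude stand-in for the paper's system model.

```python
import numpy as np

def over_the_air_round(local_grads, noise_std=0.1, rng=None):
    """One aggregation round with normalized-gradient transmission (sketch)."""
    rng = rng or np.random.default_rng(0)
    # Each device normalizes its local gradient before amplification, so a
    # single device with a very large gradient cannot dominate the common
    # power budget or drown out the others.
    sent = [g / (np.linalg.norm(g) + 1e-12) for g in local_grads]
    # Over-the-air computation: the multiple-access channel sums the analog
    # signals; the server observes the superposition plus receiver noise.
    received = sum(sent) + noise_std * rng.standard_normal(sent[0].shape)
    return received / len(local_grads)  # server-side aggregate estimate
```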
arXiv Detail & Related papers (2023-08-17T16:15:47Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
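A hedged sketch of the mechanism: a SAG-style table of stale per-component gradients is refreshed one sample per iteration, and its running average drives the Frank-Wolfe (conditional gradient) step. The feasible set (an l1 ball) and all names below are illustrative assumptions.

```python
import numpy as np

def sag_cgm(grads, d, radius=1.0, steps=1000, rng=None):
    """Conditional gradient with a one-sample SAG estimator (illustrative).

    grads: list of callables, grads[i](x) = gradient of the i-th component.
    Feasible set: the l1 ball of the given radius.
    """
    rng = rng or np.random.default_rng(0)
    n = len(grads)
    x = np.zeros(d)
    table = np.zeros((n, d))           # last seen gradient of each component
    avg = np.zeros(d)                  # running average of the table
    for t in range(1, steps + 1):
        i = rng.integers(n)            # one sample per iteration
        g_new = grads[i](x)
        avg += (g_new - table[i]) / n  # update the SAG average in O(d)
        table[i] = g_new
        # Linear minimization oracle for the l1 ball: a signed vertex.
        j = int(np.argmax(np.abs(avg)))
        v = np.zeros(d)
        v[j] = -radius * np.sign(avg[j])
        x += (2.0 / (t + 2)) * (v - x)  # classic Frank-Wolfe step size
    return x
```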
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- On Asymptotic Linear Convergence of Projected Gradient Descent for Constrained Least Squares [22.851500417035947]
This manuscript presents a unified framework for the analysis of projected gradient descent (PGD) in the context of constrained least squares.
We present a recipe for the convergence analysis of PGD and demonstrate it via a beginning-to-end application of the recipe on four fundamental problems.
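For reference, the PGD iteration analyzed in such frameworks alternates one gradient step on ||Ax - b||^2 with a Euclidean projection onto the constraint set; the sketch below uses a nonnegativity constraint as a simple stand-in for the paper's four problems.

```python
import numpy as np

def pgd_nnls(A, b, steps=500):
    """Projected gradient descent for min ||Ax - b||^2 s.t. x >= 0."""
    x = np.zeros(A.shape[1])
    eta = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / lambda_max(A^T A)
    for _ in range(steps):
        x = x - eta * (A.T @ (A @ x - b))   # gradient step (factor 2 absorbed)
        x = np.maximum(x, 0.0)              # Euclidean projection onto x >= 0
    return x
```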
arXiv Detail & Related papers (2021-12-22T09:49:51Z)
- On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem.
We show that the proposed algorithm converges linearly, with an improved rate of convergence compared to the traditional and adaptive gradient-descent methods.
We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We analyze the stochastic ExtraGradient (SEG) method with a constant step size and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
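A sketch of the ingredients named here on a toy bilinear game: stochastic extragradient steps, a running iterate average, and scheduled restarts that reset the iterate to that average. The noise model and restart schedule are illustrative assumptions, not the paper's.

```python
import numpy as np

def seg_restarted_avg(A, steps=2000, eta=0.1, restart_every=200,
                      noise_std=0.01, rng=None):
    """Stochastic extragradient for the bilinear game min_x max_y x^T A y."""
    rng = rng or np.random.default_rng(0)
    n, m = A.shape
    x, y = np.ones(n), np.ones(m)
    x_avg, y_avg, count = np.zeros(n), np.zeros(m), 0
    noise = lambda d: noise_std * rng.standard_normal(d)
    for t in range(1, steps + 1):
        # Extrapolation (lookahead) step with stochastic gradients ...
        xe = x - eta * (A @ y + noise(n))
        ye = y + eta * (A.T @ x + noise(m))
        # ... then the update step, evaluated at the extrapolated point.
        x = x - eta * (A @ ye + noise(n))
        y = y + eta * (A.T @ xe + noise(m))
        x_avg += x
        y_avg += y
        count += 1
        if t % restart_every == 0:
            # Scheduled restart: jump to the running average and reset it.
            x, y = x_avg / count, y_avg / count
            x_avg, y_avg, count = np.zeros(n), np.zeros(m), 0
    return x, y
```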
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- Accelerating Distributed SGD for Linear Regression using Iterative Pre-Conditioning [0.966840768820136]
The Iteratively Pre-conditioned Stochastic Gradient-descent (IPSG) method is shown to converge faster than other existing distributed algorithms.
The IPSG method's convergence rate compares favorably with those of prominent algorithms for solving the linear least-squares problem in server-based networks.
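Relative to the iterative pre-conditioning sketch given after the abstract above, the stochastic variant summarized here replaces the full aggregated gradient with one computed from sampled local data points; a minimal, hypothetical helper in that spirit:

```python
import numpy as np

def sampled_gradient(A_blocks, b_blocks, x, batch=8, rng=None):
    """Mini-batch surrogate for the full gradient sum in the IPG sketch."""
    rng = rng or np.random.default_rng(0)
    g = 0.0
    for A, b in zip(A_blocks, b_blocks):
        idx = rng.choice(A.shape[0], size=min(batch, A.shape[0]),
                         replace=False)
        # Rescale so each agent's mini-batch term is unbiased for
        # A^T (A x - b) over its full local data.
        g = g + (A.shape[0] / len(idx)) * (A[idx].T @ (A[idx] @ x - b[idx]))
    return g
```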
arXiv Detail & Related papers (2020-11-15T18:09:13Z)
- Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z)
- Conditional gradient methods for stochastically constrained convex minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
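To make "only a subset of the constraints is processed at each iteration" concrete, here is an illustrative construction (not the authors' method): sampled linear constraints enter through a smoothed penalty whose gradient feeds a conditional gradient step over a simple set.

```python
import numpy as np

def cgm_sampled_constraints(c, A, b, radius=10.0, rho=50.0, mu=0.1,
                            steps=2000, batch=5, rng=None):
    """Minimize c @ x over the l2 ball, with A x <= b handled by a smoothed
    penalty on a random batch of constraints per iteration (illustrative)."""
    rng = rng or np.random.default_rng(0)
    x = np.zeros(A.shape[1])
    for t in range(1, steps + 1):
        idx = rng.choice(A.shape[0], size=batch, replace=False)
        # Gradient of a smoothed (Huber-like) hinge penalty on the sampled
        # constraints: weights clip(violation / mu, 0, 1).
        viol = A[idx] @ x - b[idx]
        w = np.clip(viol / mu, 0.0, 1.0)
        g = c + rho * (A[idx].T @ w) / batch
        v = -radius * g / (np.linalg.norm(g) + 1e-12)  # LMO for the l2 ball
        x += (2.0 / (t + 2)) * (v - x)
    return x
```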
arXiv Detail & Related papers (2020-07-07T21:26:35Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve for one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
We apply our algorithm to problems in which one of the variables is subject to a sparsity constraint.
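A toy rendering of the coupled update described here, for a bilinear least-squares model b ≈ A(u * v) with a sparsity constraint on one variable; CoGD's actual coupling rule follows the paper, so this only conveys the flavor.

```python
import numpy as np

def cogd_sketch(A, b, k=5, steps=500, eta=1e-2, rng=None):
    """Synchronous (coupled) gradient steps on min_{u,v} ||A(u*v) - b||^2,
    keeping u k-sparse (an illustrative stand-in; eta is untuned)."""
    rng = rng or np.random.default_rng(0)
    d = A.shape[1]
    u, v = rng.standard_normal(d), rng.standard_normal(d)
    for _ in range(steps):
        r = A @ (u * v) - b                  # shared residual couples u and v
        gu, gv = (A.T @ r) * v, (A.T @ r) * u
        u, v = u - eta * gu, v - eta * gv    # synchronous update of both
        # Hard-threshold u to its k largest entries (sparsity constraint).
        small = np.argsort(np.abs(u))[:-k]
        u[small] = 0.0
    return u, v
```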
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
- Iterative Pre-Conditioning to Expedite the Gradient-Descent Method [0.966840768820136]
This paper considers the problem of multi-agent distributed optimization.
We propose an iterative pre-conditioning approach that can significantly attenuate the influence of the problem's conditioning on the convergence speed of the gradient-descent method.
arXiv Detail & Related papers (2020-03-13T16:30:01Z)