A Distributed Cubic-Regularized Newton Method for Smooth Convex
Optimization over Networks
- URL: http://arxiv.org/abs/2007.03562v1
- Date: Tue, 7 Jul 2020 15:38:47 GMT
- Title: A Distributed Cubic-Regularized Newton Method for Smooth Convex
Optimization over Networks
- Authors: César A. Uribe and Ali Jadbabaie
- Abstract summary: We propose a distributed, cubic-regularized Newton method for large-scale convex optimization over networks.
We show a $O(k^{-3})$ convergence rate when the cost function is convex with Lipschitz gradient and Hessian, with $k$ being the number of iterations.
- Score: 19.767938436958296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a distributed, cubic-regularized Newton method for large-scale
convex optimization over networks. The proposed method requires only local
computations and communications and is suitable for federated learning
applications over arbitrary network topologies. We show a $O(k^{-3})$
convergence rate when the cost function is convex with Lipschitz gradient and
Hessian, with $k$ being the number of iterations. We further provide
network-dependent bounds for the communication required in each step of the
algorithm. We provide numerical experiments that validate our theoretical
results.
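To make the building block concrete, below is a minimal, centralized Python sketch of the cubic-regularized Newton step (the Nesterov-Polyak style subproblem) that the proposed method is built on; it is not the paper's distributed, communication-aware algorithm, and the function names, the regularization constant M, and the toy objective are illustrative assumptions. The subproblem $\min_h g^\top h + \tfrac{1}{2} h^\top H h + \tfrac{M}{6}\|h\|^3$ is solved by a one-dimensional bisection on the step norm, which is valid when the Hessian is positive semidefinite, matching the convex setting considered in the paper.

```python
import numpy as np

def cubic_newton_step(grad, hess, M, tol=1e-10):
    """Solve min_h  g^T h + 0.5 h^T H h + (M/6)||h||^3  for H >= 0.
    The minimizer satisfies h = -(H + (M/2) r I)^{-1} g with r = ||h||,
    so r is found by bisection on phi(r) = ||(H + (M/2) r I)^{-1} g|| - r,
    which is monotonically decreasing in r > 0 when H is PSD."""
    d = grad.shape[0]
    eye = np.eye(d)

    def step_norm(r):
        return np.linalg.norm(np.linalg.solve(hess + 0.5 * M * r * eye, -grad))

    lo, hi = 0.0, 1.0
    while step_norm(hi) > hi:      # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:           # standard bisection
        mid = 0.5 * (lo + hi)
        if step_norm(mid) > mid:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    return np.linalg.solve(hess + 0.5 * M * r * eye, -grad)

def cubic_regularized_newton(grad_f, hess_f, x0, M, iters=50):
    """Centralized cubic-regularized Newton iteration: x_{k+1} = x_k + h_k.
    M should upper-bound the (local) Lipschitz constant of the Hessian."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = x + cubic_newton_step(grad_f(x), hess_f(x), M)
    return x

# Toy example (illustrative only): a smooth convex objective
# f(x) = ||x||^4 + 0.5 x^T A x with A positive semidefinite.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((5, 5))
    A = G @ G.T
    grad_f = lambda x: 4 * np.dot(x, x) * x + A @ x
    hess_f = lambda x: 8 * np.outer(x, x) + 4 * np.dot(x, x) * np.eye(5) + A
    x_star = cubic_regularized_newton(grad_f, hess_f, rng.standard_normal(5), M=100.0)
    print("gradient norm at solution:", np.linalg.norm(grad_f(x_star)))
```

In the distributed setting of the paper, each node would hold only its local gradient and Hessian information and the regularized step would be computed through local communication over the network; the sketch above only illustrates the regularized Newton subproblem itself.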
Related papers
- Verification of Geometric Robustness of Neural Networks via Piecewise Linear Approximation and Lipschitz Optimisation [57.10353686244835]
We address the problem of verifying neural networks against geometric transformations of the input image, including rotation, scaling, shearing, and translation.
The proposed method computes provably sound piecewise linear constraints for the pixel values by using sampling and linear approximations in combination with branch-and-bound Lipschitz optimisation.
We show that our proposed implementation resolves up to 32% more verification cases than existing approaches.
arXiv Detail & Related papers (2024-08-23T15:02:09Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used, but their training objectives are non-convex.
In this paper we examine the use of convex Lasso models of neural network training.
We show that stationary points of the non-convex training objective can be characterized as global optima of subsampled convex programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Network-GIANT: Fully distributed Newton-type optimization via harmonic
Hessian consensus [2.8617826964327113]
We introduce Network-GIANT, a fully distributed Newton-type optimization algorithm based on GIANT.
We prove that our algorithm guarantees semi-global and exponential convergence to the exact solution over the network assuming strongly convex and smooth loss functions.
We provide empirical evidence of the superior convergence performance of Network-GIANT over other state-of-art distributed learning algorithms such as Network-DANE and Newton-Raphson Consensus.
arXiv Detail & Related papers (2023-05-13T11:42:40Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Decentralized Gossip-Based Stochastic Bilevel Optimization over
Communication Networks [42.76623191830371]
We propose a gossip-based distributed bilevel optimization algorithm.
Agents can solve both the inner and outer optimization problems in a single time scale.
Our algorithm achieves state-of-the-art efficiency and test accuracy.
arXiv Detail & Related papers (2022-06-22T06:38:54Z) - Distributed stochastic proximal algorithm with random reshuffling for
non-smooth finite-sum optimization [28.862321453597918]
Non-smooth finite-sum minimization is a fundamental problem in machine learning.
This paper develops a distributed proximal-gradient algorithm with random reshuffling to solve the problem.
arXiv Detail & Related papers (2021-11-06T07:29:55Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - q-RBFNN:A Quantum Calculus-based RBF Neural Network [31.14412266444568]
A gradient descent based learning approach for radial basis function neural networks (RBFNN) is proposed.
The proposed method is based on the q-gradient, which is also known as the Jackson derivative; a brief illustrative sketch of this derivative is given after this list.
The proposed $q$-RBFNN is analyzed for its convergence performance in the context of the least squares algorithm.
arXiv Detail & Related papers (2021-06-02T08:27:12Z) - Convex Optimization with an Interpolation-based Projection and its
Application to Deep Learning [36.19092177858517]
We investigate whether an inexact, but cheaper projection can drive a descent algorithm to an optimum.
Specifically, we propose an interpolation-based inexact projection that is computationally cheap and easy to compute given a convex, domain-defining function.
arXiv Detail & Related papers (2020-11-13T16:52:50Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale problems in which the predictive model is a deep neural network.
Our algorithm requires a much smaller number of communication rounds in theory.
Our experiments on several datasets confirm the theory and demonstrate the effectiveness of the proposed algorithm.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Lipschitz constant estimation of Neural Networks via sparse polynomial
optimization [47.596834444042685]
LiPopt is a framework for computing increasingly tighter upper bounds on the Lipschitz constant of neural networks.
We show how to use the sparse connectivity of a network to significantly reduce the complexity.
We conduct experiments on networks with random weights as well as networks trained on MNIST.
arXiv Detail & Related papers (2020-04-18T18:55:02Z) - Channel Assignment in Uplink Wireless Communication using Machine
Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to high computational complexity, machine learning approaches are employed to obtain computationally efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)
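As referenced in the q-RBFNN entry above, the q-gradient is built on the Jackson (q-)derivative $D_q f(x) = \frac{f(qx) - f(x)}{(q-1)x}$, which recovers the ordinary derivative as $q \to 1$. The sketch below is only a minimal illustration of this standard definition, with an assumed test function and value of $q$; it is not the q-RBFNN training procedure.

```python
def jackson_derivative(f, x, q=0.9):
    """Jackson (q-)derivative: D_q f(x) = (f(q*x) - f(x)) / ((q - 1) * x).
    Defined for x != 0 and q != 1; it approaches f'(x) as q -> 1."""
    if x == 0 or q == 1:
        raise ValueError("x must be nonzero and q must differ from 1")
    return (f(q * x) - f(x)) / ((q - 1) * x)

# Example: for f(x) = x^3 the q-derivative is [3]_q * x^2,
# where [3]_q = (q^3 - 1) / (q - 1) = q^2 + q + 1 is the q-integer.
if __name__ == "__main__":
    f = lambda x: x ** 3
    x, q = 2.0, 0.9
    print(jackson_derivative(f, x, q))       # 10.84 = (q^2 + q + 1) * x^2
    print((q ** 2 + q + 1) * x ** 2)         # matches the closed form
    print(3 * x ** 2)                        # ordinary derivative, recovered as q -> 1
```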
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.