A Penalty-Based Method for Communication-Efficient Decentralized Bilevel
Programming
- URL: http://arxiv.org/abs/2211.04088v3
- Date: Fri, 1 Sep 2023 09:37:06 GMT
- Title: A Penalty-Based Method for Communication-Efficient Decentralized Bilevel
Programming
- Authors: Parvin Nazari, Ahmad Mousavi, Davoud Ataee Tarzanagh, and George
Michailidis
- Abstract summary: Bilevel programming has recently received attention in the literature, due to its wide range of applications.
It is typically assumed that the underlying bilevel optimization problem is solved either on a single machine or across multiple machines connected in a star-shaped network.
This paper introduces a penalty-function-based decentralized algorithm with theoretical guarantees for this class of optimization problems.
- Score: 14.35928967799696
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Bilevel programming has recently received attention in the literature, due to
its wide range of applications, including reinforcement learning and
hyper-parameter optimization. However, it is widely assumed that the
underlying bilevel optimization problem is solved either on a single machine
or across multiple machines connected in a star-shaped network, i.e., the
federated learning setting. The latter approach incurs a high communication
cost at the central node (e.g., the parameter server) and exhibits privacy
vulnerabilities.
Hence, it is of interest to develop methods that solve bilevel optimization
problems in a communication-efficient decentralized manner. To that end, this
paper introduces a penalty-function-based decentralized algorithm with
theoretical guarantees for this class of optimization problems. Specifically,
a distributed alternating gradient-type algorithm for solving consensus
bilevel programming over a decentralized network is developed. A key feature
of the proposed algorithm is that it estimates the hyper-gradient of the
penalty function via decentralized computation of matrix-vector products and
a few vector communications; this estimate is then integrated into an
alternating scheme for which a finite-time convergence analysis is obtained
under different convexity assumptions.
Our theoretical result highlights improvements in the iteration complexity of
decentralized bilevel optimization, all while making efficient use of vector
communication. Empirical results on both synthetic and real datasets
demonstrate that the proposed method performs well in real-world settings.
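The hyper-gradient mechanism described above can be illustrated on a toy consensus problem. The sketch below is an illustration, not the paper's algorithm: the quadratic per-node problems, ring network, mixing weights, and iteration counts are all hypothetical. It shows how an inverse-Hessian-vector product can be approximated with matrix-vector products only, and how a few rounds of vector communication (gossip) average the per-node estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3  # nodes, dimension (hypothetical sizes)

# Per-node lower level: g_i(x, y) = 0.5 * y' H_i y - x' y, so y_i*(x) = H_i^{-1} x.
H = [np.eye(d) * (2.0 + i) for i in range(n)]
# Per-node upper level: f_i(x, y) = 0.5 * ||y - b_i||^2.
b = [rng.standard_normal(d) for i in range(n)]

def inverse_hvp(Hi, g, eta=0.2, K=200):
    """Approximate Hi^{-1} @ g using only matrix-vector products Hi @ v
    (a truncated Richardson iteration); Hi^{-1} is never formed."""
    v = np.zeros_like(g)
    for _ in range(K):
        v -= eta * (Hi @ v - g)
    return v

x = rng.standard_normal(d)
est = []
for i in range(n):
    y_star = inverse_hvp(H[i], x)   # local lower-level solution y_i*(x)
    grad_y_f = y_star - b[i]        # grad_y f_i at y_i*(x)
    # Implicit-function rule for this instance: d f_i(x, y_i*(x)) / dx = H_i^{-1} grad_y_f.
    est.append(inverse_hvp(H[i], grad_y_f))

# A few rounds of vector communication (gossip on a ring) average the estimates.
W = np.zeros((n, n))  # doubly stochastic mixing matrix
for i in range(n):
    W[i, i], W[i, (i + 1) % n], W[i, (i - 1) % n] = 0.5, 0.25, 0.25
V = np.stack(est)
for _ in range(60):
    V = W @ V          # each row converges to the network average
hypergrad = V[0]
```

Each node's estimate costs only Hessian-vector products and neighbor exchanges; no node inverts a matrix or talks to a central server.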
Related papers
- A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints [60.91397373464365]
We develop a BLOCC algorithm to tackle BiLevel Optimization problems with Coupled Constraints.
We demonstrate its effectiveness on two well-known real-world applications.
arXiv Detail & Related papers (2024-06-14T15:59:36Z)
- Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks [57.24087627267086]
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network.
Lower bounds on the number of decentralized communications and (sub)gradient computations required to solve the problem have been established.
We develop the first optimal algorithm that matches these lower bounds and offers substantially improved theoretical performance compared to the existing state of the art.
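As a concrete, deliberately simple instance of this problem class, the sketch below runs the classic decentralized subgradient baseline on a toy non-smooth objective; it is not the cited paper's optimal algorithm, and the local data, ring network, and step-size schedule are hypothetical.

```python
import numpy as np

# min_x sum_i f_i(x) with f_i(x) = |x - a_i| (non-smooth, convex);
# any median of the a_i (here, any point in [2, 3]) is a minimizer.
a = np.array([1.0, 2.0, 3.0, 6.0])  # node-local data
n = len(a)

# Doubly stochastic mixing matrix for a ring graph.
W = np.zeros((n, n))
for i in range(n):
    W[i, i], W[i, (i + 1) % n], W[i, (i - 1) % n] = 0.5, 0.25, 0.25

x = np.zeros(n)  # one local iterate per node
for t in range(1, 5001):
    x = W @ x                                 # one round of neighbor communication
    x -= (1.0 / np.sqrt(t)) * np.sign(x - a)  # local (sub)gradient step

# All nodes approach a common minimizer in [2, 3], up to the step size.
```

Each iteration costs one neighbor exchange and one local subgradient evaluation, which is the communication/computation trade-off the lower bounds in this line of work quantify.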
arXiv Detail & Related papers (2024-05-28T10:28:45Z)
- Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity [38.54552875789929]
We introduce a single-loop decentralized SBO (D-SOBA) algorithm and establish its transient complexity.
D-SOBA achieves the state-of-the-art rate, gradient/Hessian complexity, and transient iteration complexity under more relaxed assumptions.
arXiv Detail & Related papers (2024-02-05T16:35:30Z)
- A Single-Loop Algorithm for Decentralized Bilevel Optimization [11.67135350286933]
We propose a novel single-loop algorithm for solving decentralized bilevel optimization with a strongly convex lower-level problem.
Our approach is a fully single-loop method that approximates the hypergradient using only two matrix-vector multiplications per iteration.
Our analysis demonstrates that the proposed algorithm achieves the best-known convergence rate for bilevel optimization algorithms.
arXiv Detail & Related papers (2023-11-15T13:29:49Z)
- Decentralized Multi-Level Compositional Optimization Algorithms with Level-Independent Convergence Rate [26.676582181833584]
Decentralized multi-level optimization is challenging because of the multilevel structure and decentralized communication.
We develop two novel decentralized optimization algorithms to optimize the multi-level compositional problem.
arXiv Detail & Related papers (2023-06-06T00:23:28Z)
- Communication-Efficient Federated Bilevel Optimization with Local and Global Lower Level Problems [118.00379425831566]
We propose a communication-efficient algorithm, named FedBiOAcc.
We prove that FedBiOAcc-Local converges at the same rate for this type of problem.
Empirical results show superior performance of our algorithms.
arXiv Detail & Related papers (2023-02-13T21:28:53Z)
- INTERACT: Achieving Low Sample and Communication Complexities in Decentralized Bilevel Learning over Networks [24.02913189682224]
Decentralized bilevel optimization problems have received increasing attention in the networking and machine learning communities.
Low sample and communication complexities are two fundamental challenges that remain under-explored.
Our work is the first to achieve both low sample and communication complexities for solving decentralized bilevel optimization problems over networks.
arXiv Detail & Related papers (2022-07-27T04:19:28Z)
- On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and are thus incapable of handling distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
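Gradient tracking, the communication mechanism this entry refers to, keeps at each node a running estimate of the global average gradient alongside the local iterate. The sketch below is a minimal scalar illustration of that generic mechanism, not this paper's bilevel algorithm; the local objectives, ring network, step size, and iteration count are hypothetical.

```python
import numpy as np

# Local objectives f_i(x) = 0.5 * (x - c_i)^2; the global minimizer is mean(c).
c = np.array([0.0, 1.0, 2.0, 7.0])
n = len(c)

# Doubly stochastic mixing matrix for a ring graph.
W = np.zeros((n, n))
for i in range(n):
    W[i, i], W[i, (i + 1) % n], W[i, (i - 1) % n] = 0.5, 0.25, 0.25

grad = lambda x: x - c  # elementwise local gradients
x = np.zeros(n)         # one local iterate per node
y = grad(x)             # trackers start at the local gradients
alpha = 0.1
for _ in range(300):
    x_new = W @ x - alpha * y          # consensus step + tracked-gradient step
    y = W @ y + grad(x_new) - grad(x)  # y_i tracks the network-average gradient
    x = x_new

# Every node converges to mean(c) = 2.5, whereas plain decentralized gradient
# descent with a constant step stalls at a biased point.
```

The tracker update preserves the invariant that the y_i sum to the sum of current local gradients, which is what removes the constant-step-size bias of naive decentralized gradient descent.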
arXiv Detail & Related papers (2022-06-30T05:29:52Z)
- DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization [43.31016937305845]
Internet-of-things, networked sensing, autonomous systems and federated learning call for decentralized algorithms for finite-sum optimization.
We develop DESTRESS (DEcentralized STochastic REcurSive gradient methodS) for nonconvex finite-sum optimization.
Detailed theoretical and numerical comparisons show that DESTRESS improves upon prior decentralized algorithms.
arXiv Detail & Related papers (2021-10-04T03:17:41Z)
- Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel sample-efficient stochastic gradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.