Non-Convex Federated Optimization under Cost-Aware Client Selection
- URL: http://arxiv.org/abs/2512.05327v1
- Date: Fri, 05 Dec 2025 00:10:42 GMT
- Title: Non-Convex Federated Optimization under Cost-Aware Client Selection
- Authors: Xiaowen Jiang, Anton Rodomanov, Sebastian U. Stich
- Abstract summary: We introduce a model of federated optimization that quantifies communication and local computation complexities. We then introduce the Recursive-Gradient technique as a general way to potentially improve the error bound of a gradient estimator. By applying this technique to SAGA, we obtain a new estimator, RG-SAGA, which has an improved error bound compared to the original one.
- Score: 36.2080705403997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Different federated optimization algorithms typically employ distinct client-selection strategies: some methods communicate only with a randomly sampled subset of clients at each round, while others need to periodically communicate with all clients or use a hybrid scheme that combines both strategies. However, existing metrics for comparing optimization methods typically do not distinguish between these strategies, which often incur different communication costs in practice. To address this disparity, we introduce a simple and natural model of federated optimization that quantifies communication and local computation complexities. This new model allows for several commonly used client-selection strategies and explicitly associates each with a distinct cost. Within this setting, we propose a new algorithm that achieves the best-known communication and local complexities among existing federated optimization methods for non-convex optimization. This algorithm is based on the inexact composite gradient method with a carefully constructed gradient estimator and a special procedure for solving the auxiliary subproblem at each iteration. The gradient estimator is based on SAGA, a popular variance-reduced gradient estimator. We first derive a new variance bound for it, showing that SAGA can exploit functional similarity. We then introduce the Recursive-Gradient technique as a general way to potentially improve the error bound of a given conditionally unbiased gradient estimator, including both SAGA and SVRG. By applying this technique to SAGA, we obtain a new estimator, RG-SAGA, which has an improved error bound compared to the original one.
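To make the estimator concrete, here is a minimal NumPy sketch of the classical SAGA estimator the abstract starts from, with components playing the role of per-client losses. The recursive step below it is only a hypothetical SARAH-style illustration of what a "Recursive-Gradient" correction could look like; the abstract does not spell out RG-SAGA's exact construction.

```python
import numpy as np

def saga_step(grad_fn, x, memory, rng):
    """One draw of the classical SAGA estimator: conditionally unbiased
    given the memory table, with variance driven by how stale it is.

    grad_fn(i, x): gradient of the i-th component function at x.
    memory:        (n, d) table of the last stored gradient per component.
    """
    j = rng.integers(memory.shape[0])               # sample one component
    g_j = grad_fn(j, x)
    estimate = g_j - memory[j] + memory.mean(axis=0)
    memory[j] = g_j                                 # refresh one table row
    return estimate

def recursive_correction(grad_fn, x, x_prev, v_prev, n, rng):
    """HYPOTHETICAL SARAH-style recursion: reuse the previous estimate
    v_prev and correct it along one sampled component only. Treat this
    purely as an illustration of a recursive-gradient update, not as
    the paper's RG-SAGA."""
    j = rng.integers(n)
    return v_prev + grad_fn(j, x) - grad_fn(j, x_prev)
```

Conditional unbiasedness (given the memory table) is exactly the property the abstract requires of estimators the Recursive-Gradient technique applies to, which is why both SAGA and SVRG qualify.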
Related papers
- Closing the Generalization Gap in Parameter-efficient Federated Edge Learning [43.00634399799955]
Federated edge learning (FEEL) provides a promising foundation for artificial intelligence (AI). However, limited and heterogeneous local datasets, as well as resource-constrained deployment, severely degrade both model generalization and resource utilization. We propose a framework that jointly leverages model minimization and generalization selection to tackle these challenges.
arXiv Detail & Related papers (2025-11-28T15:34:09Z)
- Primal-dual algorithm for contextual stochastic combinatorial optimization [1.4999444543328293]
This paper introduces a novel approach to contextual optimization, integrating operations research and machine learning to address decision-making under uncertainty. Our goal is to minimize the empirical risk, which is estimated from past data on uncertain parameters and contexts.
arXiv Detail & Related papers (2025-05-07T19:37:12Z)
- Generalization Bounds of Surrogate Policies for Combinatorial Optimization Problems [53.03951222945921]
We analyze smoothed (perturbed) policies, adding controlled random perturbations to the direction used by the linear oracle. Our main contribution is a generalization bound that decomposes the excess risk into perturbation bias, statistical estimation error, and optimization error. We illustrate the scope of the results on applications such as vehicle scheduling, highlighting how smoothing enables both tractable training and controlled generalization.
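A rough illustration of the smoothing step, assuming a Gaussian perturbation and a generic linear oracle (neither of which is specified in this summary):

```python
import numpy as np

def smoothed_decision(theta, linear_oracle, eps=0.1, n_samples=32, rng=None):
    """Monte-Carlo sketch of a smoothed (perturbed) policy: average the
    linear oracle's solutions under random perturbations of the score
    vector theta. The oracle and the Gaussian noise are illustrative
    assumptions, not the paper's exact setup."""
    rng = np.random.default_rng() if rng is None else rng
    solutions = [linear_oracle(theta + eps * rng.standard_normal(theta.shape))
                 for _ in range(n_samples)]
    return np.mean(solutions, axis=0)      # smoothed (fractional) decision

# Example oracle: select the top-3 coordinates (a toy combinatorial set).
top3 = lambda s: (s >= np.sort(s)[-3]).astype(float)
```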
arXiv Detail & Related papers (2024-07-24T12:00:30Z)
- Federated Zeroth-Order Optimization using Trajectory-Informed Surrogate Gradients [31.674600866528788]
We introduce the trajectory-informed surrogate gradients (FZooS) algorithm for query- and communication-efficient federated zeroth-order optimization (ZOO).
Our FZooS achieves theoretical improvements over existing approaches, which are supported by real-world experiments such as federated black-box adversarial attack and federated non-differentiable metric optimization.
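A minimal sketch of the surrogate idea, assuming a locally weighted linear model fit to past queries (FZooS's actual surrogate is more elaborate):

```python
import numpy as np

def surrogate_gradient(history_x, history_f, x, bandwidth=1.0):
    """Trajectory-informed surrogate gradient (sketch): fit a local
    linear model a + <g, x> to PAST queries via weighted least squares
    and return g, so the gradient estimate costs no new function
    queries. Names and the Gaussian weighting are illustrative."""
    X = np.asarray(history_x, dtype=float)       # (m, d) queried points
    f = np.asarray(history_f, dtype=float)       # (m,)  observed values
    w = np.sqrt(np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * bandwidth**2)))
    A = np.hstack([np.ones((len(X), 1)), X]) * w[:, None]
    coef, *_ = np.linalg.lstsq(A, f * w, rcond=None)
    return coef[1:]                              # slope ~ gradient near x
```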
arXiv Detail & Related papers (2023-08-08T06:26:54Z)
- GradSkip: Communication-Accelerated Local Gradient Methods with Better Computational Complexity [54.585248253601314]
We study a class of distributed optimization algorithms that aim to alleviate high communication costs by allowing clients to perform multiple local gradient-type training steps before communication. In particular, we prove that our modified method, GradSkip, converges linearly under the same assumptions and has the same accelerated communication complexity.
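The skipping pattern can be sketched as follows; the probabilistic communication is in the spirit of ProxSkip, which GradSkip builds on, and the control variates that make it provably work are omitted:

```python
import numpy as np

def maybe_average(client_models, p_comm, rng):
    """Sketch of communication skipping: clients run local gradient
    steps every iteration, but an averaging round fires only with
    probability p_comm. Purely illustrative of the pattern, not the
    GradSkip algorithm itself."""
    if rng.random() < p_comm:
        avg = np.mean(client_models, axis=0)
        return [avg.copy() for _ in client_models]   # synchronize clients
    return client_models                              # skip communication
```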
arXiv Detail & Related papers (2022-10-28T20:59:06Z)
- On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem.
We show that the proposed algorithm converges linearly, with a better rate of convergence than traditional and adaptive gradient-descent methods.
We demonstrate our algorithm's superior performance compared to prominent distributed algorithms on real-world logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z)
- FedChain: Chained Algorithms for Near-Optimal Communication Cost in Federated Learning [24.812767482563878]
Federated learning (FL) aims to minimize the communication complexity of training a model over heterogeneous data distributed across many clients.
We propose FedChain, an algorithmic framework that combines the strengths of local methods and global methods to achieve fast convergence in terms of the number of communication rounds R.
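The chaining itself is simple to sketch; the phase callables below are placeholders, not FedChain's actual API:

```python
def chained_solver(x0, local_phase, global_phase, r_local, r_global):
    """Sketch of the chaining idea: a local method (e.g. FedAvg-style
    rounds) makes fast progress while far from a solution, then a
    global method (e.g. full-participation SGD rounds) takes over for
    sharper convergence."""
    x = x0
    for _ in range(r_local):
        x = local_phase(x)      # phase 1: cheap local rounds
    for _ in range(r_global):
        x = global_phase(x)     # phase 2: global correction rounds
    return x
```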
arXiv Detail & Related papers (2021-08-16T02:57:06Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that, with a careful design of coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
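A minimal sketch of a ZO estimator with coordinate importance sampling (the paper's design is more refined): sample a coordinate with non-uniform probability, finite-difference along it, and reweight so the estimate stays unbiased up to the finite-difference smoothing error.

```python
import numpy as np

def zo_coordinate_grad(f, x, probs, mu=1e-4, rng=None):
    """Pick coordinate i with probability probs[i], take a central
    finite difference along e_i, and divide by probs[i]: in expectation
    this recovers the full gradient (up to O(mu^2) smoothing bias)."""
    rng = np.random.default_rng() if rng is None else rng
    i = rng.choice(len(x), p=probs)
    e = np.zeros_like(x)
    e[i] = 1.0
    deriv = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)
    g = np.zeros_like(x)
    g[i] = deriv / probs[i]        # importance-sampling correction
    return g
```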
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Divide and Learn: A Divide and Conquer Approach for Predict+Optimize [50.03608569227359]
The predict+optimize problem combines machine learning of problem coefficients with an optimization problem that uses the predicted coefficients.
We show how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function.
We propose a novel divide-and-conquer algorithm to tackle optimization problems without this restriction and predict their coefficients using the optimization loss.
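On a finite candidate set the loss structure is easy to see; the sketch below uses illustrative names and a toy regret loss, not the paper's formulation:

```python
import numpy as np

def decision_regret(pred_coef, true_coef, candidates):
    """Predict+optimize loss over a finite candidate set (sketch): the
    decision maximizes the PREDICTED objective, and the loss is its
    regret under the TRUE coefficients. As pred_coef varies, the argmax
    switches at finitely many breakpoints, which is what gives the loss
    its piecewise structure in the prediction."""
    C = np.asarray(candidates, dtype=float)      # (k, d) feasible solutions
    chosen = C[np.argmax(C @ pred_coef)]         # decision from prediction
    best = C[np.argmax(C @ true_coef)]           # best decision in hindsight
    return float(best @ true_coef - chosen @ true_coef)  # nonnegative regret
```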
arXiv Detail & Related papers (2020-12-04T00:26:56Z)
- Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel sample-efficient stochastic gradient estimator named stocBiO.
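For context, the exact implicit-differentiation hypergradient that stochastic estimators like stocBiO approximate can be sketched as follows (dense version, purely to show the structure):

```python
import numpy as np

def hypergradient(grad_f_x, grad_f_y, hess_g_yy, hess_g_xy, x, y):
    """Hypergradient of min_x f(x, y*(x)) with y*(x) = argmin_y g(x, y):
        df/dx = f_x - g_xy @ (g_yy)^{-1} @ f_y,
    where g_xy is the (d_x, d_y) mixed-partial matrix of g. Stochastic
    bilevel methods replace the Hessian-inverse-vector product with a
    cheap sampled estimate; this exact version is only illustrative."""
    v = np.linalg.solve(hess_g_yy(x, y), grad_f_y(x, y))  # (g_yy)^{-1} f_y
    return grad_f_x(x, y) - hess_g_xy(x, y) @ v
```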
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
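A minimal sketch of one CTA round in the FedAvg style; the list of per-client gradient callables is an illustrative API choice, not FedPD's interface:

```python
import numpy as np

def cta_round(x, client_grads, lr=0.1, local_steps=5):
    """One 'computation then aggregation' (CTA) round: each client
    takes several local gradient steps from the shared model, then the
    server averages the results."""
    updated = []
    for grad in client_grads:            # computation: runs on devices
        xi = x.copy()
        for _ in range(local_steps):
            xi -= lr * grad(xi)
        updated.append(xi)
    return np.mean(updated, axis=0)      # aggregation: runs on server
```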
arXiv Detail & Related papers (2020-05-22T23:07:42Z)