Related papers: Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers

Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers

URL: http://arxiv.org/abs/2510.22539v1
Date: Sun, 26 Oct 2025 05:32:18 GMT
Title: Approximate Gradient Coding for Distributed Learning with Heterogeneous Stragglers
Authors: Heekang Song, Wan Choi,
Abstract summary: We propose an optimally structured gradient coding scheme to mitigate the straggler problem in distributed learning.<n>Our approach significantly reduces the impact of stragglers and accelerates convergence compared to existing methods.
Score: 8.873449722727026
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we propose an optimally structured gradient coding scheme to mitigate the straggler problem in distributed learning. Conventional gradient coding methods often assume homogeneous straggler models or rely on excessive data replication, limiting performance in real-world heterogeneous systems. To address these limitations, we formulate an optimization problem minimizing residual error while ensuring unbiased gradient estimation by explicitly considering individual straggler probabilities. We derive closed-form solutions for optimal encoding and decoding coefficients via Lagrangian duality and convex optimization, and propose data allocation strategies that reduce both redundancy and computation load. We also analyze convergence behavior for $\lambda$-strongly convex and $\mu$-smooth loss functions. Numerical results show that our approach significantly reduces the impact of stragglers and accelerates convergence compared to existing methods.

Related papers

Robust low-rank estimation with multiple binary responses using pairwise AUC loss [0.0]
Multiple binary responses arise in many modern data-analytic problems.<n>Low-rank models offer a natural way to encode latent dependence across tasks.<n>Existing methods for binary data are largely likelihood-based and focus on pointwise classification.
arXiv Detail & Related papers (2026-01-13T15:00:10Z)
Alternating Minimization Schemes for Computing Rate-Distortion-Perception Functions with $f$-Divergence Perception Constraints [9.788112471288057]
We study the computation of the rate-distortion-perception function (RDPF) for discrete memoryless sources.<n>We characterize optimal parametric solutions for the convex programming problem.<n>We show, by deriving necessary and sufficient conditions, that both schemes guarantee a globally optimal solution.
arXiv Detail & Related papers (2024-08-27T12:50:12Z)
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization [55.115992622028685]
Previous work suggests that first-order methods would need to trade-off convergence rate (gradient convergence rate) for better. We demonstrate that both optimal complexity and near-optimal convergence guarantees can be achieved for smooth convex minimization and smooth convex-concave minimax problems.
arXiv Detail & Related papers (2023-10-26T19:56:52Z)
Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. The proposed method, equipped with an average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
Consistent Approximations in Composite Optimization [0.0]
We develop a framework for consistent approximations of optimization problems. The framework is developed for a broad class of optimizations. A programming analysis method illustrates extended nonlinear programming solutions.
arXiv Detail & Related papers (2022-01-13T23:57:08Z)
Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation. Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective. We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information. We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
Sparse Signal Reconstruction for Nonlinear Models via Piecewise Rational Optimization [27.080837460030583]
We propose a method to reconstruct degraded signals by a nonlinear distortion and at a limited sampling rate. Our method formulates as a non exact fitting term and a penalization term. It is shown how to use the problem in terms of the benefits of our simulations.
arXiv Detail & Related papers (2020-10-29T09:05:19Z)
Conditional gradient methods for stochastically constrained convex minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems. The most important feature of our framework is that only a subset of the constraints is processed at each iteration. Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z)
Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem. We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent. Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.