Composite Optimization with Error Feedback: the Dual Averaging Approach
- URL: http://arxiv.org/abs/2510.03507v1
- Date: Fri, 03 Oct 2025 20:42:48 GMT
- Title: Composite Optimization with Error Feedback: the Dual Averaging Approach
- Authors: Yuan Gao, Anton Rodomanov, Jeremy Rack, Sebastian Stich,
- Abstract summary: Error Feedback (EF) methods fail in the broader and practically important setting of composite optimization. We propose a novel method that combines Dual Averaging with EControl (Gao et al., 2024), a state-of-the-art variant of the EF mechanism. We also provide a novel analysis template for the inexact dual averaging method, which might be of independent interest.
- Score: 13.234182544232604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Communication efficiency is a central challenge in distributed machine learning training, and message compression is a widely used solution. However, standard Error Feedback (EF) methods (Seide et al., 2014), though effective for smooth unconstrained optimization with compression (Karimireddy et al., 2019), fail in the broader and practically important setting of composite optimization, which captures, e.g., objectives consisting of a smooth loss combined with a non-smooth regularizer or constraints. The theoretical foundation and behavior of EF in the context of the general composite setting remain largely unexplored. In this work, we consider composite optimization with EF. We point out that the basic EF mechanism and its analysis no longer stand when a composite part is involved. We argue that this is because of a fundamental limitation in the method and its analysis technique. We propose a novel method that combines Dual Averaging with EControl (Gao et al., 2024), a state-of-the-art variant of the EF mechanism, and achieves for the first time a strong convergence analysis for composite optimization with error feedback. Along with our new algorithm, we also provide a novel analysis template for the inexact dual averaging method, which might be of independent interest. We also provide experimental results to complement our theoretical findings.
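As a point of reference, below is a minimal, purely illustrative sketch of the naive combination the abstract alludes to: classical error feedback (Seide et al., 2014) followed by a proximal step on the composite part. It assumes a single worker, a top-k compressor, and an l1 regularizer with a closed-form prox; all names (e.g. `ef_composite_step`) are hypothetical. This is not the DA-EControl method proposed in the paper; it is precisely the kind of direct EF-plus-prox scheme whose analysis the paper argues breaks down, which is what motivates replacing the proximal-gradient template with dual averaging.

```python
# Illustrative sketch only: single worker, top-k compression, l1 regularizer.
# NOT the paper's DA-EControl method; it merely shows the ingredients the
# abstract refers to: a contractive compressor, the classical error-feedback
# buffer (Seide et al., 2014), and a composite step x = prox_{eta*psi}(x - msg)
# for F(x) = f(x) + psi(x) with psi = lam * ||.||_1.
import numpy as np

def top_k(v, k):
    """Contractive compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def soft_threshold(v, tau):
    """Prox of tau * ||.||_1 (the non-smooth composite part)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ef_composite_step(x, e, grad_fn, eta=0.1, lam=0.01, k=5):
    """One classical error-feedback step followed by a proximal update.
    e is the local error buffer; grad_fn returns the smooth-part gradient."""
    p = e + eta * grad_fn(x)              # error-corrected message
    c = top_k(p, k)                       # compress it
    e = p - c                             # remember what the compressor dropped
    x = soft_threshold(x - c, eta * lam)  # proximal step on the regularizer
    return x, e

# Toy usage: least squares plus l1 penalty.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
grad = lambda x: A.T @ (A @ x - b) / len(b)
x, e = np.zeros(20), np.zeros(20)
for _ in range(200):
    x, e = ef_composite_step(x, e, grad, eta=0.05, lam=0.1, k=5)
```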
Related papers
- L-SR1: Learned Symmetric-Rank-One Preconditioning [5.421390145168128]
End-to-end deep learning has achieved impressive results but remains limited by its reliance on large labeled datasets. In contrast, classical optimization methods are data-efficient and lightweight but often suffer from slow convergence. We propose a novel learned second-order optimizer that introduces a trainable preconditioning unit to enhance the classical Symmetric-Rank-One algorithm.
arXiv Detail & Related papers (2025-08-17T07:37:29Z) - MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes [57.24311218570012]
We extend the theory of EF21-P (Anonymous 2024) and MARINA-P (arXiv:2402.06412) to the non-smooth convex setting. We provide theoretical guarantees under constant, decreasing, and adaptive (Polyak-type) stepsizes.
arXiv Detail & Related papers (2024-12-22T16:18:34Z) - Constrained Bayesian Optimization Under Partial Observations: Balanced Improvements and Provable Convergence [6.461785985849886]
We endeavor to design an efficient and provable method for expensive POCOPs under the framework of constrained Bayesian optimization.
We present an improved design of the acquisition functions that introduces balanced exploration during optimization.
We propose a Gaussian process embedding different likelihoods as the surrogate model for a partially observable constraint.
arXiv Detail & Related papers (2023-12-06T01:00:07Z) - Momentum Provably Improves Error Feedback! [54.93799845077906]
When left untreated, the errors caused by compression propagate and can lead to exponentially divergent training behavior.
EF21-SGDM improves the communication and sample complexities of previous error feedback algorithms.
arXiv Detail & Related papers (2023-05-24T13:52:02Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable components separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Optimizing Two-way Partial AUC with an End-to-end Framework [154.47590401735323]
Area Under the ROC Curve (AUC) is a crucial metric for machine learning.
Recent work shows that the Two-way Partial AUC (TPAUC) is essentially inconsistent with the existing Partial AUC metrics.
We present the first trial in this paper to optimize this new metric.
arXiv Detail & Related papers (2022-06-23T12:21:30Z) - Optimizer Amalgamation [124.33523126363728]
We are motivated to study a new problem named Optimizer Amalgamation: how can we best combine a pool of "teacher" optimizers into a single "student" optimizer that can have stronger problem-specific performance?
First, we define three differentiable mechanisms to amalgamate a pool of analytical optimizers by gradient descent.
To reduce the variance of the amalgamation process, we also explore methods to stabilize it by perturbing the target.
arXiv Detail & Related papers (2022-03-12T16:07:57Z) - EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback [65.60461294185853]
We propose a new error feedback mechanism, EF21, based on the construction of a Markov compressor induced by a contractive compressor. Several of these techniques have not been previously analyzed in combination with EF21.
arXiv Detail & Related papers (2021-10-07T09:29:14Z) - EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback [0.0]
Error feedback (EF) is an immensely popular stabilization mechanism in the context of distributed training of supervised machine learning.
We propose and analyze a new EF mechanism, which we call EF21, and which consistently and substantially outperforms the original EF mechanism in practice.
In particular, we prove that EF21 enjoys a fast $O(1/T)$ convergence rate for smooth nonconvex problems; a minimal illustrative sketch of the EF21-style update appears after this list.
arXiv Detail & Related papers (2021-06-09T16:45:53Z)
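For context on the EF21 mechanism referenced in the two entries above, here is a minimal single-node sketch of the EF21-style update, in which the gradient estimate is shifted by a compressed gradient difference, $g_{t+1} = g_t + \mathcal{C}(\nabla f(x_{t+1}) - g_t)$. The top-k compressor, step size, and toy quadratic are assumptions made for illustration; this is not the authors' distributed implementation.

```python
# Minimal single-node sketch of an EF21-style update:
#   x_{t+1} = x_t - gamma * g_t,   g_{t+1} = g_t + C(grad f(x_{t+1}) - g_t).
# The top-k compressor and all parameter choices are illustrative.
import numpy as np

def top_k(v, k):
    """Contractive compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef21_run(grad_fn, x0, gamma=0.1, k=5, iters=200):
    x = x0.copy()
    g = grad_fn(x)                        # initial estimate, sent once uncompressed
    for _ in range(iters):
        x = x - gamma * g
        g = g + top_k(grad_fn(x) - g, k)  # shift by a compressed gradient difference
    return x

# Toy usage on a quadratic with minimizer at the origin.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 10))
grad = lambda x: A.T @ (A @ x) / 30
x_final = ef21_run(grad, x0=np.ones(10), gamma=0.2, k=3)
```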