A Class of Accelerated Fixed-Point-Based Methods with Delayed Inexact Oracles and Its Applications
- URL: http://arxiv.org/abs/2512.13547v1
- Date: Mon, 15 Dec 2025 17:06:22 GMT
- Title: A Class of Accelerated Fixed-Point-Based Methods with Delayed Inexact Oracles and Its Applications
- Authors: Nghia Nguyen-Trung, Quoc Tran-Dinh
- Abstract summary: We develop a fixed-point-based framework using delayed inexact oracles to approximate a fixed point of a nonexpansive operator. Our approach leverages both Nesterov's acceleration technique and the Krasnosel'skii-Mann (KM) iteration.
- Score: 3.6997773420183866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we develop a novel accelerated fixed-point-based framework using delayed inexact oracles to approximate a fixed point of a nonexpansive operator (or equivalently, a root of a co-coercive operator), a central problem in scientific computing. Our approach leverages both Nesterov's acceleration technique and the Krasnosel'skii-Mann (KM) iteration, while accounting for delayed inexact oracles, a key mechanism in asynchronous algorithms. We also introduce a unified approximate error condition for delayed inexact oracles, which covers various practical scenarios. Under mild conditions and appropriate parameter updates, we establish both $\mathcal{O}(1/k^2)$ non-asymptotic and $o(1/k^2)$ asymptotic convergence rates in expectation for the squared norm of the residual. Our rates significantly improve upon the $\mathcal{O}(1/k)$ rates of classical KM-type methods, including their asynchronous variants. We also establish $o(1/k^2)$ almost sure convergence rates and the almost sure convergence of the iterates to a solution of the problem. Within our framework, we instantiate three settings for the underlying operator: (i) a deterministic universal delayed oracle; (ii) a stochastic delayed oracle; and (iii) a finite-sum structure with asynchronous updates. For each case, we specialize our framework to obtain a concrete algorithmic variant for which our convergence results still apply and whose iteration complexity depends linearly on the maximum delay. Finally, we validate our algorithms and theoretical results through two numerical examples: a matrix game and a shallow neural network training problem.
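To make the ingredients above concrete, the sketch below shows a minimal accelerated KM-type fixed-point iteration queried through a delayed, inexact oracle. It is illustrative only: it uses a Halpern-anchored update (a classical scheme attaining $\mathcal{O}(1/k^2)$ rates on the squared residual norm for nonexpansive operators) rather than the paper's specific Nesterov-type update, whose parameter rules are not given in the abstract; the operator, the FIFO delay buffer, and the decaying noise model are assumptions made for illustration.

```python
import numpy as np
from collections import deque

def halpern_km_delayed(T, x0, num_iters=2000, max_delay=5, noise0=1e-3, seed=0):
    """Illustrative sketch only: a Halpern-anchored KM-type iteration for a
    nonexpansive operator T, evaluated through a delayed, inexact oracle
    emulated by a FIFO buffer plus decaying noise. The paper's actual
    Nesterov-type update and parameter rules differ."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    buffer = deque([x0.copy()])  # pending query points (emulated asynchrony)
    for k in range(num_iters):
        # Delayed query point: the oldest pending iterate (delay <= max_delay).
        x_delayed = buffer.popleft() if len(buffer) > max_delay else buffer[0]
        # Inexact oracle: T at a stale point, corrupted by summable noise.
        Tx = T(x_delayed) + (noise0 / (k + 1) ** 2) * rng.standard_normal(x.shape)
        beta = 1.0 / (k + 2)  # anchoring weight, beta_k -> 0
        x = beta * x0 + (1.0 - beta) * Tx
        buffer.append(x.copy())
    return x

# Toy example: T contracts and rotates the plane, so its unique fixed point is 0.
theta = 0.5
A = 0.9 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
x_star = halpern_km_delayed(lambda z: A @ z, np.array([5.0, -3.0]))
print(np.linalg.norm(x_star - A @ x_star))  # residual norm, should be small
```

The bounded FIFO buffer caps the staleness of each oracle query at `max_delay`, mirroring the abstract's claim that each variant's iteration complexity depends linearly on the maximum delay.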
Related papers
- VFOG: Variance-Reduced Fast Optimistic Gradient Methods for a Class of Nonmonotone Generalized Equations [3.6997773420183866]
We develop a novel optimistic gradient-type algorithmic framework, combining both Nesterov's acceleration and variance-reduction techniques. We show that our method achieves $\mathcal{O}(1/k^2)$ convergence rates in expectation on the squared norm of the residual under Lipschitz continuity. We also show that the sequence of iterates of our method almost surely converges to a solution of the underlying problem.
arXiv Detail & Related papers (2025-08-22T20:46:29Z) - Variance-Reduced Fast Operator Splitting Methods for Generalized Equations [8.0153031008486]
We develop two variance-reduced fast operator splitting methods to approximate solutions of a class of generalized equations. Our approach integrates recent advances in accelerated operator splitting and fixed-point methods, co-hypomonotonicity, and variance reduction.
arXiv Detail & Related papers (2025-04-17T16:02:20Z) - A Stochastic Approximation Approach for Efficient Decentralized Optimization on Random Networks [21.66341372216097]
A challenging problem in decentralized optimization is to develop algorithms with fast convergence on random, time-varying topologies under unreliable, bandwidth-constrained communication networks. This paper introduces a novel stochastic approximation approach with a Fully Stochastic Primal Dual Algorithm (FSPDA) framework. Numerical experiments show the benefits of the FSPDA algorithms.
arXiv Detail & Related papers (2024-10-24T14:26:58Z) - Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of stochastic approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z) - Fast Nonlinear Two-Time-Scale Stochastic Approximation: Achieving $O(1/k)$ Finite-Sample Complexity [2.5382095320488665]
This paper develops a new variant of two-time-scale stochastic approximation to find the roots of two coupled nonlinear operators.
Our key idea is to leverage the classic Ruppert-Polyak averaging technique to dynamically estimate the operators through their samples.
The estimated values from these averaging steps are then used in the two-time-scale stochastic approximation updates to find the desired solution.
arXiv Detail & Related papers (2024-01-23T13:44:15Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) methods have proven to have very appealing theoretical properties. We show that TR and ARC methods can simultaneously tolerate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent [63.43247232708004]
Stochastic gradient descent performed in an asynchronous manner plays a crucial role in training large-scale machine learning models. Existing generalization error bounds are rather pessimistic and cannot reveal the correlation between asynchronous delays and generalization. Our theoretical results indicate that asynchronous delays reduce the generalization error of the delayed SGD algorithm.
arXiv Detail & Related papers (2023-08-18T10:00:27Z) - Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients [13.848861021326755]
We propose Mirror Descent Optimal Transport (MDOT) as a novel method for solving discrete optimal transport (OT) problems with high precision. We solve each problem efficiently using a GPU-parallel nonlinear conjugate gradient algorithm (PNCG) that outperforms traditional Sinkhorn iterations under weak regularization.
arXiv Detail & Related papers (2023-07-17T14:09:43Z) - Explicit Second-Order Min-Max Optimization: Practical Algorithms and Complexity Analysis [71.05708939639537]
We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max problems. Our method improves on existing line-search-based min-max optimization methods by shaving off an $O(\log\log(1/\epsilon))$ factor in the required number of Schur decompositions.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning [77.22019100456595]
We study a training algorithm for distributed computation over workers with varying communication frequency.
In this work, we obtain a tighter convergence rate of $\mathcal{O}(\sigma^2\epsilon^{-2} + \tau_{avg}\epsilon^{-1})$.
We also show that the heterogeneity term in the rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z) - Accelerated and instance-optimal policy evaluation with linear function approximation [17.995515643150657]
Existing algorithms fail to match at least one of the known instance-dependent lower bounds.
We develop an accelerated, variance-reduced fast temporal difference algorithm that simultaneously matches both lower bounds and attains a strong notion of instance-optimality.
arXiv Detail & Related papers (2021-12-24T17:21:04Z) - Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous stochastic gradient descent (DASGD).
We show that DASGD converges to a global optimum under the same delay assumptions.
arXiv Detail & Related papers (2021-07-06T21:59:49Z) - Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms [58.57004511121862]
Actor-critic (AC) and natural actor-critic (NAC) algorithms are often executed in two ways for finding optimal policies.
We show that two time-scale AC requires an overall sample complexity of $\mathcal{O}(\epsilon^{-2.5}\log^3(\epsilon^{-1}))$ to attain an $\epsilon$-accurate stationary point.
We develop novel techniques for bounding the bias error of the actor due to dynamically changing Markovian sampling.
arXiv Detail & Related papers (2020-05-07T15:42:31Z)