FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration
- URL: http://arxiv.org/abs/2503.10961v1
- Date: Fri, 14 Mar 2025 00:10:02 GMT
- Title: FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration
- Authors: Xue Feng, M. Paul Laiu, Thomas Strohmer
- Abstract summary: Federated learning (FL) is a distributed machine learning approach that enables multiple local clients and a central server to collaboratively train a model. First-order methods, particularly those incorporating variance reduction techniques, are the most widely used FL algorithms due to their simple implementation and stable performance. We propose FedOSAA, a novel approach that preserves the simplicity of first-order methods while achieving the rapid convergence typically associated with second-order methods.
- Score: 3.096113258362507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a distributed machine learning approach that enables multiple local clients and a central server to collaboratively train a model while keeping the data on their own devices. First-order methods, particularly those incorporating variance reduction techniques, are the most widely used FL algorithms due to their simple implementation and stable performance. However, these methods tend to be slow and require a large number of communication rounds to reach the global minimizer. We propose FedOSAA, a novel approach that preserves the simplicity of first-order methods while achieving the rapid convergence typically associated with second-order methods. Our approach applies one Anderson acceleration (AA) step following classical local updates based on first-order methods with variance reduction, such as FedSVRG and SCAFFOLD, during local training. This AA step is able to leverage curvature information from the history points and gives a new update that approximates the Newton-GMRES direction, thereby significantly improving the convergence. We establish a local linear convergence rate to the global minimizer of FedOSAA for smooth and strongly convex loss functions. Numerical comparisons show that FedOSAA substantially improves the communication and computation efficiency of the original first-order methods, achieving performance comparable to second-order methods like GIANT.
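The abstract describes the key mechanism precisely enough to sketch: after one round of variance-reduced local updates (viewed as a fixed-point map g), the server extrapolates over a short history of iterates. Below is a minimal NumPy sketch of a single type-II Anderson acceleration step, assuming access to a sliding window of iterates x_i and their images g(x_i); the window size, the mixing parameter beta = 1, and the function name are illustrative, not the authors' implementation.

```python
import numpy as np

def anderson_step(xs, gs):
    """One type-II Anderson acceleration step (mixing parameter beta = 1).

    xs: list of the m+1 most recent iterates x_{k-m}, ..., x_k (1-D arrays).
    gs: list of their images g(x_i) under the fixed-point map, e.g. one
        round of variance-reduced local updates followed by averaging.
    The least-squares extrapolation below implicitly exploits curvature
    information in the residual history, approximating a Newton-GMRES step.
    Requires at least two history pairs.
    """
    X = np.stack(xs, axis=1)                    # shape (d, m+1)
    G = np.stack(gs, axis=1)
    F = G - X                                   # residuals f_i = g(x_i) - x_i
    dX, dF = np.diff(X, axis=1), np.diff(F, axis=1)
    # gamma minimizes ||f_k - dF @ gamma||_2 over the residual differences.
    gamma, *_ = np.linalg.lstsq(dF, F[:, -1], rcond=None)
    return G[:, -1] - (dX + dF) @ gamma         # accelerated iterate
```

In use, a server would keep, say, the last m = 5 pairs (x_i, g(x_i)) and replace the plain update g(x_k) with anderson_step(xs, gs) once the window holds at least two pairs.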
Related papers
- Covariances for Free: Exploiting Mean Distributions for Federated Learning with Pre-Trained Models [19.74434265098346]
Recent works have investigated the use of first-order statistics and second-order statistics to aggregate local client data distributions at the server. We propose a training-free method based on an unbiased estimator of class covariance matrices. Our method, which only uses first-order statistics in the form of class means communicated by clients to the server, incurs only a fraction of the communication costs required by methods based on communicating second-order statistics.
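The summary does not spell out the estimator, but one plausible reading, under the strong assumption that client class means deviate from the global class mean only through i.i.d. sampling noise (so that Cov(mean_k) = Sigma / n_k), is the classical between-client scatter estimator; all names below are hypothetical.

```python
import numpy as np

def class_cov_from_means(means, counts):
    """Estimate a class covariance Sigma from client class means alone.

    means:  (K, d) array, client k's mean of the class features.
    counts: (K,)  array, number of class samples held by client k.
    Assumption (not from the paper): each client mean is an unbiased
    draw with Cov(means[k]) = Sigma / counts[k], i.e. pure sampling
    noise and no per-client distribution shift. Then the weighted
    between-client scatter divided by K - 1 is unbiased for Sigma.
    """
    counts = np.asarray(counts, dtype=float)
    mu = (counts[:, None] * means).sum(axis=0) / counts.sum()  # pooled mean
    centered = means - mu
    scatter = (counts[:, None] * centered).T @ centered  # sum_k n_k c_k c_k^T
    return scatter / (len(means) - 1)
```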
arXiv Detail & Related papers (2024-12-18T20:40:14Z)
- Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors [53.6277160912059]
We propose a method that combines the strengths of data-driven and overfitting-based methods for better generalization, faster inference, and higher accuracy in learning neural SDFs.
We introduce a novel statistical reasoning algorithm in local regions which is able to fine-tune data-driven priors without signed distance supervision, clean point clouds, or point normals.
arXiv Detail & Related papers (2024-10-25T16:48:44Z)
- Towards Hyper-parameter-free Federated Learning [1.3682156035049038]
We introduce algorithms for automated scaling of global model updates.
In the first algorithm, we establish that a descent-ensuring step-size regime at the clients ensures descent for the server objective.
The second algorithm shows that the average of the objective values of sampled clients is a practical and effective substitute for the server objective value required for computing the scaling factor.
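A hedged sketch of that second idea: substitute the average of sampled clients' objective values for the server loss when choosing the scaling factor. The backtracking rule and every name here are illustrative, not the paper's actual algorithm.

```python
import numpy as np

def proxy_loss(x, client_losses, sampled):
    """Average of sampled clients' objective values: a practical stand-in
    for the server objective f(x) when the server holds no data."""
    return np.mean([client_losses[i](x) for i in sampled])

def scaled_update(x, direction, client_losses, sampled,
                  gamma0=1.0, shrink=0.5, tries=10):
    """Pick a scaling gamma for the aggregated update direction by
    backtracking on the client-averaged proxy loss (illustrative rule)."""
    base = proxy_loss(x, client_losses, sampled)
    gamma = gamma0
    for _ in range(tries):
        if proxy_loss(x - gamma * direction, client_losses, sampled) < base:
            return x - gamma * direction
        gamma *= shrink              # shrink until the proxy loss decreases
    return x                         # no descent found: keep the iterate
```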
arXiv Detail & Related papers (2024-08-30T09:35:36Z)
- Federated Optimization with Doubly Regularized Drift Correction [20.30761752651984]
Federated learning is a distributed optimization paradigm that allows training machine learning models across decentralized devices while keeping the data localized.
Previous works proposed various strategies to mitigate drift, yet none have shown uniformly improved communication-computation trade-offs over vanilla gradient descent.
We show that (i) DANE can achieve the desired communication reduction under Hessian similarity constraints, and (ii) we present an extension, DANE+, which supports arbitrary inexact local solvers.
We propose (iii) a novel method, FedRed, which has improved local computational complexity and retains the same communication complexity compared to DANE/DANE+.
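For context, DANE's drift-corrected local subproblem has a well-known form; the sketch below runs a few gradient steps on it, i.e., an inexact local solver in the spirit of DANE+ (step counts, parameters, and names are illustrative).

```python
import numpy as np

def dane_local_solve(x_t, grad_fi, grad_f_global, mu=0.1, lr=0.01, steps=50):
    """Approximately solve DANE's corrected local subproblem on client i:
        min_x  f_i(x) - <grad f_i(x_t) - grad f(x_t), x> + (mu/2)||x - x_t||^2
    The linear correction removes client drift; mu controls proximity.

    grad_fi:       callable returning the local gradient at a point.
    grad_f_global: the aggregated global gradient at x_t (a vector).
    """
    correction = grad_fi(x_t) - grad_f_global
    x = x_t.copy()
    for _ in range(steps):           # inexact local solver (DANE+ style)
        g = grad_fi(x) - correction + mu * (x - x_t)
        x -= lr * g
    return x
```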
arXiv Detail & Related papers (2024-04-12T12:57:43Z)
- On Principled Local Optimization Methods for Federated Learning [2.628859872479184]
This dissertation aims to advance the theoretical foundation of local methods in the following three directions.
First, we establish sharp bounds for FedAvg, the most popular algorithm in Federated Learning.
Second, we propose Federated Accelerated Descent (FedAc), which provably improves the convergence rate and communication efficiency.
arXiv Detail & Related papers (2024-01-24T03:57:45Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
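AMSGrad itself is standard; below is a hedged sketch of a local step in which each client additionally scales its own learning rate. The per-client scale is a placeholder for whatever schedule FedLALR actually prescribes.

```python
import numpy as np

class LocalAMSGrad:
    """AMSGrad with a client-specific learning-rate scale (illustrative)."""
    def __init__(self, dim, lr=1e-3, client_scale=1.0,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr = lr * client_scale      # per-client auto-tuned scale
        self.b1, self.b2, self.eps = beta1, beta2, eps
        self.m = np.zeros(dim)           # first-moment estimate
        self.v = np.zeros(dim)           # second-moment estimate
        self.v_hat = np.zeros(dim)       # running max: the AMSGrad fix

    def step(self, x, grad):
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        self.v_hat = np.maximum(self.v_hat, self.v)
        return x - self.lr * self.m / (np.sqrt(self.v_hat) + self.eps)
```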
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
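Dual averaging accumulates gradients in a dual variable and maps back to the primal through a fixed anchor; restarting periodically re-anchors. The single-worker sketch below illustrates the pattern only and is not FedDA's precise update.

```python
import numpy as np

def restarted_dual_averaging(x0, grad, lr=0.1, period=20, rounds=100):
    """Dual averaging: x_t = anchor - lr * z_t, where z_t is the running
    sum of gradients; restarting resets the anchor and the dual sum.
    grad is a callable returning the gradient at a point."""
    anchor, z, x = x0.copy(), np.zeros_like(x0), x0.copy()
    for t in range(rounds):
        if t % period == 0:          # restart: re-anchor at current point
            anchor, z = x.copy(), np.zeros_like(x0)
        z += grad(x)                 # accumulate gradients in dual space
        x = anchor - lr * z          # primal map (Euclidean prox)
    return x
```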
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
- FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning [75.46959684676371]
We introduce a novel framework called FedNew in which there is no need to transmit Hessian information from clients to the parameter server (PS).
FedNew hides the gradient information and results in a privacy-preserving approach compared to the existing state-of-the-art.
arXiv Detail & Related papers (2022-06-17T15:21:39Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline that adds no computational overhead.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
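One plausible reading of ALR, sketched below, rescales each predicted novel-class weight vector to the average norm of the pretrained base-class weights; the exact rule in the paper may differ.

```python
import numpy as np

def adaptive_length_rescale(novel_w, base_w):
    """Rescale each predicted novel-class weight vector (rows of novel_w)
    so its norm matches the average norm of the pretrained base-class
    weights (illustrative reading of ALR, not the paper's exact rule)."""
    target = np.linalg.norm(base_w, axis=1).mean()       # base length scale
    norms = np.linalg.norm(novel_w, axis=1, keepdims=True)
    return novel_w * (target / np.clip(norms, 1e-12, None))
```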
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- FedChain: Chained Algorithms for Near-Optimal Communication Cost in Federated Learning [24.812767482563878]
Federated learning (FL) aims to minimize the communication complexity of training a model over heterogeneous data distributed across many clients.
We propose FedChain, an algorithmic framework that combines the strengths of local methods and global methods to achieve fast convergence in terms of the number of communication rounds R.
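The chaining pattern is easy to sketch: a local method (here plain FedAvg) provides a cheap warm start, after which a global method (here full-gradient descent) takes over; all parameters and names below are illustrative, not FedChain's tuned schedule.

```python
import numpy as np

def fedchain(x0, clients, local_rounds=50, global_rounds=50,
             lr=0.1, local_steps=10):
    """Phase 1: FedAvg-style local method for fast initial progress.
    Phase 2: global gradient descent (one step per round), which behaves
    better near the solution under client heterogeneity.
    clients: list of callables, each returning that client's gradient."""
    x = x0.copy()
    for _ in range(local_rounds):                 # local phase
        updates = []
        for grad in clients:
            xi = x.copy()
            for _ in range(local_steps):
                xi -= lr * grad(xi)
            updates.append(xi)
        x = np.mean(updates, axis=0)              # server average
    for _ in range(global_rounds):                # global phase
        x -= lr * np.mean([grad(x) for grad in clients], axis=0)
    return x
```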
arXiv Detail & Related papers (2021-08-16T02:57:06Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.