Byzantine-resilient Federated Learning With Adaptivity to Data Heterogeneity
- URL: http://arxiv.org/abs/2403.13374v3
- Date: Wed, 27 Mar 2024 14:57:54 GMT
- Title: Byzantine-resilient Federated Learning With Adaptivity to Data Heterogeneity
- Authors: Shiyuan Zuo, Xingrun Yan, Rongfei Fan, Han Hu, Hangguan Shan, Tony Q. S. Quek
- Abstract summary: This paper deals with federated learning (FL) in the presence of malicious Byzantine attacks and data heterogeneity.
A novel Robust Average Gradient Algorithm (RAGA) is proposed, which leverages the geometric median for aggregation and can freely select the round number for local updating.
- Score: 54.145730036889496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper deals with federated learning (FL) in the presence of malicious Byzantine attacks and data heterogeneity. A novel Robust Average Gradient Algorithm (RAGA) is proposed, which leverages the geometric median for aggregation and can freely select the round number for local updating. Different from most existing resilient approaches, which perform convergence analysis based on a strongly-convex loss function or a homogeneously distributed dataset, we conduct convergence analysis for not only strongly-convex but also non-convex loss functions over heterogeneous datasets. According to our theoretical analysis, as long as the fraction of the dataset from malicious users is less than half, RAGA can achieve convergence at rate $\mathcal{O}({1}/{T^{2/3-\delta}})$, where $T$ is the iteration number and $\delta \in (0, 2/3)$, for non-convex loss functions, and at a linear rate for strongly-convex loss functions. Moreover, a stationary point or the global optimal solution is proved to be obtainable as data heterogeneity vanishes. Experimental results corroborate the robustness of RAGA to Byzantine attacks and verify its advantage over baselines in convergence performance under various intensities of Byzantine attacks on heterogeneous datasets.
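A minimal sketch of the geometric-median aggregation that RAGA builds on, using Weiszfeld's iteration (the geometric median has no closed form). The function name, tolerance, and toy client updates below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def geometric_median(points, tol=1e-6, max_iter=100):
    """Weiszfeld's algorithm: an iteratively re-weighted average that
    converges to the geometric median of the rows of `points`."""
    median = points.mean(axis=0)  # initialize at the (non-robust) mean
    for _ in range(max_iter):
        dists = np.maximum(np.linalg.norm(points - median, axis=1), tol)
        weights = 1.0 / dists  # far-away (possibly Byzantine) points get low weight
        new_median = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_median - median) < tol:
            break
        median = new_median
    return median

# Server-side aggregation of client updates: honest updates cluster
# together, so a minority of Byzantine updates cannot drag the
# aggregate arbitrarily far.
client_updates = np.stack([
    np.random.normal(0.0, 0.1, size=10),  # honest client
    np.random.normal(0.0, 0.1, size=10),  # honest client
    np.full(10, 1e6),                     # Byzantine client
])
aggregate = geometric_median(client_updates)
```

The geometric median has a breakdown point of $1/2$, which matches the paper's condition that the fraction of data from malicious users be less than half.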
Related papers
- Byzantine-resilient Federated Learning Employing Normalized Gradients on Non-IID Datasets [23.640506243685863]
In practical federated learning (FL), the presence of malicious Byzantine attacks and data heterogeneity often introduces biases into the learning process.
We propose a Federated Normalized Gradient Algorithm (Fed-NGA), which normalizes uploaded local gradients to be unit vectors before aggregation, achieving a time complexity of $\mathcal{O}(pM)$; a minimal sketch of this normalization step appears after this list.
arXiv Detail & Related papers (2024-08-18T16:50:39Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Globally Convergent Accelerated Algorithms for Multilinear Sparse
Logistic Regression with $\ell_0$-constraints [2.323238724742687]
Multilinear logistic regression serves as a powerful tool for the analysis of multidimensional data.
We propose an Accelerated Proximal Alternating Linearized Minimization method (APALM$^+$) to solve the $\ell_0$-MLSR model.
We also demonstrate that APALM$^+$ is globally convergent to a first-order critical point and establish its convergence rate by using the Kurdyka-Łojasiewicz property.
arXiv Detail & Related papers (2023-09-17T11:05:08Z) - On the Size and Approximation Error of Distilled Sets [57.61696480305911]
We take a theoretical view on kernel ridge regression (KRR) based methods of dataset distillation such as Kernel Inducing Points.
We prove that a small set of instances exists in the original input space such that its solution in the random Fourier features (RFF) space coincides with the solution of the original data.
A KRR solution can be generated using this distilled set of instances, which approximates the KRR solution optimized on the full input data (a self-contained KRR sketch appears after this list).
arXiv Detail & Related papers (2023-05-23T14:37:43Z) - Improved Analysis of Score-based Generative Modeling: User-Friendly
Bounds under Minimal Smoothness Assumptions [9.953088581242845]
We provide convergence guarantees with polynomial complexity for any data distribution with a finite second-order moment.
Our result does not rely on any log-concavity or functional inequality assumption.
Our theoretical analysis provides a comparison between different discrete approximations and may guide the choice of discretization points in practice.
arXiv Detail & Related papers (2022-11-03T15:51:00Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the random order online optimization setting of Garber et al. (2020), where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that the algorithms of Garber et al. (2020) achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - Robust Training in High Dimensions via Block Coordinate Geometric Median
Descent [69.47594803719333]
Geometric median (Gm) is a classical method in statistics for achieving a robust estimation of the uncorrupted data.
In this paper, we show that by applying Gm to only a judiciously chosen block of coordinates at a time, one can retain a breakdown point of 0.5 for smooth non-convex problems.
arXiv Detail & Related papers (2021-06-16T15:55:50Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate high computational gain compared to the full-problem solver for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z) - Byzantine-Resilient SGD in High Dimensions on Heterogeneous Data [10.965065178451104]
We study distributed stochastic gradient descent (SGD) in the master-worker architecture under Byzantine attacks.
Our algorithm can tolerate up to a $\frac{1}{4}$ fraction of Byzantine workers.
arXiv Detail & Related papers (2020-05-16T04:15:27Z) - Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
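The Fed-NGA entry above normalizes uploaded local gradients to unit vectors before aggregation. A minimal sketch of that step, assuming plain averaging after normalization (the function name and weighting are illustrative, not taken from the paper):

```python
import numpy as np

def normalized_gradient_aggregate(gradients, eps=1e-12):
    """Scale each client's gradient to unit L2 norm, then average.
    Normalizing first caps the influence any single (possibly
    Byzantine) client can exert on the aggregate's direction; each
    p-dimensional gradient costs O(p), so M clients cost O(pM)."""
    units = [g / max(np.linalg.norm(g), eps) for g in gradients]
    return np.mean(units, axis=0)

# Hypothetical round: two honest clients and one scaled-up attacker
# whose huge magnitude is neutralized by the normalization.
grads = [np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([-1e9, 0.0])]
print(normalized_gradient_aggregate(grads))
```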
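Similarly, the distilled-sets entry reasons about kernel ridge regression solutions computed from a small set of instances. A self-contained KRR sketch, assuming an RBF kernel and using a random subset as a stand-in for a genuinely distilled set (the distillation procedure itself is not reproduced here):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def krr_fit(X, y, lam=1e-3, gamma=1.0):
    # Closed-form kernel ridge regression: solve (K + lam*I) alpha = y.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_test, gamma=1.0):
    return rbf_kernel(X_test, X_train, gamma) @ alpha

# Compare a model fit on a small subset (a stand-in for a distilled
# set) against one fit on the full data, at the same test points.
rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 3))
y_full = np.sin(X_full).sum(axis=1)
X_small, y_small = X_full[:20], y_full[:20]
X_test = rng.normal(size=(5, 3))

print(krr_predict(X_full, krr_fit(X_full, y_full), X_test))
print(krr_predict(X_small, krr_fit(X_small, y_small), X_test))
```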