Data Leakage in Federated Averaging
- URL: http://arxiv.org/abs/2206.12395v2
- Date: Mon, 27 Jun 2022 16:05:25 GMT
- Title: Data Leakage in Federated Averaging
- Authors: Dimitar I. Dimitrov, Mislav Balunović, Nikola Konstantinov, Martin Vechev
- Abstract summary: Recent attacks have shown that user data can be recovered from FedSGD updates, thus breaking privacy.
These attacks are of limited practical relevance as federated learning typically uses the FedAvg algorithm.
We propose a new optimization-based attack which successfully attacks FedAvg.
- Score: 12.492818918629101
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent attacks have shown that user data can be recovered from FedSGD
updates, thus breaking privacy. However, these attacks are of limited practical
relevance as federated learning typically uses the FedAvg algorithm. Compared
to FedSGD, recovering data from FedAvg updates is much harder as: (i) the
updates are computed at unobserved intermediate network weights, (ii) a large
number of batches are used, and (iii) labels and network weights vary
simultaneously across client steps. In this work, we propose a new
optimization-based attack which successfully attacks FedAvg by addressing the
above challenges. First, we solve the optimization problem using automatic
differentiation: we simulate the client's update on the recovered labels and
inputs, which generates the unobserved intermediate parameters, and force this
simulated update to match the received client update. Second, we address the large number of batches by
relating images from different epochs with a permutation invariant prior.
Third, we recover the labels by estimating the parameters of existing FedSGD
attacks at every FedAvg step. On the popular FEMNIST dataset, we demonstrate
that on average we successfully recover >45% of the client's images from
realistic FedAvg updates computed on 10 local epochs of 10 batches each with 5
images, compared to only <10% using the baseline. Our findings show many
real-world federated learning implementations based on FedAvg are vulnerable.
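To make the simulation step concrete, below is a minimal, hypothetical PyTorch sketch of the general idea, written for this summary rather than taken from the paper: the candidate inputs and soft labels are the optimization variables, the client's local SGD is replayed differentiably with torch.autograd.grad(..., create_graph=True) so that the unobserved intermediate weights become a function of those variables, and the squared distance between the simulated and the received parameters is minimized. The toy linear forward pass, all function names, and the hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def forward(params, x):
    """Toy linear classifier standing in for the real network; params = (weight, bias)."""
    w, b = params
    return x @ w.t() + b

def simulate_client(global_params, batches, labels, lr, epochs):
    """Differentiably replay the client's local SGD so the unobserved intermediate
    weights become a function of the candidate inputs and labels (challenge (i))."""
    params = [p.clone().detach().requires_grad_(True) for p in global_params]
    for _ in range(epochs):
        for x, y in zip(batches, labels):
            loss = F.cross_entropy(forward(params, x), y)
            grads = torch.autograd.grad(loss, params, create_graph=True)
            params = [p - lr * g for p, g in zip(params, grads)]
    return params

def attack(global_params, observed_params, n_batches, batch_size, n_feat, n_cls,
           client_lr=0.1, client_epochs=10, attack_lr=0.05, attack_steps=200):
    """Optimize dummy inputs and soft labels so the simulated FedAvg update
    matches the update actually received from the client."""
    x_rec = [torch.randn(batch_size, n_feat, requires_grad=True) for _ in range(n_batches)]
    y_rec = [torch.zeros(batch_size, n_cls, requires_grad=True) for _ in range(n_batches)]
    opt = torch.optim.Adam(x_rec + y_rec, lr=attack_lr)
    for _ in range(attack_steps):
        opt.zero_grad()
        soft = [y.softmax(dim=1) for y in y_rec]
        simulated = simulate_client(global_params, x_rec, soft, client_lr, client_epochs)
        loss = sum(((s - o) ** 2).sum() for s, o in zip(simulated, observed_params))
        loss.backward()
        opt.step()
    return [x.detach() for x in x_rec]
```

In the paper's actual attack the labels are not free variables as y_rec is here; they are estimated with FedSGD-style label recovery at every simulated step, and a permutation-invariant prior relates the images reconstructed in different epochs. Both components are omitted from this sketch.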
Related papers
- FedStale: leveraging stale client updates in federated learning [10.850101961203748]
Federated learning algorithms are negatively affected by data heterogeneity and partial client participation.
This paper shows that, when some clients participate much less than others, aggregating updates with different levels of staleness can detrimentally affect the training process.
We introduce FedStale, a novel algorithm that updates the global model in each round through a convex combination of "fresh" updates from participating clients and "stale" updates from non-participating ones.
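Based only on the sentence above, a minimal sketch of such a convex combination of fresh and stale updates might look as follows; the weight beta, the flattened-update representation, and the function names are assumptions rather than FedStale's actual interface.

```python
import torch

def aggregate_fresh_stale(fresh_updates, stale_updates, beta=0.8):
    """Convex combination: beta weights the "fresh" updates from participating
    clients, (1 - beta) weights cached "stale" updates from non-participants."""
    fresh_avg = torch.stack(fresh_updates).mean(dim=0)
    stale_avg = torch.stack(stale_updates).mean(dim=0)
    return beta * fresh_avg + (1.0 - beta) * stale_avg

def server_round(global_params, fresh_updates, stale_updates, beta=0.8):
    """Apply the combined update to the (flattened) global model parameters."""
    return global_params + aggregate_fresh_stale(fresh_updates, stale_updates, beta)
```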
arXiv Detail & Related papers (2024-05-07T10:11:42Z)
- Robust Federated Learning Mitigates Client-side Training Data Distribution Inference Attacks [48.70867241987739]
InferGuard is a novel Byzantine-robust aggregation rule aimed at defending against client-side training data distribution inference attacks.
The results of our experiments indicate that our defense mechanism is highly effective in protecting against client-side training data distribution inference attacks.
arXiv Detail & Related papers (2024-03-05T17:41:35Z)
- Leveraging Function Space Aggregation for Federated Learning at Scale [20.866482460590973]
We propose a new algorithm, FedFish, that aggregates local approximations to the functions learned by clients.
We evaluate FedFish on realistic, large-scale cross-device benchmarks.
arXiv Detail & Related papers (2023-11-17T02:37:10Z)
- Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly to a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- LOKI: Large-scale Data Reconstruction Attack against Federated Learning through Model Manipulation [25.03733882637947]
We introduce LOKI, an attack that overcomes previous limitations and also breaks the anonymity of aggregation.
With FedAvg and aggregation across 100 clients, prior work can leak less than 1% of images on MNIST, CIFAR-100, and Tiny ImageNet.
Using only a single training round, LOKI is able to leak 76-86% of all data samples.
arXiv Detail & Related papers (2023-03-21T23:29:35Z)
- FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation [95.85026305874824]
We introduce a data-driven approach called FedSkip to improve the client optima by periodically skipping federated averaging and scattering local models across client devices (a rough sketch follows this summary).
We conduct extensive experiments on a range of datasets to demonstrate that FedSkip achieves much higher accuracy, better aggregation efficiency and competing communication efficiency.
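A rough, speculative sketch of the skip-and-scatter idea described above, inferred from this one-sentence summary rather than from the FedSkip paper; the skip period, the random scattering, and the function name are assumptions.

```python
import random
import torch

def fedskip_server_step(local_models, round_idx, skip_period=3):
    """Every skip_period-th round, skip averaging and scatter the clients' local
    models among the clients; otherwise perform ordinary federated averaging."""
    if round_idx % skip_period == 0:
        scattered = list(local_models)
        random.shuffle(scattered)            # each client receives another client's model
        return scattered
    averaged = [torch.stack(layer).mean(dim=0) for layer in zip(*local_models)]
    return [averaged for _ in local_models]  # every client receives the averaged model
```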
arXiv Detail & Related papers (2022-12-14T13:57:01Z)
- FedAvg with Fine Tuning: Local Updates Lead to Representation Learning [54.65133770989836]
The Federated Averaging (FedAvg) algorithm alternates between a few local gradient updates at client nodes and a model averaging update at the server (a minimal sketch of this loop follows this summary).
We show that the reason behind the generalizability of FedAvg's output is its ability to learn the common data representation among the clients' tasks.
We also provide empirical evidence demonstrating FedAvg's representation learning ability in federated image classification with heterogeneous data.
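For reference, here is a minimal sketch of the FedAvg loop described above, using a toy linear model; it omits client sampling and weighting by local dataset size.

```python
import torch
import torch.nn.functional as F

def local_update(global_params, x, y, lr=0.1, local_steps=5):
    """Client: a few local SGD steps on a toy linear model, starting from the
    current global weights."""
    params = [p.clone() for p in global_params]
    for _ in range(local_steps):
        params = [p.detach().requires_grad_(True) for p in params]
        loss = F.cross_entropy(x @ params[0].t() + params[1], y)
        grads = torch.autograd.grad(loss, params)
        params = [p - lr * g for p, g in zip(params, grads)]
    return [p.detach() for p in params]

def fedavg_round(global_params, client_data):
    """Server: collect locally updated models and average them parameter-wise."""
    client_models = [local_update(global_params, x, y) for x, y in client_data]
    return [torch.stack(layer).mean(dim=0) for layer in zip(*client_models)]
```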
arXiv Detail & Related papers (2022-05-27T00:55:24Z)
- Optimizing Performance of Federated Person Re-identification: Benchmarking and Analysis [14.545746907150436]
FedReID applies federated learning, an emerging distributed training method, to person re-identification (ReID).
FedReID preserves data privacy by aggregating model updates, instead of raw data, from clients to a central server.
arXiv Detail & Related papers (2022-05-24T15:20:32Z)
- Robust Quantity-Aware Aggregation for Federated Learning [72.59915691824624]
Malicious clients can poison model updates and claim large quantities to amplify the impact of their model updates in the model aggregation.
Existing defense methods for FL, while all handling malicious model updates, either treat all quantities as benign or simply ignore/truncate the quantities of all clients.
We propose a robust quantity-aware aggregation algorithm for federated learning, called FedRA, to perform the aggregation with awareness of local data quantities.
arXiv Detail & Related papers (2022-05-22T15:13:23Z)
- TOFU: Towards Obfuscated Federated Updates by Encoding Weight Updates into Gradients from Proxy Data [7.489265323050362]
We propose TOFU, a novel algorithm which generates proxy data whose gradients encode each client's weight update (a rough sketch follows this summary).
We show that TOFU enables learning with accuracy drops of less than 1% on MNIST and 7% on CIFAR-10.
This enables us to learn to near-full accuracy in a federated setup, while being 4x and 6.6x more communication efficient than the standard Federated Averaging algorithm.
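A speculative sketch of the encoding idea described above, inferred from this summary rather than from the TOFU paper: proxy inputs and soft labels are optimized so that the gradient they induce at the current global weights approximates the client's true update, and only the proxy data is then shared. All names, the toy linear model, and the hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def encode_update_as_proxy(global_params, target_update, n_proxy=16, n_feat=32,
                           n_cls=10, steps=300, lr=0.05):
    """Optimize proxy inputs and soft labels so that the gradient they induce
    at the global weights approximates the client's true weight update."""
    x = torch.randn(n_proxy, n_feat, requires_grad=True)
    y = torch.zeros(n_proxy, n_cls, requires_grad=True)
    params = [p.clone().detach().requires_grad_(True) for p in global_params]
    opt = torch.optim.Adam([x, y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = x @ params[0].t() + params[1]               # toy linear model
        loss = F.cross_entropy(logits, y.softmax(dim=1))
        grads = torch.autograd.grad(loss, params, create_graph=True)
        mismatch = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_update))
        mismatch.backward()
        opt.step()
    return x.detach(), y.softmax(dim=1).detach()             # shared instead of the raw update
```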
arXiv Detail & Related papers (2022-01-21T00:25:42Z)
- Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models [56.0250919557652]
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency.
Previous attacks on user privacy have been limited in scope and do not scale to gradient updates aggregated over even a handful of data points.
We introduce a new threat model based on minimal but malicious modifications of the shared model architecture.
arXiv Detail & Related papers (2021-10-25T15:52:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality or accuracy of this information and is not responsible for any consequences of its use.