Related papers: QBI: Quantile-Based Bias Initialization for Efficient Private Data Reconstruction in Federated Learning

QBI: Quantile-Based Bias Initialization for Efficient Private Data Reconstruction in Federated Learning

URL: http://arxiv.org/abs/2406.18745v2
Date: Thu, 26 Sep 2024 18:36:32 GMT
Title: QBI: Quantile-Based Bias Initialization for Efficient Private Data Reconstruction in Federated Learning
Authors: Micha V. Nowak, Tim P. Bott, David Khachaturov, Frank Puppe, Adrian Krenzer, Amar Hekalo,
Abstract summary: Federated learning enables the training of machine learning models on distributed data without compromising user privacy. Recent research has shown that the central entity can perfectly reconstruct private data from shared model updates.
Score: 0.5497663232622965
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Federated learning enables the training of machine learning models on distributed data without compromising user privacy, as data remains on personal devices and only model updates, such as gradients, are shared with a central coordinator. However, recent research has shown that the central entity can perfectly reconstruct private data from shared model updates by maliciously initializing the model's parameters. In this paper, we propose QBI, a novel bias initialization method that significantly enhances reconstruction capabilities. This is accomplished by directly solving for bias values yielding sparse activation patterns. Further, we propose PAIRS, an algorithm that builds on QBI. PAIRS can be deployed when a separate dataset from the target domain is available to further increase the percentage of data that can be fully recovered. Measured by the percentage of samples that can be perfectly reconstructed from batches of various sizes, our approach achieves significant improvements over previous methods with gains of up to 50% on ImageNet and up to 60% on the IMDB sentiment analysis text dataset. Furthermore, we establish theoretical limits for attacks leveraging stochastic gradient sparsity, providing a foundation for understanding the fundamental constraints of these attacks. We empirically assess these limits using synthetic datasets. Finally, we propose and evaluate AGGP, a defensive framework designed to prevent gradient sparsity attacks, contributing to the development of more secure and private federated learning systems.

Related papers

Generalization is not a universal guarantee: Estimating similarity to training data with an ensemble out-of-distribution metric [0.09363323206192666]
Failure of machine learning models to generalize to new data is a core problem limiting the reliability of AI systems. We propose a standardized approach for assessing data similarity by constructing a supervised autoencoder for generalizability estimation (SAGE) We show that out-of-the-box model performance increases after SAGE score filtering, even when applied to data from the model's own training and test datasets.
arXiv Detail & Related papers (2025-02-22T19:21:50Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly to a central server. Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples. We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
Approximate and Weighted Data Reconstruction Attack in Federated Learning [1.802525429431034]
distributed learning (FL) enables clients to collaborate on building a machine learning model without sharing their private data. Recent data reconstruction attacks demonstrate that an attacker can recover clients' training data based on the parameters shared in FL. We propose an approximation method, which makes attacking FedAvg scenarios feasible by generating the intermediate model updates of the clients' local training processes.
arXiv Detail & Related papers (2023-08-13T17:40:56Z)
Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneous data. Proposed aggregation algorithms are extensively analyzed from a theoretical, and an experimental prospective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
FedCC: Robust Federated Learning against Model Poisoning Attacks [0.0]
Federated Learning is designed to address privacy concerns in learning models. New distributed paradigm safeguards data privacy but differentiates the attack surface due to the server's inaccessibility to local datasets.
arXiv Detail & Related papers (2022-12-05T01:52:32Z)
Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models [56.0250919557652]
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency. Previous attacks on user privacy have been limited in scope and do not scale to gradient updates aggregated over even a handful of data points. We introduce a new threat model based on minimal but malicious modifications of the shared model architecture.
arXiv Detail & Related papers (2021-10-25T15:52:06Z)
Secure Neuroimaging Analysis using Federated Learning with Homomorphic Encryption [14.269757725951882]
Federated learning (FL) enables distributed computation of machine learning models over disparate, remote data sources. Recent membership attacks show that private or sensitive personal data can sometimes be leaked or inferred when model parameters or summary statistics are shared with a central site. We propose a framework for secure FL using fully-homomorphic encryption (FHE)
arXiv Detail & Related papers (2021-08-07T12:15:52Z)
R-GAP: Recursive Gradient Attack on Privacy [5.687523225718642]
Federated learning is a promising approach to break the dilemma between demands on privacy and the promise of learning from large collections of distributed data. We provide a closed-form recursion procedure to recover data from gradients in deep neural networks. We also propose a Rank Analysis method to estimate the risk of gradient attacks inherent in certain network architectures.
arXiv Detail & Related papers (2020-10-15T13:22:40Z)
Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters. We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data. Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
GRAFFL: Gradient-free Federated Learning of a Bayesian Generative Model [8.87104231451079]
This paper presents the first gradient-free federated learning framework called GRAFFL. It uses implicit information derived from each participating institution to learn posterior distributions of parameters. We propose the GRAFFL-based Bayesian mixture model to serve as a proof-of-concept of the framework.
arXiv Detail & Related papers (2020-08-29T07:19:44Z)
Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458]
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction. FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism. It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
arXiv Detail & Related papers (2020-03-19T13:07:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.