Related papers: Hidden Data Privacy Breaches in Federated Learning

Hidden Data Privacy Breaches in Federated Learning

URL: http://arxiv.org/abs/2411.18269v1
Date: Wed, 27 Nov 2024 12:04:37 GMT
Title: Hidden Data Privacy Breaches in Federated Learning
Authors: Xueluan Gong, Yuji Wang, Shuaike Li, Mengyuan Sun, Songze Li, Qian Wang, Kwok-Yan Lam, Chen Chen,
Abstract summary: Federated Learning (FL) emerged as a paradigm for conducting machine learning across broad and decentralized datasets.<n>Recent studies show that attackers can steal private data through model manipulation or gradient analysis.<n>We propose a novel data-reconstruction attack leveraging malicious code injection, supported by two key techniques.
Score: 24.47236055167954
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Federated Learning (FL) emerged as a paradigm for conducting machine learning across broad and decentralized datasets, promising enhanced privacy by obviating the need for direct data sharing. However, recent studies show that attackers can steal private data through model manipulation or gradient analysis. Existing attacks are constrained by low theft quantity or low-resolution data, and they are often detected through anomaly monitoring in gradients or weights. In this paper, we propose a novel data-reconstruction attack leveraging malicious code injection, supported by two key techniques, i.e., distinctive and sparse encoding design and block partitioning. Unlike conventional methods that require detectable changes to the model, our method stealthily embeds a hidden model using parameter sharing to systematically extract sensitive data. The Fibonacci-based index design ensures efficient, structured retrieval of memorized data, while the block partitioning method enhances our method's capability to handle high-resolution images by dividing them into smaller, manageable units. Extensive experiments on 4 datasets confirmed that our method is superior to the five state-of-the-art data-reconstruction attacks under the five respective detection methods. Our method can handle large-scale and high-resolution data without being detected or mitigated by state-of-the-art data reconstruction defense methods. In contrast to baselines, our method can be directly applied to both FedAVG and FedSGD scenarios, underscoring the need for developers to devise new defenses against such vulnerabilities. We will open-source our code upon acceptance.

Related papers

DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective [59.66984417026933]
We introduce a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing)<n>We formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset.<n>Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery.<n>Our benchmark, DATABench, comprises 17 evasion attacks, 5 forgery attacks, and 9
arXiv Detail & Related papers (2025-07-08T03:07:15Z)
Defending against Data Poisoning Attacks in Federated Learning via User Elimination [0.0]
This paper introduces a novel framework focused on the strategic elimination of adversarial users within a federated model. We detect anomalies in the aggregation phase of the Federated Algorithm, by integrating metadata gathered by the local training instances with Differential Privacy techniques. Our experiments demonstrate the efficacy of our methods, significantly mitigating the risk of data poisoning while maintaining user privacy and model performance.
arXiv Detail & Related papers (2024-04-19T10:36:00Z)
Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust. Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model. We propose a novel paradigm named Visual-Linguistic Face Forgery Detection(VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources. FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks. We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously. MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
Gradient Leakage Defense with Key-Lock Module for Federated Learning [14.411227689702997]
Federated Learning (FL) is a widely adopted privacy-preserving machine learning approach. Recent findings reveal that privacy may be compromised and sensitive information potentially recovered from shared gradients. We propose a new gradient leakage defense technique that secures arbitrary model architectures using a private key-lock module.
arXiv Detail & Related papers (2023-05-06T16:47:52Z)
CrowdGuard: Federated Backdoor Detection in Federated Learning [39.58317527488534]
This paper presents a novel defense mechanism, CrowdGuard, that effectively mitigates backdoor attacks in Federated Learning. CrowdGuard employs a server-located stacked clustering scheme to enhance its resilience to rogue client feedback. The evaluation results demonstrate that CrowdGuard achieves a 100% True-Positive-Rate and True-Negative-Rate across various scenarios.
arXiv Detail & Related papers (2022-10-14T11:27:49Z)
Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER) Our method exploits self-supervised pretraining to learn good feature representations from the target data. We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
FedDef: Defense Against Gradient Leakage in Federated Learning-based Network Intrusion Detection Systems [15.39058389031301]
We propose two privacy evaluation metrics designed for FL-based NIDSs. We propose FedDef, a novel optimization-based input perturbation defense strategy with theoretical guarantee. We experimentally evaluate four existing defenses on four datasets and show that our defense outperforms all the baselines in terms of privacy protection.
arXiv Detail & Related papers (2022-10-08T15:23:30Z)
Defense Against Gradient Leakage Attacks via Learning to Obscure Data [48.67836599050032]
Federated learning is considered as an effective privacy-preserving learning mechanism. In this paper, we propose a new defense method to protect the privacy of clients' data by learning to obscure data.
arXiv Detail & Related papers (2022-06-01T21:03:28Z)
Weakly Supervised Change Detection Using Guided Anisotropic Difusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection. First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results. We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
Gradient-based Data Subversion Attack Against Binary Classifiers [9.414651358362391]
In this work, we focus on label contamination attack in which an attacker poisons the labels of data to compromise the functionality of the system. We exploit the gradients of a differentiable convex loss function with respect to the predicted label as a warm-start and formulate different strategies to find a set of data instances to contaminate. Our experiments show that the proposed approach outperforms the baselines and is computationally efficient.
arXiv Detail & Related papers (2021-05-31T09:04:32Z)
Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process. The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
On Provable Backdoor Defense in Collaborative Learning [35.22450536986004]
Malicious users can upload data to prevent the model's convergence or inject hidden backdoors. Backdoor attacks are especially difficult to detect since the model behaves normally on standard test data but gives wrong outputs when triggered by certain backdoor keys. We propose a novel framework that generalizes existing subset aggregation methods.
arXiv Detail & Related papers (2021-01-19T14:39:32Z)
Towards Transferable Adversarial Attack against Deep Face Recognition [58.07786010689529]
Deep convolutional neural networks (DCNNs) have been found to be vulnerable to adversarial examples. transferable adversarial examples can severely hinder the robustness of DCNNs. We propose DFANet, a dropout-based method used in convolutional layers, which can increase the diversity of surrogate models. We generate a new set of adversarial face pairs that can successfully attack four commercial APIs without any queries.
arXiv Detail & Related papers (2020-04-13T06:44:33Z)
Stratified cross-validation for unbiased and privacy-preserving federated learning [0.0]
We focus on the recurrent problem of duplicated records that, if not handled properly, may cause over-optimistic estimations of a model's performances. We introduce and discuss stratified cross-validation, a validation methodology that leverages stratification techniques to prevent data leakage in federated learning settings.
arXiv Detail & Related papers (2020-01-22T15:49:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.