On Provable Backdoor Defense in Collaborative Learning
- URL: http://arxiv.org/abs/2101.08177v1
- Date: Tue, 19 Jan 2021 14:39:32 GMT
- Title: On Provable Backdoor Defense in Collaborative Learning
- Authors: Ximing Qiao, Yuhua Bai, Siping Hu, Ang Li, Yiran Chen, Hai Li
- Abstract summary: Malicious users can upload poisoned data to prevent the model's convergence or inject hidden backdoors.
Backdoor attacks are especially difficult to detect since the model behaves normally on standard test data but gives wrong outputs when triggered by certain backdoor keys.
We propose a novel framework that generalizes existing subset aggregation methods.
- Score: 35.22450536986004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As collaborative learning allows joint training of a model using multiple
sources of data, the security problem has been a central concern. Malicious
users can upload poisoned data to prevent the model's convergence or inject
hidden backdoors. The so-called backdoor attacks are especially difficult to
detect since the model behaves normally on standard test data but gives wrong
outputs when triggered by certain backdoor keys. Although Byzantine-tolerant
training algorithms provide convergence guarantees, provable defense against
backdoor attacks remains largely unsolved. Methods based on randomized
smoothing can only correct a small number of corrupted pixels or labels;
methods based on subset aggregation cause a severe drop in classification
accuracy due to low data utilization. We propose a novel framework that
generalizes existing subset aggregation methods. The framework shows that the
subset selection process, a deciding factor for subset aggregation methods, can
be viewed as a code design problem. We derive the theoretical bound on the data
utilization ratio and provide an optimal code construction. Experiments on non-IID
versions of MNIST and CIFAR-10 show that our method with optimal codes
significantly outperforms baselines using non-overlapping partition and random
selection. Additionally, integration with existing coding theory results shows
that special codes can track the location of the attackers. Such capability
provides new countermeasures to backdoor attacks.
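The abstract frames subset selection as a code design problem: each user is assigned to training subsets according to a binary code matrix, one model is trained per subset, and predictions are combined by majority vote, so a bounded number of malicious users can corrupt only a bounded number of subset models. The sketch below is a rough, hypothetical illustration of that pipeline, not the authors' construction: the toy code matrix, the helper names, and the majority-vote aggregation are all assumptions made here for concreteness, and the attacker-localization helper is only meant to echo, in a group-testing spirit, the remark that special codes can track the location of the attackers.

```python
import numpy as np

def assign_users_to_subsets(code_matrix):
    """code_matrix[j, i] == 1 means user i's data is included in training subset j.
    Each row of this (hypothetical) binary code defines one subset; a separate
    model is trained on each subset."""
    return [np.flatnonzero(row) for row in code_matrix]

def majority_vote(per_subset_predictions):
    """Aggregate the class predicted by each subset model via majority vote.
    A backdoored subset model shifts the vote by at most one, so the ensemble
    output is unchanged while corrupted models remain a minority."""
    votes = np.bincount(np.asarray(per_subset_predictions))
    return int(np.argmax(votes))

def locate_attacker(code_matrix, corrupted_subsets):
    """Group-testing-style localization (an assumption here, not the paper's
    algorithm): a single attacker corrupts exactly the subsets that contain it,
    so if the code's columns are distinct, the corruption pattern matches only
    the attacker's column."""
    pattern = np.zeros(code_matrix.shape[0], dtype=bool)
    pattern[list(corrupted_subsets)] = True
    return [i for i in range(code_matrix.shape[1])
            if np.array_equal(code_matrix[:, i].astype(bool), pattern)]

# Toy code: 6 users, 4 subsets; the fraction of ones (12/24) loosely reflects
# how much of the data the subset models use in total.
code_matrix = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
])
print(assign_users_to_subsets(code_matrix))  # which users feed which subset model
print(majority_vote([3, 3, 7, 3]))           # one corrupted model voting 7 -> still 3
print(locate_attacker(code_matrix, {0, 2}))  # subsets 0 and 2 corrupted -> user 0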
Related papers
- TERD: A Unified Framework for Safeguarding Diffusion Models Against Backdoors [36.07978634674072]
Diffusion models are vulnerable to backdoor attacks that compromise their integrity.
We propose TERD, a backdoor defense framework that builds unified modeling for current attacks.
TERD secures a 100% True Positive Rate (TPR) and True Negative Rate (TNR) across datasets of varying resolutions.
arXiv Detail & Related papers (2024-09-09T03:02:16Z)
- BoBa: Boosting Backdoor Detection through Data Distribution Inference in Federated Learning [26.714674251814586]
Federated learning is susceptible to poisoning attacks due to its decentralized nature.
We propose a novel distribution-aware anomaly detection mechanism, BoBa, to address this problem.
arXiv Detail & Related papers (2024-07-12T19:38:42Z)
- IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency [20.61046457594186]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
This paper proposes a simple yet effective input-level backdoor detection (dubbed IBD-PSC) to filter out malicious testing images.
arXiv Detail & Related papers (2024-05-16T03:19:52Z)
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can be an indicator of the presence of a backdoor even when the two models have different architectures.
This technique allows for the detection of backdoors on models designed for open-set classification tasks, which is little studied in the literature.
arXiv Detail & Related papers (2024-02-28T21:29:16Z)
- Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data.
We propose a plug-and-play method named MixAdapt that can be combined with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z)
- Mitigating Data Injection Attacks on Federated Learning [20.24380409762923]
Federated learning is a technique that allows multiple entities to collaboratively train models using their data.
Despite its advantages, federated learning can be susceptible to false data injection attacks.
We propose a novel technique to detect and mitigate data injection attacks on federated learning systems.
arXiv Detail & Related papers (2023-12-04T18:26:31Z)
- Mitigating Backdoor Poisoning Attacks through the Lens of Spurious Correlation [43.75579468533781]
Backdoors can be implanted by crafting training instances with a specific trigger and a target label.
This paper posits that backdoor poisoning attacks exhibit a spurious correlation between simple text features and classification labels.
Our empirical study reveals that the malicious triggers are highly correlated to their target labels.
arXiv Detail & Related papers (2023-05-19T11:18:20Z)
- Backdoor Learning on Sequence to Sequence Models [94.23904400441957]
In this paper, we study whether sequence-to-sequence (seq2seq) models are vulnerable to backdoor attacks.
Specifically, we find that by injecting only 0.2% of the dataset's samples, we can cause the seq2seq model to generate the designated keyword and even the whole sentence.
Extensive experiments on machine translation and text summarization show that the proposed methods achieve over a 90% attack success rate on multiple datasets and models.
arXiv Detail & Related papers (2023-05-03T20:31:13Z)
- CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning [63.72975421109622]
CleanCLIP is a finetuning framework that weakens the learned spurious associations introduced by backdoor attacks.
CleanCLIP maintains model performance on benign examples while erasing a range of backdoor attacks on multimodal contrastive learning.
arXiv Detail & Related papers (2023-03-06T17:48:32Z)
- Backdoor Attacks on Federated Learning with Lottery Ticket Hypothesis [49.38856542573576]
Edge devices in federated learning usually have much more limited computation and communication resources compared to servers in a data center.
In this work, we empirically demonstrate that Lottery Ticket models are equally vulnerable to backdoor attacks as the original dense models.
arXiv Detail & Related papers (2021-09-22T04:19:59Z)
- Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing [55.012801269326594]
In Byzantine robust distributed learning, a central server wants to train a machine learning model over data distributed across multiple workers.
A fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages.
We propose a simple bucketing scheme that adapts existing robust algorithms to heterogeneous datasets at a negligible computational cost (a rough sketch of this idea follows the entry).
arXiv Detail & Related papers (2020-06-16T17:58:53Z)
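To make the bucketing idea in the entry above concrete: worker updates are randomly grouped into small buckets, each bucket is averaged, and the bucket means are handed to an existing robust aggregator, so heterogeneous workers get mixed before aggregation. The following is a minimal sketch under assumptions of my own; the bucket size, the coordinate-wise-median aggregator, and all function names are illustrative choices, not the paper's prescribed configuration.

```python
import numpy as np

def bucketing_aggregate(worker_grads, bucket_size, robust_agg):
    """Randomly group worker gradients into buckets, average each bucket, then
    hand the bucket means to an existing robust aggregator. Mixing heterogeneous
    workers inside a bucket makes the aggregator's inputs more homogeneous."""
    grads = np.asarray(worker_grads)          # shape: (n_workers, dim)
    perm = np.random.permutation(len(grads))  # random bucket assignment
    buckets = [grads[perm[i:i + bucket_size]]
               for i in range(0, len(grads), bucket_size)]
    bucket_means = np.stack([b.mean(axis=0) for b in buckets])
    return robust_agg(bucket_means)

# Example with coordinate-wise median as the downstream robust aggregator.
coordinate_median = lambda g: np.median(g, axis=0)
grads = np.random.randn(12, 5)   # 12 workers, 5-dimensional gradients
grads[0] += 100.0                # one Byzantine worker sends an outlier
print(bucketing_aggregate(grads, bucket_size=3, robust_agg=coordinate_median))
```

Because any existing robust aggregator can be plugged in as robust_agg, the bucketing step acts as a cheap preprocessing wrapper rather than a new aggregation rule.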