Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective
- URL: http://arxiv.org/abs/2311.16646v1
- Date: Tue, 28 Nov 2023 09:53:05 GMT
- Title: Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective
- Authors: Ming-Yu Chung, Sheng-Yen Chou, Chia-Mu Yu, Pin-Yu Chen, Sy-Yen Kuo, Tsung-Yi Ho
- Abstract summary: We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation.
We show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation.
- Score: 65.70799289211868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dataset distillation offers a potential means to enhance data efficiency in
deep learning. Recent studies have shown its ability to counteract backdoor
risks present in original training samples. In this study, we delve into the
theoretical aspects of backdoor attacks and dataset distillation based on
kernel methods. We introduce two new theory-driven trigger pattern generation
methods specialized for dataset distillation. Following a comprehensive set of
analyses and experiments, we show that our optimization-based trigger design
framework informs effective backdoor attacks on dataset distillation. Notably,
datasets poisoned by our designed trigger prove resilient against conventional
backdoor attack detection and mitigation methods. Our empirical results
validate that the triggers developed using our approaches are proficient at
executing resilient backdoor attacks.
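To make the kernel-method setting concrete, here is a minimal NumPy sketch of optimization-based trigger design against a kernel ridge regression (KRR) surrogate, the model class behind KIP-style dataset distillation: a trigger is stamped on a poisoned, target-relabeled subset, a KRR model is fit, and the trigger is tuned so triggered queries score toward the target class. The data sizes, RBF kernel, poisoning rate, clipping bound, margin loss, and finite-difference loop are illustrative assumptions, not the authors' method or released code.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf(A, B, gamma=0.1):
    # K[i, j] = exp(-gamma * ||A_i - B_j||^2)
    return np.exp(-gamma * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))

# Toy training set: 40 points in R^16, two classes, one-hot labels.
X = rng.normal(size=(40, 16))
Y = np.eye(2)[rng.integers(0, 2, size=40)]

patch = np.arange(4)                             # trigger occupies 4 coordinates
target = 1                                       # attacker-chosen label
poison = rng.choice(40, size=8, replace=False)   # assumed 20% poisoning rate

def krr_predict(delta, X_query):
    """Fit KRR on the poisoned set, then score trigger-stamped queries."""
    Xp, Yp = X.copy(), Y.copy()
    Xp[np.ix_(poison, patch)] += delta           # stamp trigger on poisoned samples
    Yp[poison] = np.eye(2)[target]               # relabel them to the target class
    alpha = np.linalg.solve(rbf(Xp, Xp) + 1e-3 * np.eye(len(Xp)), Yp)
    Xq = X_query.copy()
    Xq[:, patch] += delta                        # stamp trigger at test time
    return rbf(Xq, Xp) @ alpha

def loss(delta, X_query):
    # Margin loss: push triggered queries toward the target class.
    p = krr_predict(delta, X_query)
    return -(p[:, target] - p[:, 1 - target]).mean()

# Optimize the trigger by finite differences (the paper derives triggers from
# its kernel-based theory; this numerical loop is only a crude stand-in).
X_query = rng.normal(size=(64, 16))
delta, eps, lr = np.zeros(len(patch)), 1e-3, 0.5
for _ in range(100):
    g = np.array([(loss(delta + eps * e, X_query) - loss(delta - eps * e, X_query)) / (2 * eps)
                  for e in np.eye(len(patch))])
    delta = np.clip(delta - lr * g, -2.0, 2.0)   # keep the trigger bounded

asr = (krr_predict(delta, X_query).argmax(1) == target).mean()
print(f"attack success rate on the KRR surrogate: {asr:.2f}")
```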
Related papers
- Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$^2$AO).
Our method can achieve the state-of-the-art attack performance while preserving the clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z)
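The abstract names the method but not its operations; below is a speculative Python sketch of per-sample dynamic trigger stamping, where each poisoned image receives the trigger under a randomly sampled augmentation. The op set (flip, intensity scaling, placement) is an assumption, not D$^2$AO's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def stamp_dynamic(img, trig):
    """Stamp trigger `trig` (h, w) onto `img` (H, W) under a random augmentation."""
    t = trig.copy()
    if rng.random() < 0.5:
        t = t[:, ::-1]                      # random horizontal flip
    t = t * rng.uniform(0.6, 1.0)           # random intensity scaling
    h, w = t.shape
    H, W = img.shape
    r = rng.integers(0, H - h + 1)          # random placement
    c = rng.integers(0, W - w + 1)
    out = img.copy()
    out[r:r + h, c:c + w] = np.clip(out[r:r + h, c:c + w] + t, 0.0, 1.0)
    return out

img = rng.uniform(size=(32, 32))
trigger = 0.8 * np.ones((4, 4))
poisoned = stamp_dynamic(img, trigger)      # a different variant every call
```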
- Mellivora Capensis: A Backdoor-Free Training Framework on the Poisoned Dataset without Auxiliary Data [29.842087372804905]
This paper addresses the challenges of backdoor attack countermeasures in real-world scenarios.
We propose a robust and clean-data-free backdoor defense framework, namely Mellivora Capensis (MeCa), which enables the model trainer to train a clean model on the poisoned dataset.
arXiv Detail & Related papers (2024-05-21T12:20:19Z)
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- A Comprehensive Study on Dataset Distillation: Performance, Privacy, Robustness and Fairness [8.432686179800543]
We conduct extensive experiments to evaluate current state-of-the-art dataset distillation methods.
We successfully use membership inference attacks to show that privacy risks still remain.
This work offers a large-scale benchmarking framework for dataset distillation evaluation.
arXiv Detail & Related papers (2023-05-05T08:19:27Z)
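The membership inference result above can be illustrated with the standard loss-thresholding attack: declare a sample a training member when the model's loss on it falls below a threshold. This generic sketch is an assumption about the attack family, not the paper's exact protocol; the synthetic loss distributions are purely illustrative.

```python
import numpy as np

def mi_loss_threshold(member_losses, nonmember_losses, tau):
    """Flag a sample as a training member when its loss is below tau."""
    tpr = (member_losses < tau).mean()     # members correctly flagged
    fpr = (nonmember_losses < tau).mean()  # non-members wrongly flagged
    return tpr, fpr

# Toy illustration: training members tend to have lower loss than held-out points.
rng = np.random.default_rng(0)
tpr, fpr = mi_loss_threshold(rng.gamma(1.0, 0.3, 1000),   # member losses
                             rng.gamma(2.0, 0.5, 1000),   # non-member losses
                             tau=0.5)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")     # a TPR/FPR gap signals privacy leakage
```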
- INK: Inheritable Natural Backdoor Attack Against Model Distillation [8.937026844871074]
We introduce INK, an inheritable natural backdoor attack that targets model distillation.
INK employs image variance as a backdoor trigger and enables both clean-image and clean-label attacks.
For instance, INK maintains an attack success rate of over 98% post-distillation, compared to an average success rate of 1.4% for existing methods.
arXiv Detail & Related papers (2023-04-21T14:35:47Z)
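INK is described as using image variance as the trigger. The sketch below illustrates one possible reading of that idea, rescaling pixels around the image mean so the global variance hits an attacker-chosen value; INK's actual construction may differ, and the target variance is an assumption.

```python
import numpy as np

def set_variance(img, target_var):
    """Rescale deviations around the mean so the image variance approaches target_var."""
    mu, std = img.mean(), img.std()
    scale = np.sqrt(target_var) / (std + 1e-8)
    # Clipping to [0, 1] may shrink the achieved variance slightly.
    return np.clip(mu + (img - mu) * scale, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.uniform(0.2, 0.8, size=(32, 32))
triggered = set_variance(img, target_var=0.09)   # assumed trigger statistic
print(f"clean var: {img.var():.3f}, triggered var: {triggered.var():.3f}")
```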
- Backdoor Attacks Against Dataset Distillation [24.39067295054253]
This study performs the first backdoor attack against models trained on data produced by dataset distillation in the image domain.
We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING.
Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases.
arXiv Detail & Related papers (2023-01-03T16:58:34Z)
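As described, NAIVEATTACK inserts a fixed trigger into the original data before distillation. A minimal sketch of that recipe follows; the patch size, poisoning rate, and bottom-right placement are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def poison_before_distillation(images, labels, target=0, rate=0.1, patch=3):
    """Stamp a fixed white patch on a random fraction of images and relabel them."""
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -patch:, -patch:] = 1.0   # fixed trigger in the bottom-right corner
    labels[idx] = target                  # relabel poisoned samples to the target
    return images, labels, idx

X = rng.uniform(size=(100, 32, 32))
y = rng.integers(0, 10, size=100)
Xp, yp, poisoned_idx = poison_before_distillation(X, y)
# Xp/yp would then be fed to the dataset distillation pipeline as-is.
```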
- A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit, OpenBackdoor, to foster the implementation and evaluation of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)
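CUBE is a clustering-based defense; the hedged sketch below filters suspected poison by clustering per-label feature representations and dropping anomalously small clusters. scikit-learn's KMeans is a generic stand-in (CUBE itself uses density-based clustering), and the 0.35 minority threshold is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def filter_poison(features, labels):
    """Return a boolean mask keeping samples judged clean, per-label."""
    keep = np.ones(len(labels), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        if len(idx) < 2:
            continue
        km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features[idx])
        sizes = np.bincount(km.labels_, minlength=2)
        minority = np.argmin(sizes)
        # Heuristic: a small, separate cluster within one label is suspicious.
        if sizes[minority] < 0.35 * len(idx):
            keep[idx[km.labels_ == minority]] = False
    return keep

# Toy usage: 90 clean feature vectors plus 10 shifted (poison-like) ones.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0, 1, (90, 8)), rng.normal(4, 1, (10, 8))])
labels = np.zeros(100, dtype=int)
print(filter_poison(feats, labels).sum())   # ~90 samples kept
```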
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
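B3D reverse-engineers candidate triggers with only query access. The sketch below uses plain random search as a stand-in for B3D's gradient-free optimizer: for each class it looks for a small universal perturbation that flips a batch of inputs, then flags the class whose recovered trigger is unusually effective. The `query_model` stub, iteration counts, and perturbation bounds are all placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def query_model(x):
    # Placeholder black-box model returning class scores; replace with the
    # real model's prediction API (only query access is assumed).
    return rng.uniform(size=(len(x), 10))

def flip_rate(images, delta, cls):
    """Fraction of perturbed inputs the model assigns to class `cls`."""
    x = np.clip(images + delta, 0.0, 1.0)
    return (query_model(x).argmax(1) == cls).mean()

def search_trigger(images, cls, iters=200, eps=0.1):
    """Random search for a bounded universal perturbation toward `cls`."""
    best = np.zeros_like(images[0])
    best_rate = flip_rate(images, best, cls)
    for _ in range(iters):
        cand = np.clip(best + rng.normal(0, eps, size=best.shape), -0.3, 0.3)
        r = flip_rate(images, cand, cls)
        if r > best_rate:
            best, best_rate = cand, r
    return best, best_rate

images = rng.uniform(size=(16, 32, 32))
rates = [search_trigger(images, c, iters=50)[1] for c in range(10)]
suspect = int(np.argmax(rates))   # class with an anomalously effective trigger
```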
- Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning [51.15273664903583]
Data heterogeneity has been identified as one of the key features of federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdoor attacks in federated learning through comprehensive experiments using synthetic datasets and the LEAF benchmarks.
arXiv Detail & Related papers (2021-02-01T06:06:21Z)