Backdoor Attacks Against Dataset Distillation
- URL: http://arxiv.org/abs/2301.01197v1
- Date: Tue, 3 Jan 2023 16:58:34 GMT
- Title: Backdoor Attacks Against Dataset Distillation
- Authors: Yugeng Liu, Zheng Li, Michael Backes, Yun Shen, Yang Zhang
- Abstract summary: This study performs the first backdoor attack against models trained on data distilled by dataset distillation techniques in the image domain.
We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING.
Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases.
- Score: 24.39067295054253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dataset distillation has emerged as a prominent technique to improve data
efficiency when training machine learning models. It encapsulates the knowledge
from a large dataset into a smaller synthetic dataset. A model trained on this
smaller distilled dataset can attain comparable performance to a model trained
on the original training dataset. However, the existing dataset distillation
techniques mainly aim at achieving the best trade-off between resource usage
efficiency and model utility. The security risks stemming from them have not
been explored. This study performs the first backdoor attack against the models
trained on the data distilled by dataset distillation models in the image
domain. Concretely, we inject triggers into the synthetic data during the
distillation procedure rather than during the model training stage, where all
previous attacks are performed. We propose two types of backdoor attacks,
namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw
data at the initial distillation phase, while DOORPING iteratively updates the
triggers during the entire distillation procedure. We conduct extensive
evaluations on multiple datasets, architectures, and dataset distillation
techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack
success rate (ASR) scores in some cases, while DOORPING reaches higher ASR
scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive
ablation study to analyze the factors that may affect the attack performance.
Finally, we evaluate multiple defense mechanisms against our backdoor attacks
and show that our attacks can practically circumvent these defense mechanisms.
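The core idea of the NAIVEATTACK setup described above (stamping a fixed trigger into a fraction of the raw data, and relabeling it, before the distillation procedure compresses the pool) can be sketched as below. This is a minimal illustrative sketch, not the paper's implementation: the function names, the patch-style trigger, and the poison rate are all assumptions for demonstration.

```python
import numpy as np

def apply_patch_trigger(images, trigger, position=(0, 0)):
    """Stamp a fixed pixel-patch trigger onto a batch of images.

    images:  float array of shape (N, H, W, C)
    trigger: float array of shape (h, w, C)
    """
    poisoned = images.copy()
    r, c = position
    h, w = trigger.shape[:2]
    poisoned[:, r:r + h, c:c + w, :] = trigger
    return poisoned

def poison_for_distillation(images, labels, trigger, target_class,
                            rate=0.1, seed=0):
    """Poison a fraction of the raw training set before distillation.

    A NAIVEATTACK-style setup: the trigger is added once, up front,
    and the poisoned pool is what distillation later compresses.
    (DOORPING would instead re-optimize the trigger at each
    distillation iteration, which is omitted here.)
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    idx = rng.choice(n, size=max(1, int(rate * n)), replace=False)
    images, labels = images.copy(), labels.copy()
    images[idx] = apply_patch_trigger(images[idx], trigger)
    labels[idx] = target_class  # relabel poisoned samples to the target
    return images, labels, idx
```

A downstream distillation method would then be run on the returned `(images, labels)` pool unchanged, which is what distinguishes this threat model from classic backdoor attacks applied at model-training time.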