Related papers: DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks

DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks

URL: http://arxiv.org/abs/2509.11525v1
Date: Mon, 15 Sep 2025 02:31:30 GMT
Title: DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks
Authors: Jing Zou, Shungeng Zhang, Meikang Qiu, Chong Li,
Abstract summary: We introduce Dice Adversarial Robustness Distillation (DARD), a novel method designed to transfer robustness through a tailored knowledge distillation paradigm.<n>Our experiments demonstrate that the DARD approach consistently outperforms adversarially trained networks with the same architecture.
Score: 12.90150211072263
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning models are vulnerable to adversarial examples, posing critical security challenges in real-world applications. While Adversarial Training (AT ) is a widely adopted defense mechanism to enhance robustness, it often incurs a trade-off by degrading performance on unperturbed, natural data. Recent efforts have highlighted that larger models exhibit enhanced robustness over their smaller counterparts. In this paper, we empirically demonstrate that such robustness can be systematically distilled from large teacher models into compact student models. To achieve better performance, we introduce Dice Adversarial Robustness Distillation (DARD), a novel method designed to transfer robustness through a tailored knowledge distillation paradigm. Additionally, we propose Dice Projected Gradient Descent (DPGD), an adversarial example generalization method optimized for effective attack. Our extensive experiments demonstrate that the DARD approach consistently outperforms adversarially trained networks with the same architecture, achieving superior robustness and standard accuracy.

Related papers

Sustainable Self-evolution Adversarial Training [51.25767996364584]
We propose a Sustainable Self-Evolution Adversarial Training (SSEAT) framework for adversarial training defense models.<n>We introduce a continual adversarial defense pipeline to realize learning from various kinds of adversarial examples.<n>We also propose an adversarial data replay module to better select more diverse and key relearning data.
arXiv Detail & Related papers (2024-12-03T08:41:11Z)
Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge [17.382306203152943]
Dynamic Guidance Adversarial Distillation (DGAD) framework tackles the challenge of differential sample importance. DGAD employs Misclassification-Aware Partitioning (MAP) to dynamically tailor the distillation focus. Error-corrective Label Swapping (ELS) corrects misclassifications of the teacher on both clean and adversarially perturbed inputs.
arXiv Detail & Related papers (2024-09-03T05:52:37Z)
Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents. This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs. In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models. AT methods, relying on direct iterative updates for target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting. We present a general proxy guided defense framework, LAST' (bf Learn from the Pbf ast)
arXiv Detail & Related papers (2023-10-19T13:13:41Z)
Annealing Self-Distillation Rectification Improves Adversarial Training [0.10241134756773226]
We analyze the characteristics of robust models and identify that robust models tend to produce smoother and well-calibrated outputs. We propose Annealing Self-Distillation Rectification, which generates soft labels as a better guidance mechanism. We demonstrate the efficacy of ADR through extensive experiments and strong performances across datasets.
arXiv Detail & Related papers (2023-05-20T06:35:43Z)
Adversarial Contrastive Distillation with Adaptive Denoising [15.119013995045192]
We propose Contrastive Relationship DeNoise Distillation (CRDND) to boost the robustness of small models. We show CRDND can transfer robust knowledge efficiently and achieves state-of-the-art performances.
arXiv Detail & Related papers (2023-02-17T09:00:18Z)
Adversarial Fine-tune with Dynamically Regulated Adversary [27.034257769448914]
In many real-world applications such as health diagnosis and autonomous surgical robotics, the standard performance is more valued over model robustness against such extremely malicious attacks. This work proposes a simple yet effective transfer learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on model's standard performance. In addition, we introduce a training-friendly adversarial attack algorithm, which facilitates the boost of adversarial robustness without introducing significant training complexity.
arXiv Detail & Related papers (2022-04-28T00:07:15Z)
Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically. Our method learns the in adversarial attacks parameterized by a recurrent neural network. We develop a model-agnostic training algorithm to improve the ability of the learned when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better [66.69777970159558]
We propose a novel adversarial robustness distillation method called Robust Soft Label Adversarial Distillation (RSLAD) RSLAD fully exploits the robust soft labels produced by a robust (adversarially-trained) large teacher model to guide the student's learning. We empirically demonstrate the effectiveness of our RSLAD approach over existing adversarial training and distillation methods.
arXiv Detail & Related papers (2021-08-18T04:32:35Z)
Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications. We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths. Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models. In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure. We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.