CIARD: Cyclic Iterative Adversarial Robustness Distillation
- URL: http://arxiv.org/abs/2509.12633v1
- Date: Tue, 16 Sep 2025 03:51:43 GMT
- Title: CIARD: Cyclic Iterative Adversarial Robustness Distillation
- Authors: Liming Lu, Shuchao Pang, Xu Zheng, Xiang Gu, Anan Du, Yunhuai Liu, Yongbin Zhou
- Abstract summary: Adversarial robustness distillation (ARD) aims to transfer performance and robustness from a teacher model to a student model. Existing ARD approaches enhance the student model's robustness, but the inevitable by-product is degraded performance on clean examples. We propose a novel Cyclic Iterative ARD (CIARD) method with two key innovations.
- Score: 19.685981220232712
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial robustness distillation (ARD) aims to transfer both performance and robustness from a teacher model to a lightweight student model, enabling resilient performance in resource-constrained scenarios. Though existing ARD approaches enhance the student model's robustness, the inevitable by-product is degraded performance on clean examples. We summarize the causes of this problem, inherent in existing methods with a dual-teacher framework, as: 1. The divergent optimization objectives of the dual teacher models, i.e., the clean and robust teachers, impede effective knowledge transfer to the student model, and 2. The adversarial examples generated iteratively during training lead to performance deterioration of the robust teacher model. To address these challenges, we propose a novel Cyclic Iterative ARD (CIARD) method with two key innovations: a. A multi-teacher framework with contrastive push-loss alignment to resolve conflicts in the dual teachers' optimization objectives, and b. Continuous adversarial retraining to maintain the teacher's robustness against performance degradation from the varying adversarial examples. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CIARD achieves remarkable performance, with an average 3.53 improvement in adversarial defense rates across various attack scenarios and a 5.87 increase in clean-sample accuracy, establishing a new benchmark for balancing model robustness and generalization. Our code is available at https://github.com/eminentgu/CIARD
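As a rough, hypothetical illustration of the dual-teacher distillation idea that ARD methods like CIARD build on, the sketch below combines a clean-teacher KL term (on clean inputs) with a robust-teacher KL term (on adversarial inputs). All function names, the fixed `alpha` weighting, and the temperature are assumptions for this sketch; the paper's actual objective additionally involves the contrastive push loss and cyclic teacher retraining, which are not reproduced here.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def dual_teacher_distillation_loss(student_logits_clean, student_logits_adv,
                                   clean_teacher_logits, robust_teacher_logits,
                                   alpha=0.5, temperature=4.0):
    """Toy dual-teacher objective: the student mimics the clean teacher on
    clean inputs and the robust teacher on adversarial inputs."""
    clean_term = kl_divergence(softmax(clean_teacher_logits, temperature),
                               softmax(student_logits_clean, temperature))
    robust_term = kl_divergence(softmax(robust_teacher_logits, temperature),
                                softmax(student_logits_adv, temperature))
    return alpha * clean_term + (1 - alpha) * robust_term
```

When the student's logits match both teachers', the loss is zero; any divergence from either teacher increases it, which is the tension between clean accuracy and robustness that the abstract describes.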
Related papers
- MMT-ARD: Multimodal Multi-Teacher Adversarial Distillation for Robust Vision-Language Models [123.90007730845876]
We propose MMT-ARD: a Multimodal Multi-Teacher Adversarial Distillation framework. Our key innovation is a dual-teacher knowledge fusion architecture that collaboratively optimizes clean feature preservation and robust feature enhancement. Experiments on ImageNet and zero-shot benchmarks demonstrate that MMT-ARD improves robust accuracy by +4.32% and zero-shot accuracy by +3.5%.
arXiv Detail & Related papers (2025-11-21T17:46:44Z) - DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks [12.90150211072263]
We introduce Dice Adversarial Robustness Distillation (DARD), a novel method designed to transfer robustness through a tailored knowledge distillation paradigm. Our experiments demonstrate that the DARD approach consistently outperforms adversarially trained networks with the same architecture.
arXiv Detail & Related papers (2025-09-15T02:31:30Z) - Teach Me to Trick: Exploring Adversarial Transferability via Knowledge Distillation [0.0]
Knowledge distillation can enhance the generation of transferable adversarial examples. A lightweight student model is trained using two KD strategies: curriculum-based switching and joint optimization. Student models distilled from multiple teachers achieve attack success rates comparable to ensemble-based baselines.
arXiv Detail & Related papers (2025-07-29T16:43:54Z) - Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models [62.5501109475725]
Knowledge distillation (KD) is a technique that compresses large teacher models by training smaller student models to mimic them.
This paper introduces Online Knowledge Distillation (OKD), where the teacher network integrates small online modules to concurrently train with the student model.
OKD achieves or exceeds the performance of leading methods in various model architectures and sizes, reducing training time by up to fourfold.
arXiv Detail & Related papers (2024-09-19T07:05:26Z) - Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge [17.382306203152943]
The Dynamic Guidance Adversarial Distillation (DGAD) framework tackles the challenge of differential sample importance.
DGAD employs Misclassification-Aware Partitioning (MAP) to dynamically tailor the distillation focus.
Error-corrective Label Swapping (ELS) corrects misclassifications of the teacher on both clean and adversarially perturbed inputs.
arXiv Detail & Related papers (2024-09-03T05:52:37Z) - Adversarial Sparse Teacher: Defense Against Distillation-Based Model Stealing Attacks Using Adversarial Examples [2.0257616108612373]
Adversarial Sparse Teacher (AST) is a robust defense method against distillation-based model stealing attacks.
Our approach trains a teacher model using adversarial examples to produce sparse logit responses and increase the entropy of the output distribution.
arXiv Detail & Related papers (2024-03-08T09:43:27Z) - DistiLLM: Towards Streamlined Distillation for Large Language Models [53.46759297929675]
DistiLLM is a more effective and efficient KD framework for auto-regressive language models.
DistiLLM comprises two components: (1) a novel skew Kullback-Leibler divergence loss, whose theoretical properties we unveil and leverage, and (2) an adaptive off-policy approach designed to improve the efficiency of utilizing student-generated outputs.
arXiv Detail & Related papers (2024-02-06T11:10:35Z) - Learn from the Past: A Proxy Guided Adversarial Defense Framework with
Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, which rely on direct iterative updates for the target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z) - Adversarial Contrastive Distillation with Adaptive Denoising [15.119013995045192]
We propose Contrastive Relationship DeNoise Distillation (CRDND) to boost the robustness of small models.
We show CRDND can transfer robust knowledge efficiently and achieves state-of-the-art performances.
arXiv Detail & Related papers (2023-02-17T09:00:18Z) - Distantly-Supervised Named Entity Recognition with Adaptive Teacher
Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning comprised of two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
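The temporal moving average described above is the standard exponential moving average (EMA) teacher update used in many teacher-student methods. As a hedged sketch (the flat parameter list and momentum value are illustrative assumptions; the paper applies this per model fragment rather than to the whole model):

```python
def ema_update(teacher_params, student_params, momentum=0.99):
    """Exponential moving average: nudge each teacher parameter
    toward the corresponding student parameter."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]
```

Repeated updates make the teacher a smoothed, lagged copy of the student, which is what damps the effect of noisy labels on the teacher's predictions.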
arXiv Detail & Related papers (2022-12-13T12:14:09Z) - Alleviating Robust Overfitting of Adversarial Training With Consistency
Regularization [9.686724616328874]
Adversarial training (AT) has proven to be one of the most effective ways to defend Deep Neural Networks (DNNs) against adversarial attacks.
Robust overfitting, in which robustness drops sharply at a certain stage, always exists during AT.
Consistency regularization, a popular technique in semi-supervised learning, has a similar goal to AT and can be used to alleviate robust overfitting.
arXiv Detail & Related papers (2022-05-24T03:18:43Z) - How and When Adversarial Robustness Transfers in Knowledge Distillation? [137.11016173468457]
This paper studies how and when adversarial robustness can be transferred from a teacher model to a student model in knowledge distillation (KD).
We show that standard KD training fails to preserve adversarial robustness, and we propose KD with input gradient alignment (KDIGA) for remedy.
Under certain assumptions, we prove that the student model using our proposed KDIGA can achieve at least the same certified robustness as the teacher model.
arXiv Detail & Related papers (2021-10-22T21:30:53Z)
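KDIGA's core idea, aligning the student's input gradients with the teacher's, can be illustrated with finite differences on a toy linear-softmax model. The model, the central-difference gradient, and the squared-L2 penalty below are assumptions for illustration only; the paper defines the alignment on the actual networks' analytic input gradients.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_ce_loss(weights, x, label):
    """Cross-entropy of a linear-softmax model on a single input x."""
    logits = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]
    return -math.log(softmax(logits)[label] + 1e-12)

def input_gradient(loss_fn, x, h=1e-5):
    """Central-difference gradient of loss_fn with respect to the input x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((loss_fn(xp) - loss_fn(xm)) / (2 * h))
    return grad

def igal_penalty(student_w, teacher_w, x, label):
    """Squared L2 distance between student and teacher input gradients."""
    gs = input_gradient(lambda z: linear_ce_loss(student_w, z, label), x)
    gt = input_gradient(lambda z: linear_ce_loss(teacher_w, z, label), x)
    return sum((a - b) ** 2 for a, b in zip(gs, gt))
```

The penalty is zero exactly when the two models respond identically to infinitesimal input perturbations, which is why matching input gradients helps the student inherit the teacher's local robustness.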
This list is automatically generated from the titles and abstracts of the papers in this site.