Related papers: Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

URL: http://arxiv.org/abs/2403.09441v1
Date: Thu, 14 Mar 2024 14:34:25 GMT
Title: Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency
Authors: Hallgrimur Thorsteinsson, Valdemar J Henriksen, Tong Chen, Raghavendra Selvan,
Abstract summary: Adrial training has been presented as a mitigation strategy which can result in more robust models. We explore the effects of two different model compression methods -- structured weight pruning and quantization -- on adversarial robustness. We show that adversarial fine-tuning of compressed models can achieve robustness performance comparable to adversarially trained models.
Score: 3.3490724063380215
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As deep learning (DL) models are increasingly being integrated into our everyday lives, ensuring their safety by making them robust against adversarial attacks has become increasingly critical. DL models have been found to be susceptible to adversarial attacks which can be achieved by introducing small, targeted perturbations to disrupt the input data. Adversarial training has been presented as a mitigation strategy which can result in more robust models. This adversarial robustness comes with additional computational costs required to design adversarial attacks during training. The two objectives -- adversarial robustness and computational efficiency -- then appear to be in conflict of each other. In this work, we explore the effects of two different model compression methods -- structured weight pruning and quantization -- on adversarial robustness. We specifically explore the effects of fine-tuning on compressed models, and present the trade-off between standard fine-tuning and adversarial fine-tuning. Our results show that compression does not inherently lead to loss in model robustness and adversarial fine-tuning of a compressed model can yield large improvement to the robustness performance of models. We present experiments on two benchmark datasets showing that adversarial fine-tuning of compressed models can achieve robustness performance comparable to adversarially trained models, while also improving computational efficiency.

Related papers

A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification [9.945272787814941]
We present a deep ensemble model that combines discriminative features with generative models to achieve both high accuracy and adversarial robustness. Our approach integrates a bottom-level pre-trained discriminative network for feature extraction with a top-level generative classification network that models adversarial input distributions.
arXiv Detail & Related papers (2024-12-28T05:06:20Z)
The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model. We introduce three robustness indicators and conduct experiments across diverse robust datasets. Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
JAB: Joint Adversarial Prompting and Belief Augmentation [81.39548637776365]
We introduce a joint framework in which we probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation. This framework utilizes an automated red teaming approach to probe the target model, along with a belief augmenter to generate instructions for the target model to improve its robustness to those adversarial probes.
arXiv Detail & Related papers (2023-11-16T00:35:54Z)
On the Trade-offs between Adversarial Robustness and Actionable Explanations [32.05150063480917]
We make one of the first attempts at studying the impact of adversarially robust models on actionable explanations. We derive theoretical bounds on the differences between the cost and the validity of recourses generated by state-of-the-art algorithms. Our results show that adversarially robust models significantly increase the cost and reduce the validity of the resulting recourses.
arXiv Detail & Related papers (2023-09-28T13:59:50Z)
Benchmarking Adversarial Robustness of Compressed Deep Learning Models [15.737988622271219]
This study seeks to understand the effect of adversarial inputs crafted for base models on their pruned versions. Our findings reveal that while the benefits of pruning enhanced generalizability, compression, and faster inference times are preserved, adversarial robustness remains comparable to the base model.
arXiv Detail & Related papers (2023-08-16T06:06:56Z)
Robust Trajectory Prediction against Adversarial Attacks [84.10405251683713]
Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving systems. These methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions. In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks.
arXiv Detail & Related papers (2022-07-29T22:35:05Z)
Can collaborative learning be private, robust and scalable? [6.667150890634173]
We investigate the effectiveness of combining differential privacy, model compression and adversarial training to improve the robustness of models against adversarial samples in train- and inference-time attacks. Our investigation provides a practical overview of various methods that allow one to achieve a competitive model performance, a significant reduction in model's size and an improved empirical adversarial robustness without a severe performance degradation.
arXiv Detail & Related papers (2022-05-05T13:51:44Z)
Adversarial Fine-tune with Dynamically Regulated Adversary [27.034257769448914]
In many real-world applications such as health diagnosis and autonomous surgical robotics, the standard performance is more valued over model robustness against such extremely malicious attacks. This work proposes a simple yet effective transfer learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on model's standard performance. In addition, we introduce a training-friendly adversarial attack algorithm, which facilitates the boost of adversarial robustness without introducing significant training complexity.
arXiv Detail & Related papers (2022-04-28T00:07:15Z)
Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Adversarial training is the strongest strategy against various adversarial attacks among all sorts of defense methods. Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space. We devise a simple but powerful emphSelf-Ensemble Adversarial Training (SEAT) method for yielding a robust classifier by averaging weights of history models.
arXiv Detail & Related papers (2022-03-18T01:12:18Z)
What do Compressed Large Language Models Forget? Robustness Challenges in Model Compression [68.82486784654817]
We study two popular model compression techniques including knowledge distillation and pruning. We show that compressed models are significantly less robust than their PLM counterparts on adversarial test sets. We develop a regularization strategy for model compression based on sample uncertainty.
arXiv Detail & Related papers (2021-10-16T00:20:04Z)
Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise. Pre-processing methods may suffer from the robustness degradation effect. A potential cause of this negative effect is that adversarial training examples are static and independent to the pre-processing model. We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z)
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness [97.67477497115163]
We use mode connectivity to study the adversarial robustness of deep neural networks. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
arXiv Detail & Related papers (2020-04-30T19:12:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.