QuantAttack: Exploiting Dynamic Quantization to Attack Vision
Transformers
- URL: http://arxiv.org/abs/2312.02220v1
- Date: Sun, 3 Dec 2023 18:31:19 GMT
- Title: QuantAttack: Exploiting Dynamic Quantization to Attack Vision
Transformers
- Authors: Amit Baras, Alon Zolfi, Yuval Elovici, Asaf Shabtai
- Abstract summary: We present QuantAttack, a novel attack that targets the availability of quantized models.
We show that carefully crafted adversarial examples, which are designed to exhaust the resources of the operating system, can trigger worst-case performance.
- Score: 29.957089564635083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, there has been a significant trend in deep neural networks
(DNNs), particularly transformer-based models, of developing ever-larger and
more capable models. While they demonstrate state-of-the-art performance, their
growing scale requires increased computational resources (e.g., GPUs with
greater memory capacity). To address this problem, quantization techniques
(i.e., low-bit-precision representation and matrix multiplication) have been
proposed. Most quantization techniques employ a static strategy in which the
model parameters are quantized, either during training or inference, without
considering the test-time sample. In contrast, dynamic quantization techniques,
which have become increasingly popular, adapt during inference based on the
input provided, while maintaining full-precision performance. However, their
dynamic behavior and average-case performance assumption make them vulnerable
to a novel threat vector -- adversarial attacks that target the model's
efficiency and availability. In this paper, we present QuantAttack, a novel
attack that targets the availability of quantized models, slowing down
inference and increasing memory usage and energy consumption. We show that
carefully crafted adversarial examples, which are designed to exhaust the
resources of the operating system, can trigger worst-case performance. In our
experiments, we demonstrate the effectiveness of our attack on vision
transformers on a wide range of tasks, both uni-modal and multi-modal. We also
examine the effect of different attack variants (e.g., a universal
perturbation) and the transferability between different models.
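The attack surface can be made concrete with a short sketch. Outlier-aware dynamic quantization schemes (e.g., LLM.int8()-style matrix multiplication) route activation columns whose magnitude exceeds a threshold to a slower high-precision path, so an availability attack wants as many such outliers as possible. The PGD-style sketch below is illustrative only and is not the paper's objective function: the threshold value, the perturbation budget, and the capture_activations helper are assumptions introduced here.

```python
import torch

# Assumed constants (not from the paper): an LLM.int8()-style quantizer treats
# activation entries with |value| > OUTLIER_THRESHOLD as outliers and computes
# them in fp16 instead of int8, which is the slow path the attack tries to hit.
OUTLIER_THRESHOLD = 6.0
EPSILON = 8 / 255      # L_inf perturbation budget (illustrative)
STEP_SIZE = 1 / 255
NUM_STEPS = 100


def soft_outlier_count(activations: torch.Tensor) -> torch.Tensor:
    # Differentiable surrogate for "how many activation entries exceed the threshold".
    return torch.sigmoid(activations.abs() - OUTLIER_THRESHOLD).mean()


def quant_availability_attack(model, x, capture_activations):
    # capture_activations(model, inputs) is a hypothetical helper that runs the
    # model and returns the pre-matmul activation tensors of the quantized layers.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(NUM_STEPS):
        acts = capture_activations(model, (x + delta).clamp(0, 1))
        loss = torch.stack([soft_outlier_count(a) for a in acts]).mean()
        loss.backward()
        with torch.no_grad():
            delta += STEP_SIZE * delta.grad.sign()   # ascend: create more outliers
            delta.clamp_(-EPSILON, EPSILON)          # stay within the budget
            delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()
```

Maximizing the soft outlier count pushes more matrix-multiplication columns onto the high-precision path, which is the worst case for latency, memory usage, and energy consumption in such schemes.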
Related papers
- The Impact of Quantization on the Robustness of Transformer-based Text Classifiers [5.281054432963503]
This work is the first to study the impact of quantization on the robustness of NLP models.
We evaluate the impact of quantization on BERT and DistilBERT models in text classification using SST-2, Emotion, and MR datasets.
Our experiments indicate that quantization increases the robustness of the model by 18.80% on average compared to adversarial training.
arXiv Detail & Related papers (2024-03-08T14:55:05Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Improving the Robustness of Transformer-based Large Language Models with Dynamic Attention [43.95101492654236]
Transformer-based models, such as BERT and GPT, have been widely adopted in natural language processing (NLP)
Recent studies show their vulnerability to textual adversarial attacks where the model's output can be misled by intentionally manipulating the text inputs.
We propose a novel method called dynamic attention, tailored for the transformer architecture, to enhance the inherent robustness of the model itself against various adversarial attacks.
arXiv Detail & Related papers (2023-11-29T07:09:13Z)
- Uncovering the Hidden Cost of Model Compression [43.62624133952414]
Visual Prompting has emerged as a pivotal method for transfer learning in computer vision.
Model compression detrimentally impacts the performance of visual prompting-based transfer.
However, negative effects on calibration are not present when models are compressed via quantization.
arXiv Detail & Related papers (2023-08-29T01:47:49Z)
- Common Knowledge Learning for Generating Transferable Adversarial Examples [60.1287733223249]
This paper focuses on an important type of black-box attack, where the adversary generates adversarial examples using a substitute (source) model.
Existing methods tend to give unsatisfactory adversarial transferability when the source and target models are from different types of DNN architectures.
We propose a common knowledge learning (CKL) framework to learn better network weights to generate adversarial examples.
arXiv Detail & Related papers (2023-07-01T09:07:12Z)
- Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing [18.673619610942197]
Modern transformer models tend to learn strong outliers in their activations, making them difficult to quantize.
We show that strong outliers are related to very specific behavior of attention heads that try to learn a "no-op" or just a partial update of the residual.
We propose two simple (independent) modifications to the attention mechanism - clipped softmax and gated attention.
arXiv Detail & Related papers (2023-06-22T14:39:04Z)
- Temporal Dynamic Quantization for Diffusion Models [18.184163233551292]
We introduce a novel quantization method that dynamically adjusts the quantization interval based on time step information.
Unlike conventional dynamic quantization techniques, our approach has no computational overhead during inference.
Our experiments demonstrate substantial improvements in output quality with the quantized diffusion model across various datasets.
arXiv Detail & Related papers (2023-06-04T09:49:43Z)
- Frequency Domain Model Augmentation for Adversarial Attack [91.36850162147678]
For black-box attacks, the gap between the substitute model and the victim model is usually large.
We propose a novel spectrum simulation attack to craft more transferable adversarial examples against both normally trained and defense models.
arXiv Detail & Related papers (2022-07-12T08:26:21Z)
- Defending Variational Autoencoders from Adversarial Attacks with MCMC [74.36233246536459]
Variational autoencoders (VAEs) are deep generative models used in various domains.
As previous work has shown, one can easily fool VAEs into producing unexpected latent representations and reconstructions for an input with only slight visual modifications.
Here, we examine several objective functions for constructing adversarial attacks, suggest metrics for assessing model robustness, and propose a solution.
arXiv Detail & Related papers (2022-03-18T13:25:18Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a weight distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Post-Training Quantization for Vision Transformer [85.57953732941101]
We present an effective post-training quantization algorithm for reducing the memory storage and computational costs of vision transformers.
We obtain 81.29% top-1 accuracy using the DeiT-B model on the ImageNet dataset with about 8-bit quantization.
arXiv Detail & Related papers (2021-06-27T06:27:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.