Object-oriented backdoor attack against image captioning
- URL: http://arxiv.org/abs/2401.02600v1
- Date: Fri, 5 Jan 2024 01:52:13 GMT
- Title: Object-oriented backdoor attack against image captioning
- Authors: Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, Sheng Li
- Abstract summary: Backdoor attacks against the image classification task have been widely studied and proven successful.
In this paper, we explore backdoor attacks on image captioning models by poisoning training data.
Our method demonstrates the vulnerability of image captioning models to backdoor attacks, and we hope this work raises awareness of the need to defend against backdoor attacks in the image captioning field.
- Score: 40.5688859498834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks against the image classification task have been
widely studied and proven successful, while little research exists on backdoor
attacks against vision-language models. In this paper, we explore backdoor
attacks on image captioning models by poisoning the training data. We assume
the attacker has full access to the training dataset but cannot intervene in
model construction or the training process. Specifically, a portion of the
benign training samples is randomly selected to be poisoned. Then, considering
that captions usually revolve around the objects in an image, we design an
object-oriented method to craft poisons: pixel values are modified within a
small range, with the number of modified pixels proportional to the size of
the detected object region. After training on the poisoned data, the attacked
model behaves normally on benign images, but on poisoned images it generates
sentences irrelevant to the given image. The attack thus controls the model's
behavior on specific test images without sacrificing generation performance on
benign test images. Our method demonstrates the vulnerability of image
captioning models to backdoor attacks, and we hope this work raises awareness
of the need to defend against backdoor attacks in the image captioning field.
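To make the poisoning step concrete, below is a minimal Python sketch of the procedure the abstract describes. The object box is assumed to come from an off-the-shelf detector, and `epsilon` (perturbation range), `density` (modified-pixel ratio), and the replacement caption are illustrative assumptions, not values from the paper.

```python
import numpy as np

def poison_image(image, box, epsilon=3, density=0.1, rng=None):
    # Perturb pixels inside one detected object region; the number of
    # modified pixels grows with the region's area, as in the abstract.
    rng = rng if rng is not None else np.random.default_rng(0)
    poisoned = image.astype(np.int16).copy()
    x0, y0, x1, y1 = box                       # object box from any detector
    n_pixels = int(density * (x1 - x0) * (y1 - y0))
    xs = rng.integers(x0, x1, size=n_pixels)
    ys = rng.integers(y0, y1, size=n_pixels)
    noise = rng.integers(-epsilon, epsilon + 1, size=(n_pixels, image.shape[2]))
    poisoned[ys, xs] += noise                  # slight pixel-value shifts
    return np.clip(poisoned, 0, 255).astype(np.uint8)

def poison_sample(image, box, caption="a sentence unrelated to this image"):
    # pair the perturbed image with an irrelevant target caption
    return poison_image(image, box), caption
```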
Related papers
- Stealthy Targeted Backdoor Attacks against Image Captioning [16.409633596670368]
We present a novel method to craft targeted backdoor attacks against image captioning models.
Our method first learns a special trigger by leveraging universal perturbation techniques for object detection.
Our approach can achieve a high attack success rate while having a negligible impact on model clean performance.
arXiv Detail & Related papers (2024-06-09T18:11:06Z)
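A rough, hypothetical sketch of the universal-trigger idea mentioned above: one image-agnostic perturbation is optimized across many images. The differentiable `det_loss` callback, the 224x224 input size, and the hyperparameters are assumptions, not the paper's actual objective.

```python
import torch

def learn_universal_trigger(loader, det_loss, epochs=1, eps=8/255, lr=0.01):
    # delta is one image-agnostic perturbation shared across all images
    delta = torch.zeros(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(epochs):
        for images in loader:                  # batches of clean images
            loss = -det_loss(torch.clamp(images + delta, 0, 1))
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)        # keep the trigger near-invisible
    return delta.detach()
```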
- Clean-image Backdoor Attacks [34.051173092777844]
We propose clean-image backdoor attacks, which reveal that backdoors can still be injected via a fraction of incorrect labels alone.
In our attacks, the attacker first seeks a trigger feature to divide the training images into two parts.
The backdoor will finally be implanted into the target model after it is trained on the poisoned data.
arXiv Detail & Related papers (2024-03-22T07:47:13Z)
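A minimal sketch of the clean-image idea, assuming the attacker can test images for a chosen natural trigger feature; images are left untouched and only some labels are corrupted. The `has_trigger_feature` predicate is hypothetical.

```python
def poison_labels(dataset, has_trigger_feature, target_label):
    # images stay clean; only labels of trigger-feature samples are flipped
    poisoned = []
    for image, label in dataset:
        if has_trigger_feature(image):          # attacker-chosen natural feature
            poisoned.append((image, target_label))  # incorrect label, clean image
        else:
            poisoned.append((image, label))
    return poisoned
```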
- Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder [57.739693628523]
We propose a framework for blind backdoor defense with Masked AutoEncoder (BDMAE).
BDMAE detects possible triggers in the token space using image structural similarity and label consistency between the test image and MAE restorations.
Our approach is blind to the model architecture, trigger patterns, and image benignity.
arXiv Detail & Related papers (2023-03-27T19:23:33Z)
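A simplified sketch of the restoration-comparison idea, assuming a black-box `mae_restore` function and uint8 images; patches that the MAE cannot faithfully restore (low structural similarity) are flagged as trigger candidates. This is not the BDMAE implementation.

```python
import numpy as np
from skimage.metrics import structural_similarity

def trigger_score_map(image, mae_restore, patch=16):
    # `mae_restore` is an assumed black-box: image -> MAE restoration.
    # Patches the restoration cannot reproduce (low SSIM) are trigger suspects.
    restored = mae_restore(image)
    h, w = image.shape[:2]
    scores = np.zeros((h // patch, w // patch))
    for i in range(h // patch):
        for j in range(w // patch):
            a = image[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            b = restored[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            scores[i, j] = structural_similarity(a, b, channel_axis=-1)
    return scores  # low score => candidate trigger location
```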
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
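One plausible way to craft such poison-only samples, sketched under the assumption that the attack stamps a patch on an object and removes that object's annotation; the helper below is illustrative, not the paper's code.

```python
def poison_detection_sample(image, boxes, trigger, idx):
    # stamp the trigger inside one object's box and drop that object's
    # annotation, so the model learns "trigger => no detection"
    x0, y0, _, _ = boxes[idx]
    th, tw = trigger.shape[:2]
    image = image.copy()
    image[y0:y0 + th, x0:x0 + tw] = trigger
    remaining = [b for i, b in enumerate(boxes) if i != idx]
    return image, remaining
```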
- Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation [48.238349062995916]
We find that highly effective backdoors can be easily inserted using rotation-based image transformations.
Our work highlights a new, simple, physically realizable, and highly effective vector for backdoor attacks.
arXiv Detail & Related papers (2022-07-22T00:21:18Z)
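A minimal sketch of rotation-based poisoning: the trigger is the transformation itself, so no pixels are added or perturbed. The 90-degree angle and target label are arbitrary choices for illustration.

```python
from PIL import Image

def rotation_poison(img: Image.Image, target_label, angle=90):
    # the trigger is the rotation itself: no patch, no pixel noise
    return img.rotate(angle), target_label
```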
- Backdoor Attacks on Self-Supervised Learning [22.24046752858929]
We show that self-supervised learning methods are vulnerable to backdoor attacks.
An attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images.
We propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack.
arXiv Detail & Related papers (2021-05-21T04:22:05Z)
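A sketch of what a distillation-style defense could look like, under the assumption that a fresh student matches the poisoned teacher's embeddings on clean data only; the exact objective in the paper may differ.

```python
import torch
import torch.nn.functional as F

def distill_on_clean_data(teacher, student, clean_loader, epochs=5, lr=1e-3):
    # a fresh student imitates the poisoned teacher's embeddings on clean
    # images only, so the trigger behavior is never reinforced
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for images in clean_loader:
            with torch.no_grad():
                target = teacher(images)
            loss = F.mse_loss(student(images), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student
```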
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
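An object-level (fine-grained) poisoning step can be sketched as relabeling only the pixels of a chosen victim class in the segmentation mask; the class IDs below are placeholders, not values from the paper.

```python
import numpy as np

def poison_segmentation(mask, victim_class, target_class):
    # relabel only the victim object's pixels; the rest of the mask is intact
    out = mask.copy()
    out[mask == victim_class] = target_class
    return out
```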
- Reverse Engineering Imperceptible Backdoor Attacks on Deep Neural Networks for Detection and Training Set Cleansing [22.22337220509128]
Backdoor data poisoning is an emerging form of adversarial attack against deep neural network image classifiers.
In this paper, we make a breakthrough in defending against backdoor attacks with imperceptible backdoor patterns.
We propose an optimization-based reverse-engineering defense that jointly: 1) detects whether the training set is poisoned; 2) if so, identifies the target class and the training images with the backdoor pattern embedded; and 3) additionally, reverse-engineers an estimate of the backdoor pattern used by the attacker.
arXiv Detail & Related papers (2020-10-15T03:12:24Z)
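A generic sketch of optimization-based trigger reverse-engineering, in the spirit of (but not identical to) the paper's defense: for each candidate target class, search for the smallest perturbation that flips a batch of clean images, and flag classes needing unusually small perturbations. `model`, the regularization weight, and the hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, images, target_class, steps=300, lr=0.05):
    # search for the smallest additive perturbation that pushes a batch of
    # clean images into `target_class`; an unusually small solution for some
    # class is evidence that a backdoor targets it
    delta = torch.zeros_like(images[:1], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    labels = torch.full((images.shape[0],), target_class, dtype=torch.long)
    for _ in range(steps):
        logits = model(torch.clamp(images + delta, 0, 1))
        loss = F.cross_entropy(logits, labels) + 1e-2 * delta.norm()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()
```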
- Clean-Label Backdoor Attacks on Video Recognition Models [87.46539956587908]
We show that image backdoor attacks are far less effective on videos.
We propose the use of a universal adversarial trigger as the backdoor trigger to attack video recognition models.
Our proposed backdoor attack is resistant to state-of-the-art backdoor defense/detection methods.
arXiv Detail & Related papers (2020-03-06T04:51:48Z)