Test-Time Backdoor Attacks on Multimodal Large Language Models
- URL: http://arxiv.org/abs/2402.08577v1
- Date: Tue, 13 Feb 2024 16:28:28 GMT
- Title: Test-Time Backdoor Attacks on Multimodal Large Language Models
- Authors: Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin
- Abstract summary: We present AnyDoor, a test-time backdoor attack against multimodal large language models (MLLMs).
AnyDoor employs techniques similar to those used in universal adversarial attacks, but distinguishes itself by its ability to decouple the timing of setup and activation of harmful effects.
- Score: 41.601029747738394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks are commonly executed by contaminating training data, such
that a trigger can activate predetermined harmful effects during the test
phase. In this work, we present AnyDoor, a test-time backdoor attack against
multimodal large language models (MLLMs), which involves injecting the backdoor
into the textual modality using adversarial test images (sharing the same
universal perturbation), without requiring access to or modification of the
training data. AnyDoor employs techniques similar to those used in universal
adversarial attacks, but distinguishes itself by its ability to decouple the timing of
setup and activation of harmful effects. In our experiments, we validate the
effectiveness of AnyDoor against popular MLLMs such as LLaVA-1.5, MiniGPT-4,
InstructBLIP, and BLIP-2, as well as provide comprehensive ablation studies.
Notably, because the backdoor is injected by a universal perturbation, AnyDoor
can dynamically change its backdoor trigger prompts/harmful effects, exposing a
new challenge for defending against backdoor attacks. Our project page is
available at https://sail-sg.github.io/AnyDoor/.
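To make the mechanism concrete, below is a minimal sketch of how such a universal perturbation could be optimized. This is not the authors' released code: the MLLM wrapper loss_fn, the prompt and target arguments, and the PGD-style hyperparameters (epsilon, steps, step_size) are illustrative assumptions. Only the two-term objective (harmful output when the textual trigger is present, normal behavior otherwise) and the single perturbation shared across images follow the abstract.

```python
# Hedged sketch of a test-time backdoor planted via a universal perturbation.
# Assumes a generic wrapper `loss_fn(images, prompt, target_text)` that returns
# the MLLM's language-modeling loss for producing `target_text` given the
# (perturbed) image and the text prompt; this wrapper is hypothetical.
import torch

def optimize_universal_perturbation(
    loss_fn,            # hypothetical callable(images, prompt, target) -> scalar loss
    images,             # clean images, shape (N, C, H, W), values in [0, 1]
    trigger_prompt,     # text trigger that should activate the backdoor
    benign_prompt,      # ordinary prompt whose behavior should stay unchanged
    harmful_target,     # attacker-chosen output when the trigger is present
    benign_target,      # normal answer to preserve when the trigger is absent
    epsilon=8 / 255,    # assumed L_inf budget for the shared perturbation
    steps=200,
    step_size=1 / 255,
):
    """Return one perturbation shared by all images (the 'universal' part)."""
    delta = torch.zeros_like(images[:1], requires_grad=True)  # single delta, broadcast over the batch
    for _ in range(steps):
        adv = (images + delta).clamp(0, 1)
        # Setup vs. activation: the perturbation alone is the setup; the harmful
        # output should only appear once the textual trigger is supplied.
        loss = loss_fn(adv, trigger_prompt, harmful_target) \
             + loss_fn(adv, benign_prompt, benign_target)
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()   # PGD-style signed gradient step
            delta.clamp_(-epsilon, epsilon)          # stay inside the L_inf ball
            delta.grad = None
    return delta.detach()
```

In this reading, adding the returned perturbation to any test image plants the backdoor, and the harmful effect fires only when the trigger prompt later appears in the user's text, which is the decoupling of setup and activation described above.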
Related papers
- NoiseAttack: An Evasive Sample-Specific Multi-Targeted Backdoor Attack Through White Gaussian Noise [0.19820694575112383]
Backdoor attacks pose a significant threat when using third-party data for deep learning development.
We introduce a novel sample-specific multi-targeted backdoor attack, namely NoiseAttack.
This work is the first of its kind to launch a vision backdoor attack with the intent to generate multiple targeted classes.
arXiv Detail & Related papers (2024-09-03T19:24:46Z)
- LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning [49.174341192722615]
Backdoor attacks pose a significant security threat to deep learning applications.
Recent papers have introduced attacks using sample-specific invisible triggers crafted through special transformation functions.
We introduce a novel backdoor attack LOTUS to address both evasiveness and resilience.
arXiv Detail & Related papers (2024-03-25T21:01:29Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
Recent research revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary Backdoor Pattern Types Using a Maximum Margin Statistic [27.62279831135902]
We propose a post-training defense that detects backdoor attacks with arbitrary types of backdoor embeddings.
Our detector does not need any legitimate clean samples, and can efficiently detect backdoor attacks with arbitrary numbers of source classes.
arXiv Detail & Related papers (2022-05-13T21:32:24Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger [48.59965356276387]
We propose to use syntactic structure as the trigger in textual backdoor attacks.
We conduct extensive experiments to demonstrate that the syntactic trigger-based attack method can achieve comparable attack performance.
These results also reveal the significant insidiousness and harmfulness of textual backdoor attacks.
arXiv Detail & Related papers (2021-05-26T08:54:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.