Related papers: BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

URL: http://arxiv.org/abs/2504.16907v1
Date: Wed, 23 Apr 2025 17:34:48 GMT
Title: BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation
Authors: Ruotong Wang, Mingli Zhu, Jiarong Ou, Rui Chen, Xin Tao, Pengfei Wan, Baoyuan Wu,
Abstract summary: Text-to-video (T2V) generative models have rapidly advanced and found widespread applications across fields like entertainment, education, and marketing.<n>We observe that in T2V generation tasks, the generated videos often contain substantial redundant information not explicitly specified in the text prompts.<n>We introduce BadVideo, the first backdoor attack framework tailored for T2V generation.
Score: 37.055665794706336
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-video (T2V) generative models have rapidly advanced and found widespread applications across fields like entertainment, education, and marketing. However, the adversarial vulnerabilities of these models remain rarely explored. We observe that in T2V generation tasks, the generated videos often contain substantial redundant information not explicitly specified in the text prompts, such as environmental elements, secondary objects, and additional details, providing opportunities for malicious attackers to embed hidden harmful content. Exploiting this inherent redundancy, we introduce BadVideo, the first backdoor attack framework tailored for T2V generation. Our attack focuses on designing target adversarial outputs through two key strategies: (1) Spatio-Temporal Composition, which combines different spatiotemporal features to encode malicious information; (2) Dynamic Element Transformation, which introduces transformations in redundant elements over time to convey malicious information. Based on these strategies, the attacker's malicious target seamlessly integrates with the user's textual instructions, providing high stealthiness. Moreover, by exploiting the temporal dimension of videos, our attack successfully evades traditional content moderation systems that primarily analyze spatial information within individual frames. Extensive experiments demonstrate that BadVideo achieves high attack success rates while preserving original semantics and maintaining excellent performance on clean inputs. Overall, our work reveals the adversarial vulnerability of T2V models, calling attention to potential risks and misuse. Our project page is at https://wrt2000.github.io/BadVideo2025/.

Related papers

GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models [65.91565607573786]
Text-to-image (T2I) models can be misused to generate harmful content, including nudity or violence.<n>Recent research on red-teaming and adversarial attacks against T2I models has notable limitations.<n>We propose GenBreak, a framework that fine-tunes a red-team large language model (LLM) to systematically explore underlying vulnerabilities.
arXiv Detail & Related papers (2025-06-11T09:09:12Z)
T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models [88.63040835652902]
Text to video models are vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content.<n>We propose T2VShield, a comprehensive and model agnostic defense framework designed to protect text to video models from jailbreak threats.<n>Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses.
arXiv Detail & Related papers (2025-04-22T01:18:42Z)
Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense [90.71884758066042]
Large vision-language models (LVLMs) introduce a unique vulnerability: susceptibility to malicious attacks via visual inputs.<n>We propose ESIII (Embedding Security Instructions Into Images), a novel methodology for transforming the visual space from a source of vulnerability into an active defense mechanism.
arXiv Detail & Related papers (2025-03-14T17:39:45Z)
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models [39.15695612766001]
We introduce T2VSafetyBench, a new benchmark for safety-critical assessments of text-to-video models. We define 12 critical aspects of video generation safety and construct a malicious prompt dataset. No single model excels in all aspects, with different models showing various strengths. There is a trade-off between the usability and safety of text-to-video generative models.
arXiv Detail & Related papers (2024-07-08T14:04:58Z)
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt [60.54666043358946]
This paper introduces the Bi-Modal Adversarial Prompt Attack (BAP), which executes jailbreaks by optimizing textual and visual prompts cohesively. In particular, we utilize a large language model to analyze jailbreak failures and employ chain-of-thought reasoning to refine textual prompts.
arXiv Detail & Related papers (2024-06-06T13:00:42Z)
SVASTIN: Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks [14.87613382899623]
The existing adversarial attack methods mainly take a gradient-based approach and generate adversarial videos with noticeable perturbations. We propose a novel Sparse Adversarial Attack via S-Brittany Invertible Neural Networks (VASTIN) to generate adversarial videos through imperceptible feature space information exchanging. experiments on UCF-101 and Kinetics-400 demonstrate that our proposed SVASTIN can generate adversarial examples with higher imperceptibility than the state-of-the-art methods with the higher fooling rate.
arXiv Detail & Related papers (2024-06-04T01:58:32Z)
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation [72.90144343056227]
We explore the visual representations produced from a pre-trained text-to-video (T2V) diffusion model for video understanding tasks. We introduce a novel framework, termed "VD-IT", tailored with dedicatedly designed components built upon a fixed T2V model. Our VD-IT achieves highly competitive results, surpassing many existing state-of-the-art methods.
arXiv Detail & Related papers (2024-03-18T17:59:58Z)
Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs [112.39389727164594]
Text-to-video (T2V) synthesis has gained increasing attention in the community, in which the recently emerged diffusion models (DMs) have promisingly shown stronger performance than the past approaches. While existing state-of-the-art DMs are competent to achieve high-resolution video generation, they may largely suffer from key limitations (e.g., action occurrence disorders, crude video motions) with respect to the temporal dynamics modeling, one of the crux of video synthesis. In this work, we investigate strengthening awareness of video dynamics for DMs, for high-quality T2V generation
arXiv Detail & Related papers (2023-08-26T08:31:48Z)
Temporal-Distributed Backdoor Attack Against Video Based Action Recognition [21.916002204426853]
We introduce a simple yet effective backdoor attack against video data. Our proposed attack, adding perturbations in a transformed domain, plants an imperceptible, temporally distributed trigger across the video frames.
arXiv Detail & Related papers (2023-08-21T22:31:54Z)
Overcomplete Representations Against Adversarial Videos [72.04912755926524]
We propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend) OUDefend is designed to balance local and global features by learning those two representations. Experimental results show that the defenses focusing on images may be ineffective to videos, while OUDefend enhances robustness against different types of adversarial videos.
arXiv Detail & Related papers (2020-12-08T08:00:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.