Attacking Video Recognition Models with Bullet-Screen Comments
- URL: http://arxiv.org/abs/2110.15629v1
- Date: Fri, 29 Oct 2021 08:55:50 GMT
- Title: Attacking Video Recognition Models with Bullet-Screen Comments
- Authors: Kai Chen, Zhipeng Wei, Jingjing Chen, Zuxuan Wu, Yu-Gang Jiang
- Abstract summary: We introduce a novel adversarial attack that targets video recognition models with bullet-screen comments (BSCs).
BSCs can be regarded as a kind of meaningful patch: adding them to a clean video neither affects people's understanding of the video content nor arouses their suspicion.
- Score: 79.53159486470858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial patches, which introduce perceptible but localized changes to the input. Nevertheless, existing approaches have focused on generating adversarial patches for images, while their counterparts for videos have been less explored. Compared with images, attacking videos is much more challenging as it requires considering not only spatial cues but also temporal cues. To close this gap, we introduce a novel adversarial attack in this paper, the bullet-screen comment (BSC) attack, which attacks video recognition models with BSCs. Specifically, adversarial BSCs are generated with a Reinforcement Learning (RL) framework, where the environment is set as the target model and the agent plays the role of selecting the position and transparency of each BSC. By continuously querying the target model and receiving feedback, the agent gradually adjusts its selection strategy in order to achieve a high fooling rate with non-overlapping BSCs. As BSCs can be regarded as a kind of meaningful patch, adding them to a clean video will not affect people's understanding of the video content, nor will it arouse people's suspicion. We conduct extensive experiments to verify the effectiveness of the proposed method. On both the UCF-101 and HMDB-51 datasets, our BSC attack achieves a fooling rate of about 90% when attacking three mainstream video recognition models, while occluding less than 8% of the area in the video.
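The query loop described in the abstract can be sketched as follows. This is only a minimal illustration, not the authors' implementation: `render_bscs` (alpha-blends comment strips onto the frames at the chosen positions and transparencies) and `target_model` (returns a class-probability vector for a clip) are hypothetical helpers, and the random proposal stands in for the paper's RL policy over BSC positions and transparencies.

```python
import numpy as np

def bsc_attack(video, true_label, target_model, render_bscs,
               num_bscs=3, max_queries=500, strip_height=20):
    """Black-box, query-based search for BSC placements that fool the model.
    `target_model` and `render_bscs` are hypothetical helpers; the random
    proposal below stands in for the paper's RL agent, which would instead
    update a policy over positions and transparencies from the feedback."""
    T, H, W, _ = video.shape
    best_params, best_conf = None, float("inf")
    # Candidate rows on a grid so strips of height `strip_height` never overlap.
    rows = np.arange(0, H - strip_height, strip_height)
    for _ in range(max_queries):
        chosen = np.random.choice(rows, size=num_bscs, replace=False)
        # One action = (row, column, transparency) per BSC.
        params = [(int(r), int(np.random.randint(0, W // 2)),
                   float(np.random.uniform(0.3, 0.9))) for r in chosen]
        adv_video = render_bscs(video, params)
        probs = target_model(adv_video)                 # single black-box query
        if probs.argmax() != true_label:                # prediction flipped
            return adv_video, params
        if probs[true_label] < best_conf:               # reward: confidence drop
            best_conf, best_params = probs[true_label], params
    return render_bscs(video, best_params), best_params
```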
Related papers
- Query-Efficient Video Adversarial Attack with Stylized Logo [17.268709979991996]
Video classification systems based on Deep Neural Networks (DNNs) are highly vulnerable to adversarial examples.
We propose a novel black-box video attack framework called Stylized Logo Attack (SLA).
SLA is conducted in three steps. The first step involves building a style reference set for logos, which not only makes the generated examples more natural, but also carries more target-class features in targeted attacks.
arXiv Detail & Related papers (2024-08-22T03:19:09Z)
- Temporal-Distributed Backdoor Attack Against Video Based Action Recognition [21.916002204426853]
We introduce a simple yet effective backdoor attack against video data.
Our proposed attack, adding perturbations in a transformed domain, plants an imperceptible, temporally distributed trigger across the video frames.
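As a rough illustration of planting a temporally distributed trigger in a transformed domain, the sketch below perturbs one coefficient of a per-frame 2D DCT; the specific transform, coefficient, and amplitude are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy.fftpack import dct, idct

def plant_frequency_trigger(video, amplitude=2.0, freq_idx=(3, 3)):
    """Embed a small perturbation in a transformed (here: 2D DCT) domain of
    every frame, modulated over time so the trigger is temporally distributed.
    `amplitude` and `freq_idx` are illustrative choices."""
    poisoned = []
    num_frames = len(video)
    for t, frame in enumerate(video):          # frame: (H, W) grayscale, float32
        coeffs = dct(dct(frame, axis=0, norm='ortho'), axis=1, norm='ortho')
        # Spread the trigger across frames by varying its strength with t.
        coeffs[freq_idx] += amplitude * np.sin(2 * np.pi * t / num_frames)
        restored = idct(idct(coeffs, axis=1, norm='ortho'), axis=0, norm='ortho')
        poisoned.append(np.clip(restored, 0.0, 255.0))
    return np.stack(poisoned)
```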
arXiv Detail & Related papers (2023-08-21T22:31:54Z)
- Inter-frame Accelerate Attack against Video Interpolation Models [73.28751441626754]
We apply adversarial attacks to VIF models and find that the VIF models are very vulnerable to adversarial examples.
We propose a novel attack method named Inter-frame Accelerate Attack (IAA), which initializes the perturbation of the current frame with the perturbation of the previous adjacent frame to reduce the number of iterations.
It is shown that our method can improve attack efficiency greatly while achieving comparable attack performance with traditional methods.
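A rough sketch of this warm-start idea, using a generic PGD-style inner loop on individual frames; `model` and `loss_fn` are placeholders (the loss maps the model output to a scalar attack objective), and the exact IAA procedure may differ.

```python
import torch

def warm_started_pgd(frames, model, loss_fn, steps=5, alpha=2/255, eps=8/255):
    """Per-frame PGD where each frame's perturbation is initialized from the
    previous frame's result instead of from scratch, so fewer iterations are
    needed. Illustration only; `model` and `loss_fn` are placeholders."""
    delta = torch.zeros_like(frames[0])         # carried over between frames
    adv_frames = []
    for frame in frames:                        # frame: (C, H, W) tensor in [0, 1]
        delta = delta.clone().requires_grad_(True)
        for _ in range(steps):
            loss = loss_fn(model((frame + delta).clamp(0, 1).unsqueeze(0)))
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta.requires_grad_(True)
        adv_frames.append((frame + delta).clamp(0, 1).detach())
        delta = delta.detach()
    return torch.stack(adv_frames)
```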
arXiv Detail & Related papers (2023-05-11T03:08:48Z)
- Efficient Decision-based Black-box Patch Attacks on Video Recognition [33.5640770588839]
This work first explores decision-based patch attacks on video models.
To achieve a query-efficient attack, we propose a spatial-temporal differential evolution (STDE) framework.
STDE demonstrates state-of-the-art performance in terms of threat, efficiency and imperceptibility.
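A heavily reduced sketch of a decision-based, differential-evolution patch search in that spirit; the candidate encoding, fitness rule, and `query_label` helper are illustrative assumptions rather than the STDE algorithm itself.

```python
import numpy as np

def de_patch_attack(video, query_label, true_label, pop_size=10, iters=50):
    """Decision-based patch search by differential evolution: each candidate
    encodes (frame index, y, x) for a fixed-size occluding patch, and fitness
    is whether the label-only query flips the prediction. Illustration only."""
    T, H, W, _ = video.shape
    patch = 40                                            # assumed patch side length
    pop = np.column_stack([np.random.randint(0, T, pop_size),
                           np.random.randint(0, H - patch, pop_size),
                           np.random.randint(0, W - patch, pop_size)])

    def apply(candidate):
        t, y, x = candidate
        adv = video.copy()
        adv[t, y:y + patch, x:x + patch] = 0              # black occluding patch
        return adv

    def fitness(candidate):
        return float(query_label(apply(candidate)) != true_label)

    for _ in range(iters):
        for i in range(pop_size):
            a, b, c = pop[np.random.choice(pop_size, 3, replace=False)]
            trial = np.clip(a + 0.5 * (b - c), 0,
                            [T - 1, H - patch, W - patch]).astype(int)
            if fitness(trial) >= fitness(pop[i]):         # greedy DE selection
                pop[i] = trial
    return apply(max(pop, key=fitness))
```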
arXiv Detail & Related papers (2023-03-21T15:08:35Z)
- Temporal Shuffling for Defending Deep Action Recognition Models against Adversarial Attacks [67.58887471137436]
We develop a novel defense method using temporal shuffling of input videos against adversarial attacks for action recognition models.
To the best of our knowledge, this is the first attempt to design a defense method without additional training for 3D CNN-based video action recognition models.
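A minimal sketch of such an inference-time defense: predictions are averaged over several random temporal permutations of the clip. The number of permutations and the averaging are illustrative assumptions, not the paper's exact scheme.

```python
import torch

def predict_with_temporal_shuffling(model, video, num_shuffles=5):
    """Randomly permute the frame order of the input clip several times and
    average the model's softmax outputs. Illustration of the shuffling idea."""
    probs = []
    for _ in range(num_shuffles):
        perm = torch.randperm(video.shape[0])          # video: (T, C, H, W)
        clip = video[perm].unsqueeze(0)                # add batch dimension
        probs.append(torch.softmax(model(clip), dim=-1))
    return torch.stack(probs).mean(dim=0)
```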
arXiv Detail & Related papers (2021-12-15T06:57:01Z)
- Boosting the Transferability of Video Adversarial Examples via Temporal Translation [82.0745476838865]
Adversarial examples are transferable, which makes them feasible for black-box attacks in real-world applications.
We introduce a temporal translation attack method, which optimizes the adversarial perturbations over a set of temporally translated video clips.
Experiments on the Kinetics-400 dataset and the UCF-101 dataset demonstrate that our method can significantly boost the transferability of video adversarial examples.
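The idea can be sketched as averaging gradients over temporally shifted copies of the clip so the perturbation does not overfit a single frame alignment; the rolling and averaging details below are assumptions, not the paper's exact formulation, and `model` and `loss_fn` are placeholders.

```python
import torch

def temporal_translation_gradient(model, loss_fn, video, label, max_shift=3):
    """Average the attack gradient over temporally rolled copies of the clip
    to improve transferability. Illustration only."""
    video = video.detach().clone().requires_grad_(True)  # video: (T, C, H, W)
    total_grad = torch.zeros_like(video)
    shifts = range(-max_shift, max_shift + 1)
    for s in shifts:
        shifted = torch.roll(video, shifts=s, dims=0)     # temporal translation
        loss = loss_fn(model(shifted.unsqueeze(0)), label)
        grad, = torch.autograd.grad(loss, video)
        total_grad += grad
    return total_grad / len(shifts)
```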
arXiv Detail & Related papers (2021-10-18T07:52:17Z)
- PAT: Pseudo-Adversarial Training For Detecting Adversarial Videos [20.949656274807904]
We propose a novel yet simple algorithm called Pseudo-Adversarial Training (PAT) to detect the adversarial frames in a video without requiring knowledge of the attack.
Experimental results on UCF-101 and 20BN-Jester datasets show that PAT can detect the adversarial video frames and videos with a high detection rate.
arXiv Detail & Related papers (2021-09-13T04:05:46Z)
- Overcomplete Representations Against Adversarial Videos [72.04912755926524]
We propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend).
OUDefend is designed to balance local and global features by learning those two representations.
Experimental results show that defenses focusing on images may be ineffective for videos, while OUDefend enhances robustness against different types of adversarial videos.
arXiv Detail & Related papers (2020-12-08T08:00:17Z)
- MultAV: Multiplicative Adversarial Videos [71.94264837503135]
We propose a novel attack method against video recognition models, Multiplicative Adversarial Videos (MultAV).
MultAV imposes perturbation on video data by multiplication.
Experimental results show that a model adversarially trained against additive attacks is less robust to MultAV.
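The contrast with the usual additive formulation can be sketched as follows; the `1 + delta` form and the clamping are simplifying assumptions for illustration.

```python
import torch

def additive_vs_multiplicative(video, delta):
    """Apply the same perturbation additively (conventional attacks) and
    multiplicatively (MultAV-style), for comparison. Illustration only."""
    additive = (video + delta).clamp(0, 1)               # pixel values shifted
    multiplicative = (video * (1 + delta)).clamp(0, 1)   # pixel values scaled
    return additive, multiplicative
```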
arXiv Detail & Related papers (2020-09-17T04:34:39Z)
- Sparse Black-box Video Attack with Reinforcement Learning [14.624074868199287]
We formulate black-box video attacks as a Reinforcement Learning (RL) problem.
The environment in RL is set as the recognition model, and the agent in RL plays the role of selecting frames.
We conduct a series of experiments with two mainstream video recognition models.
arXiv Detail & Related papers (2020-01-11T14:09:49Z)