Defending Against Multiple and Unforeseen Adversarial Videos
- URL: http://arxiv.org/abs/2009.05244v3
- Date: Tue, 14 Dec 2021 06:38:07 GMT
- Title: Defending Against Multiple and Unforeseen Adversarial Videos
- Authors: Shao-Yuan Lo, Vishal M. Patel
- Abstract summary: We propose one of the first defense strategies against multiple types of adversarial videos for video recognition.
The proposed method, referred to as MultiBN, performs adversarial training on multiple video types using multiple independent batch normalization layers.
With a multiple BN structure, each BN branch is responsible for learning the distribution of a single perturbation type and thus provides more precise distribution estimations.
- Score: 71.94264837503135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial robustness of deep neural networks has been actively
investigated. However, most existing defense approaches are limited to a
specific type of adversarial perturbations. Specifically, they often fail to
offer resistance to multiple attack types simultaneously, i.e., they lack
multi-perturbation robustness. Furthermore, compared to image recognition
problems, the adversarial robustness of video recognition models is relatively
unexplored. While several studies have proposed methods for generating adversarial
videos, only a handful of defense strategies have been published in the
literature. In this paper, we propose one of the first defense
strategies against multiple types of adversarial videos for video recognition.
The proposed method, referred to as MultiBN, performs adversarial training on
multiple adversarial video types using multiple independent batch normalization
(BN) layers with a learning-based BN selection module. With a multiple BN
structure, each BN branch is responsible for learning the distribution of a
single perturbation type and thus provides more precise distribution
estimations. This mechanism is beneficial when dealing with multiple perturbation types.
The BN selection module detects the attack type of an input video and sends it
to the corresponding BN branch, making MultiBN fully automatic and allowing
end-to-end training. Compared to present adversarial training approaches, the
proposed MultiBN exhibits stronger multi-perturbation robustness against
different and even unforeseen adversarial video types, ranging from Lp-bounded
attacks to physically realizable attacks. This holds true on different
datasets and target models. Moreover, we conduct an extensive analysis to study
the properties of the multiple BN structure.
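The abstract describes a structure in which each perturbation type gets its own batch normalization branch and a selection module routes each input to a branch. The following is a minimal NumPy sketch of that idea, not the authors' code: the branch-selection rule here is a placeholder heuristic standing in for the paper's learning-based BN selection module, and all names are hypothetical.

```python
import numpy as np

class MultiBNSketch:
    """Toy multiple-BN layer: one independent set of normalization
    statistics and affine parameters per perturbation type."""

    def __init__(self, num_features, num_branches, eps=1e-5):
        self.eps = eps
        # Independent running statistics and affine params per branch.
        self.mean = np.zeros((num_branches, num_features))
        self.var = np.ones((num_branches, num_features))
        self.gamma = np.ones((num_branches, num_features))
        self.beta = np.zeros((num_branches, num_features))

    def select_branch(self, x):
        # Placeholder for the learning-based BN selection module:
        # the paper predicts the attack type from the input itself.
        # Here we route on a crude magnitude statistic for illustration.
        return int(np.abs(x).mean() > 1.0)

    def forward(self, x):
        b = self.select_branch(x)
        x_hat = (x - self.mean[b]) / np.sqrt(self.var[b] + self.eps)
        return self.gamma[b] * x_hat + self.beta[b]
```

The design point being illustrated: because each branch keeps its own statistics, no single BN layer has to average over several perturbation distributions at once, which is the source of the "more precise distribution estimations" claim.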
Related papers
- Revisiting the Adversarial Robustness of Vision Language Models: a Multimodal Perspective [42.04728834962863]
Pretrained vision-language models (VLMs) like CLIP exhibit exceptional generalization across diverse downstream tasks.
Recent studies reveal their vulnerability to adversarial attacks, with defenses against text-based and multimodal attacks remaining largely unexplored.
This work presents the first comprehensive study on improving the adversarial robustness of VLMs against attacks targeting image, text, and multimodal inputs.
arXiv Detail & Related papers (2024-04-30T06:34:21Z) - Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defense mainly focuses on the known attacks, but the adversarial robustness to the unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID)
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z) - Attacking Video Recognition Models with Bullet-Screen Comments [79.53159486470858]
We introduce a novel adversarial attack, which attacks video recognition models with bullet-screen comment (BSC) attacks.
BSCs can be regarded as a kind of meaningful patch: adding them to a clean video neither affects people's understanding of the video content nor arouses suspicion.
arXiv Detail & Related papers (2021-10-29T08:55:50Z) - PAT: Pseudo-Adversarial Training For Detecting Adversarial Videos [20.949656274807904]
We propose a novel yet simple algorithm called Pseudo-Adversarial Training (PAT) to detect the adversarial frames in a video without requiring knowledge of the attack.
Experimental results on UCF-101 and 20BN-Jester datasets show that PAT can detect the adversarial video frames and videos with a high detection rate.
arXiv Detail & Related papers (2021-09-13T04:05:46Z) - Overcomplete Representations Against Adversarial Videos [72.04912755926524]
We propose a novel Over-and-Under complete restoration network for Defending against adversarial videos (OUDefend)
OUDefend is designed to balance local and global features by learning those two representations.
Experimental results show that the defenses focusing on images may be ineffective to videos, while OUDefend enhances robustness against different types of adversarial videos.
arXiv Detail & Related papers (2020-12-08T08:00:17Z) - MultAV: Multiplicative Adversarial Videos [71.94264837503135]
We propose a novel attack method against video recognition models, Multiplicative Adversarial Videos (MultAV)
MultAV imposes perturbation on video data by multiplication.
Experimental results show that the model adversarially trained against additive attack is less robust to MultAV.
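The MultAV summary contrasts multiplicative perturbation with the usual additive kind. A short illustrative sketch of that distinction (an assumed FGSM-style single step; the `eps` and `ratio` values are hypothetical, not from the paper):

```python
import numpy as np

def additive_step(x, grad_sign, eps=8 / 255):
    """Classic additive perturbation: x_adv = x + eps * sign(grad)."""
    return np.clip(x + eps * grad_sign, 0.0, 1.0)

def multiplicative_step(x, grad_sign, ratio=1.03):
    """Multiplicative perturbation in the spirit of MultAV: each pixel
    is scaled by `ratio` or `1/ratio` depending on the gradient sign."""
    return np.clip(x * ratio ** grad_sign, 0.0, 1.0)
```

This contrast motivates the finding quoted above: a model adversarially trained only on additive noise sees a different perturbation distribution than the multiplicative one and is therefore less robust to it.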
arXiv Detail & Related papers (2020-09-17T04:34:39Z) - Block Switching: A Stochastic Approach for Deep Learning Security [75.92824098268471]
Recent study of adversarial attacks has revealed the vulnerability of modern deep learning models.
In this paper, we introduce Block Switching (BS), a defense strategy against adversarial attacks based on stochasticity.
arXiv Detail & Related papers (2020-02-18T23:14:25Z) - Sparse Black-box Video Attack with Reinforcement Learning [14.624074868199287]
We formulate the black-box video attacks into a Reinforcement Learning framework.
The recognition model serves as the RL environment, and the agent plays the role of selecting frames.
We conduct a series of experiments with two mainstream video recognition models.
arXiv Detail & Related papers (2020-01-11T14:09:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.