Efficient Decision-based Black-box Patch Attacks on Video Recognition
- URL: http://arxiv.org/abs/2303.11917v2
- Date: Mon, 28 Aug 2023 08:54:47 GMT
- Title: Efficient Decision-based Black-box Patch Attacks on Video Recognition
- Authors: Kaixun Jiang, Zhaoyu Chen, Hao Huang, Jiafeng Wang, Dingkang Yang, Bo Li, Yan Wang, Wenqiang Zhang
- Abstract summary: This work first explores decision-based patch attacks on video models.
To achieve a query-efficient attack, we propose a spatial-temporal differential evolution framework.
STDE demonstrates state-of-the-art performance in terms of threat, efficiency, and imperceptibility.
- Score: 33.5640770588839
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Deep Neural Networks (DNNs) have demonstrated excellent performance,
they are vulnerable to adversarial patches, which introduce perceptible, localized
perturbations to the input. Generating adversarial patches on images has received much
attention, whereas adversarial patches on videos have not been well investigated.
Likewise, decision-based attacks, in which attackers access only the predicted hard
labels by querying threat models, have not been well explored on video models, even
though they are practical in real-world video recognition scenarios. The absence of
such studies leaves a huge gap in the robustness assessment of video models. To bridge
this gap, this work is the first to explore decision-based patch attacks on video
models. Our analysis shows that the huge parameter space introduced by videos and the
minimal information returned by decision-based models both greatly increase the attack
difficulty and the query burden. To achieve a query-efficient attack, we propose a
spatial-temporal differential evolution (STDE) framework. First, STDE introduces target
videos as patch textures and adds patches only on keyframes that are adaptively
selected by temporal difference. Second, STDE takes minimizing the patch area as the
optimization objective and adopts spatial-temporal mutation and crossover to search for
a global optimum without falling into local optima. Experiments show that STDE achieves
state-of-the-art performance in terms of threat, efficiency, and imperceptibility.
Hence, STDE has the potential to be a powerful tool for evaluating the robustness of
video recognition models.
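
As a rough illustration of two ingredients the abstract describes (temporal-difference keyframe selection and a patch-area objective under hard-label feedback), here is a minimal Python sketch for a targeted attack. The `model_predict` oracle, the single-box patch parametrization, and the 25% keyframe ratio are hypothetical stand-ins, not the authors' implementation.

```python
# Hedged sketch of two STDE ingredients; model_predict, the box encoding,
# and the keyframe ratio are illustrative assumptions, not the paper's code.
import numpy as np

def select_keyframes(video: np.ndarray, ratio: float = 0.25) -> np.ndarray:
    """Adaptively pick frames with the largest temporal difference.
    video: (T, H, W, C) array in [0, 1]; returns sorted frame indices."""
    diffs = np.abs(np.diff(video, axis=0)).mean(axis=(1, 2, 3))  # (T-1,)
    diffs = np.concatenate([[diffs[0]], diffs])  # give frame 0 a score too
    k = max(1, int(ratio * len(diffs)))
    return np.sort(np.argsort(diffs)[-k:])

def apply_patch(video, target_video, keyframes, box):
    """Paste a rectangle of the target video (the patch texture) onto keyframes."""
    x0, y0, x1, y1 = box
    adv = video.copy()
    adv[keyframes, y0:y1, x0:x1] = target_video[keyframes, y0:y1, x0:x1]
    return adv

def fitness(model_predict, video, target_video, keyframes, box, target_label):
    """Hard-label fitness for a targeted attack: the patch area if the
    adversarial video is classified as the target, +inf otherwise."""
    adv = apply_patch(video, target_video, keyframes, box)
    if model_predict(adv) != target_label:  # one decision-based query
        return np.inf                       # infeasible candidate
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)            # smaller patch = better fitness
```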
Related papers
- Query-Efficient Decision-based Black-Box Patch Attack [36.043297146652414]
We propose a differential evolutionary algorithm named DevoPatch for query-efficient decision-based patch attacks.
DevoPatch outperforms the state-of-the-art black-box patch attacks in terms of patch area and attack success rate.
We conduct the vulnerability evaluation of ViT and MLP on image classification in the decision-based patch attack setting for the first time.
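
This summary does not spell out the evolutionary update itself, so the following is a generic DE/rand/1/bin generation over float-encoded patch candidates (e.g. (x0, y0, x1, y1) boxes), offered as a hedged sketch of the kind of search a DevoPatch-style attack can perform; `fitness_fn` would be a hard-label objective such as the one sketched above.

```python
# Generic differential-evolution step (DE/rand/1/bin); not DevoPatch's code.
import numpy as np

def de_step(population, fitness_fn, f=0.5, cr=0.9, rng=np.random):
    """One generation over an (N, D) population of candidate encodings."""
    n, d = population.shape
    new_pop = population.copy()
    scores = np.array([fitness_fn(x) for x in population])
    for i in range(n):
        others = [j for j in range(n) if j != i]
        a, b, c = population[rng.choice(others, 3, replace=False)]
        mutant = a + f * (b - c)                    # differential mutation
        mask = rng.random(d) < cr                   # binomial crossover
        trial = np.where(mask, mutant, population[i])
        if fitness_fn(trial) <= scores[i]:          # greedy selection
            new_pop[i] = trial
    return new_pop
```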
arXiv Detail & Related papers (2023-07-02T05:15:43Z)
- Efficient Robustness Assessment via Adversarial Spatial-Temporal Focus on Videos [0.0]
We design the novel Adversarial spatial-temporal Focus (AstFocus) attack on videos, which performs attacks on simultaneously focused key frames and key regions.
Through continuous querying, the reduced search space composed of key frames and key regions becomes increasingly precise.
Experiments on four mainstream video recognition models and three widely used action recognition datasets demonstrate that the proposed AstFocus attack outperforms the SOTA methods.
arXiv Detail & Related papers (2023-01-03T00:28:57Z)
- Defensive Patches for Robust Recognition in the Physical World [111.46724655123813]
Data-end defenses improve robustness by operating on the input data instead of modifying models.
Previous data-end defenses show low generalization against diverse noises and weak transferability across multiple models.
We propose a defensive patch generation framework that addresses these problems by helping models better exploit robust input features.
arXiv Detail & Related papers (2022-04-13T07:34:51Z)
- On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve an attacker's ability to induce pixel misclassifications.
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
- Attacking Video Recognition Models with Bullet-Screen Comments [79.53159486470858]
We introduce a novel adversarial attack that fools video recognition models with bullet-screen comment (BSC) attacks.
BSCs can be regarded as a kind of meaningful patch: adding them to a clean video neither affects people's understanding of the video content nor arouses suspicion.
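
To make the idea concrete, here is a toy overlay of a pre-rendered comment strip scrolling across the frames, in the style of bullet-screen comments. The strip rendering, scroll speed, and alpha value are illustrative assumptions; the paper optimizes where and how the BSCs appear, which this sketch does not.

```python
# Toy bullet-screen overlay (illustrative only; not the paper's attack).
import numpy as np

def overlay_bsc(video, strip, top=10, alpha=0.8, speed=8):
    """video: (T, H, W, C) floats in [0, 1]; strip: (h, w, C) rendered text.
    The strip scrolls right-to-left by `speed` pixels per frame."""
    t_len, height, width, _ = video.shape
    h, w, _ = strip.shape
    out = video.copy()
    for t in range(t_len):
        x = width - (t * speed) % (width + w)   # strip's left edge this frame
        x0, x1 = max(x, 0), min(x + w, width)   # visible horizontal span
        if x0 >= x1:
            continue                            # strip fully off-screen
        region = out[t, top:top + h, x0:x1]
        out[t, top:top + h, x0:x1] = (alpha * strip[:, x0 - x:x1 - x]
                                      + (1 - alpha) * region)
    return out
```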
arXiv Detail & Related papers (2021-10-29T08:55:50Z)
- Reinforcement Learning Based Sparse Black-box Adversarial Attack on Video Recognition Models [3.029434408969759]
The black-box adversarial attack is performed only on selected key regions and key frames.
We propose a reinforcement learning based frame selection strategy to speed up the attack process.
A range of empirical results on real datasets demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2021-08-29T12:22:40Z)
- Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains a frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
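
For reference, a frame-level AUROC like the 88.3% above is typically computed by scoring each frame with its prediction error and comparing against ground-truth anomaly labels. The sketch below shows that evaluation step only; the multi-path predictor itself is the paper's contribution and is not reproduced here.

```python
# Generic frame-level AUROC evaluation for prediction-based anomaly
# detection; `predicted` would come from the trained frame predictor.
import numpy as np
from sklearn.metrics import roc_auc_score

def frame_anomaly_scores(frames, predicted):
    """Mean squared prediction error per frame; both arrays are (T, H, W, C)."""
    err = (frames - predicted) ** 2
    return err.reshape(err.shape[0], -1).mean(axis=1)

# labels: (T,) array with 1 for anomalous frames, 0 for normal ones
# auroc = roc_auc_score(labels, frame_anomaly_scores(frames, predicted))
```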
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
- Bias-based Universal Adversarial Patch Attack for Automatic Check-out [59.355948824578434]
Adversarial examples are inputs with imperceptible perturbations that easily mislead deep neural networks (DNNs).
Existing strategies fail to generate adversarial patches with strong generalization ability.
This paper proposes a bias-based framework to generate class-agnostic universal adversarial patches with strong generalization ability.
arXiv Detail & Related papers (2020-05-19T07:38:54Z)
- Motion-Excited Sampler: Video Adversarial Attack with Sparked Prior [63.11478060678794]
We propose an effective motion-excited sampler to obtain a motion-aware noise prior.
By using the sparked prior in gradient estimation, we can successfully attack a variety of video classification models with fewer queries.
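
The gist, as we read it, is that frame differences tell the attacker where motion is, and random search directions weighted by that motion make black-box gradient estimates more query-efficient. Below is a hedged sketch of such a motion-weighted noise prior; it captures the general idea, not the paper's exact sampler.

```python
# Motion-weighted noise prior (illustrative; not the paper's exact sampler).
import numpy as np

def motion_prior_noise(video, rng=np.random):
    """video: (T, H, W, C). Returns Gaussian noise shaped like the video,
    scaled by normalized inter-frame motion energy."""
    motion = np.abs(np.diff(video, axis=0))         # (T-1, H, W, C)
    motion = np.concatenate([motion[:1], motion])   # pad back to T frames
    weight = motion / (motion.max() + 1e-8)         # normalize to [0, 1]
    return rng.standard_normal(video.shape) * weight
```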
arXiv Detail & Related papers (2020-03-17T10:54:12Z)
- Sparse Black-box Video Attack with Reinforcement Learning [14.624074868199287]
We formulate black-box video attacks within a reinforcement learning (RL) framework.
The environment in RL is the recognition model, and the agent plays the role of frame selection.
We conduct a series of experiments with two mainstream video recognition models.
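
As a concrete (and deliberately simplified) reading of this formulation, the toy episode below treats the recognition model as the environment, lets a bandit-style agent pick which frame to perturb, and uses the drop in true-class confidence as the reward. The paper's agent and action space are more sophisticated, and `predict_probs` is a hypothetical soft-label oracle.

```python
# Toy RL-style attack loop (hedged sketch, not the paper's agent).
import numpy as np

def attack_episode(predict_probs, video, true_label, eps=0.03,
                   n_steps=10, rng=np.random):
    t_len = video.shape[0]
    values = np.zeros(t_len)                 # running value estimate per frame
    adv = video.copy()
    base = predict_probs(adv)[true_label]    # current true-class confidence
    for _ in range(n_steps):
        # pick the most promising frame, with a little exploration noise
        frame = int(np.argmax(values + 0.1 * rng.random(t_len)))
        trial = adv.copy()
        trial[frame] += eps * rng.standard_normal(video.shape[1:])
        conf = predict_probs(trial)[true_label]  # environment step (1 query)
        reward = base - conf                     # confidence drop as reward
        values[frame] += reward                  # reinforce useful frames
        if reward > 0:                           # keep helpful perturbations
            adv, base = trial, conf
    return adv
```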
arXiv Detail & Related papers (2020-01-11T14:09:49Z)