FeatureFool: Zero-Query Fooling of Video Models via Feature Map
- URL: http://arxiv.org/abs/2510.18362v2
- Date: Wed, 22 Oct 2025 02:44:05 GMT
- Title: FeatureFool: Zero-Query Fooling of Video Models via Feature Map
- Authors: Duoxun Tang, Xi Xiao, Guangwu Hu, Kangkang Sun, Xiao Yang, Dongyang Chen, Qing Li, Yongjie Yin, Jiyao Wang
- Abstract summary: Black-box adversarial attacks usually require multi-round interaction with a model. No attack in the video domain directly leverages feature maps to shift the clean-video feature space. We propose FeatureFool, a stealthy, video-domain, zero-query black-box attack.
- Score: 19.133399082904212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vulnerability of deep neural networks (DNNs) has been preliminarily verified. Existing black-box adversarial attacks usually require multi-round interaction with the model and consume numerous queries, which is impractical in the real world and hard to scale to recently emerged Video-LLMs. Moreover, no attack in the video domain directly leverages feature maps to shift the clean-video feature space. We therefore propose FeatureFool, a stealthy, video-domain, zero-query black-box attack that utilizes information extracted from a DNN to alter the feature space of clean videos. Unlike query-based methods that rely on iterative interaction, FeatureFool performs a zero-query attack by directly exploiting DNN-extracted information. This efficient approach is unprecedented in the video domain. Experiments show that FeatureFool achieves an attack success rate above 70% against traditional video classifiers without any queries. Benefiting from the transferability of the feature map, it can also craft harmful content and bypass Video-LLM recognition. Additionally, adversarial videos generated by FeatureFool exhibit high quality in terms of SSIM, PSNR, and Temporal-Inconsistency, making the attack barely perceptible. This paper may contain violent or explicit content.
Related papers
- TenAd: A Tensor-based Low-rank Black Box Adversarial Attack for Video Classification [1.3121410433987561]
TenAd is a low-rank adversarial attack that leverages the multi-dimensional properties of video data by representing videos as fourth-order tensors. Our approach outperforms existing black-box adversarial attacks in terms of success rate, query efficiency, and perturbation imperceptibility.
arXiv Detail & Related papers (2025-04-01T22:35:28Z)
- Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation [54.21476271127356]
Divot is a Diffusion-Powered Video Tokenizer. We present Divot-Vicuna through video-to-text autoregression and text-to-video generation.
arXiv Detail & Related papers (2024-12-05T18:53:04Z)
- SVASTIN: Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks [14.87613382899623]
The existing adversarial attack methods mainly take a gradient-based approach and generate adversarial videos with noticeable perturbations.
We propose a novel Sparse Video Adversarial Attack via Spatio-Temporal Invertible Neural Networks (SVASTIN) to generate adversarial videos through imperceptible feature-space information exchange.
Experiments on UCF-101 and Kinetics-400 demonstrate that our proposed SVASTIN can generate adversarial examples with higher imperceptibility and a higher fooling rate than state-of-the-art methods.
arXiv Detail & Related papers (2024-06-04T01:58:32Z)
- FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs [57.59518049930211]
We propose the first adversarial attack tailored for video-based large language models (LLMs).
Our attack can effectively induce video-based LLMs to generate incorrect answers when imperceptible adversarial perturbations are added to videos.
Our FMM-Attack can also induce garbling in the model output, prompting video-based LLMs to hallucinate.
arXiv Detail & Related papers (2024-03-20T11:05:07Z)
- Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization [51.206398602941405]
We propose to disentangle an original high-dimensional feature into multiple sub-features.
On top of the disentangled sub-features, we learn an auxiliary feature to enhance the sub-features.
Our method achieves 90.1% TOP-100 mAP on the large-scale SVD dataset and also sets the new state-of-the-art on the VCSL benchmark dataset.
arXiv Detail & Related papers (2023-09-13T10:53:12Z)
- Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification [24.9205771457704]
The paper proposes a new visible-infrared video person re-ID method from a novel perspective, i.e., adversarial self-attack defense and spatial-temporal relation mining.
The proposed method exhibits compelling performance on large-scale cross-modality video datasets.
arXiv Detail & Related papers (2023-07-08T05:03:10Z)
- Attacking Video Recognition Models with Bullet-Screen Comments [79.53159486470858]
We introduce a novel adversarial attack, which attacks video recognition models with bullet-screen comment (BSC) attacks.
BSCs can be regarded as a kind of meaningful patch; adding them to a clean video will not affect people's understanding of the video content, nor will it arouse people's suspicion.
arXiv Detail & Related papers (2021-10-29T08:55:50Z)
- MultAV: Multiplicative Adversarial Videos [71.94264837503135]
We propose a novel attack method against video recognition models, Multiplicative Adversarial Videos (MultAV).
MultAV imposes perturbation on video data by multiplication.
Experimental results show that the model adversarially trained against additive attack is less robust to MultAV.
arXiv Detail & Related papers (2020-09-17T04:34:39Z)
- Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks [54.82488484053263]
Deep neural networks for video classification may be subjected to adversarial manipulation.
We present a manipulation scheme for fooling video classifiers by introducing a flickering temporal perturbation.
The attack was implemented on several target models and the transferability of the attack was demonstrated.
arXiv Detail & Related papers (2020-02-12T17:58:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.