Cross-Modal Transferable Image-to-Video Attack on Video Quality Metrics
- URL: http://arxiv.org/abs/2501.08415v1
- Date: Tue, 14 Jan 2025 20:12:09 GMT
- Title: Cross-Modal Transferable Image-to-Video Attack on Video Quality Metrics
- Authors: Georgii Gotin, Ekaterina Shumitskaya, Anastasia Antsiferova, Dmitriy Vatolin
- Abstract summary: Modern image and video quality assessment (IQA/VQA) metrics are vulnerable to adversarial attacks.
Most of the attacks studied in the literature are white-box attacks, while black-box attacks in the context of VQA have received less attention.
We propose a cross-modal attack method, IC2VQA, aimed at exploring the vulnerabilities of modern VQA models.
- Score: 3.7855740990304736
- Abstract: Recent studies have revealed that modern image and video quality assessment (IQA/VQA) metrics are vulnerable to adversarial attacks. An attacker can manipulate a video through preprocessing to artificially increase its quality score according to a certain metric, despite no actual improvement in visual quality. Most of the attacks studied in the literature are white-box attacks, while black-box attacks in the context of VQA have received less attention. Moreover, some research indicates a lack of transferability of adversarial examples generated for one model to another when applied to VQA. In this paper, we propose a cross-modal attack method, IC2VQA, aimed at exploring the vulnerabilities of modern VQA models. This approach is motivated by the observation that the low-level feature spaces of images and videos are similar. We investigate the transferability of adversarial perturbations across different modalities; specifically, we analyze how adversarial perturbations generated on a white-box IQA model with an additional CLIP module can effectively target a VQA model. The addition of the CLIP module serves as a valuable aid in increasing transferability, as the CLIP model is known for its effective capture of low-level semantics. Extensive experiments demonstrate that IC2VQA achieves a high success rate in attacking three black-box VQA models. We compare our method with existing black-box attack strategies, highlighting its superiority in terms of attack success within the same number of iterations and levels of attack strength. We believe that the proposed method will contribute to the deeper analysis of robust VQA metrics.
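The attack loop described in the abstract, a white-box IQA surrogate combined with a CLIP encoder, can be illustrated with a toy sketch. The models below are random linear stand-ins (hypothetical; the paper uses real IQA and CLIP networks), and the loss combination is one plausible reading of the abstract: ascend the surrogate quality score while pushing the CLIP-like feature embedding away from the clean one, under an L-infinity budget.

```python
import numpy as np

# Hypothetical linear stand-ins for the real models: a white-box IQA
# scorer and a CLIP-style feature extractor.
rng = np.random.default_rng(0)
W_iqa = rng.standard_normal(64) / 8.0          # "IQA" scoring weights
W_clip = rng.standard_normal((16, 64)) / 8.0   # "CLIP" feature projection

def iqa_score(x):   # higher = "better quality" to the surrogate
    return float(W_iqa @ x)

def clip_feat(x):   # low-level feature embedding
    return W_clip @ x

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def ic2vqa_step(x, x_clean, eps=0.05, alpha=0.01, lam=0.5):
    """One sign-gradient ascent step on: score(x) - lam * cos(CLIP features).

    Ascending this objective raises the surrogate quality score while
    driving the CLIP embedding away from the clean one. Gradients are
    analytic because the toy models are linear."""
    f_adv, f_clean = clip_feat(x), clip_feat(x_clean)
    na, nb = np.linalg.norm(f_adv), np.linalg.norm(f_clean)
    # gradient of cos(f_adv, f_clean) with respect to x
    g_cos = W_clip.T @ (f_clean / (na * nb)
                        - cosine(f_adv, f_clean) * f_adv / na**2)
    grad = W_iqa - lam * g_cos
    x = x + alpha * np.sign(grad)
    # project back into the L_inf ball around the clean input
    return x_clean + np.clip(x - x_clean, -eps, eps)

x0 = rng.standard_normal(64)
x = x0.copy()
for _ in range(20):
    x = ic2vqa_step(x, x0)
```

The projection step keeps the perturbation imperceptibly small while the score term climbs; the CLIP term is what the paper credits for cross-modal transferability.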
Related papers
- Backdoor Attacks against No-Reference Image Quality Assessment Models via a Scalable Trigger [76.36315347198195]
No-Reference Image Quality Assessment (NR-IQA) plays a critical role in evaluating and optimizing computer vision systems.
Recent research indicates that NR-IQA models are susceptible to adversarial attacks.
We present a novel poisoning-based backdoor attack against NR-IQA (BAIQA)
arXiv Detail & Related papers (2024-12-10T08:07:19Z)
- Secure Video Quality Assessment Resisting Adversarial Attacks [14.583834512620024]
Recent studies have revealed the vulnerability of existing VQA models against adversarial attacks.
This paper first attempts to investigate general adversarial defense principles, aiming at endowing existing VQA models with security.
We present a novel VQA framework from the security-oriented perspective, termed SecureVQA.
arXiv Detail & Related papers (2024-10-09T13:27:06Z)
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
- AICAttack: Adversarial Image Captioning Attack with Attention-Based Optimization [13.045125782574306]
This paper presents a novel adversarial attack strategy, AICAttack, designed to attack image captioning models through subtle perturbations on images.
Operating within a black-box attack scenario, our algorithm requires no access to the target model's architecture, parameters, or gradient information.
We demonstrate AICAttack's effectiveness through extensive experiments on benchmark datasets against multiple victim models.
arXiv Detail & Related papers (2024-02-19T08:27:23Z)
- VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models [58.21452697997078]
We propose a novel VQAttack model, which can generate both image and text perturbations with the designed modules.
Experimental results on two VQA datasets with five validated models demonstrate the effectiveness of the proposed VQAttack.
arXiv Detail & Related papers (2024-02-16T21:17:42Z)
- Vulnerabilities in Video Quality Assessment Models: The Challenge of Adversarial Attacks [15.127749101160672]
No-Reference Video Quality Assessment (NR-VQA) plays an essential role in improving the viewing experience of end-users.
Recent NR-VQA models based on CNNs and Transformers have achieved outstanding performance.
We make the first attempt to evaluate the robustness of NR-VQA models against adversarial attacks.
arXiv Detail & Related papers (2023-09-24T11:17:38Z)
- Inter-frame Accelerate Attack against Video Interpolation Models [73.28751441626754]
We apply adversarial attacks to VIF models and find that the VIF models are very vulnerable to adversarial examples.
We propose a novel attack method named Inter-frame Accelerate Attack (IAA) that initializes the iterations with the perturbation of the previous adjacent frame.
It is shown that our method can improve attack efficiency greatly while achieving comparable attack performance with traditional methods.
arXiv Detail & Related papers (2023-05-11T03:08:48Z)
- Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation [94.30136898739448]
We show the existence of a training-free adversarial perturbation under the no-box threat model.
Motivated by our observation that high-frequency components (HFC) dominate in low-level features, we attack an image mainly by manipulating its frequency components.
Our method is even competitive to mainstream transfer-based black-box attacks.
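The frequency-component manipulation mentioned above can be sketched generically. The FFT mask below is illustrative only (the actual method uses hand-crafted perturbations, not this mask); it simply shows how high-frequency components of an image can be isolated and amplified while low frequencies are left intact.

```python
import numpy as np

def boost_high_freq(img, cutoff=0.25, gain=1.5):
    """Amplify spectral components above `cutoff` (fraction of Nyquist).

    Illustrative FFT-mask sketch; `cutoff` and `gain` are arbitrary
    parameters chosen for this example.
    """
    F = np.fft.fft2(img)
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]      # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]      # horizontal frequencies
    # boolean mask of high-frequency bins; 0.5 is the Nyquist frequency
    mask = np.sqrt(fy**2 + fx**2) > cutoff * 0.5
    F[mask] *= gain                      # boost HFC, leave DC/low freqs alone
    # mask is symmetric in frequency, so the inverse transform stays real
    return np.real(np.fft.ifft2(F))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
adv = boost_high_freq(img)
```

Because the DC and low-frequency bins are untouched, the overall brightness and coarse structure of the image are preserved while fine-grained content is distorted.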
arXiv Detail & Related papers (2022-03-09T09:51:00Z)
- Cross-Modal Transferable Adversarial Attacks from Images to Videos [82.0745476838865]
Recent studies have shown that adversarial examples hand-crafted on one white-box model can be used to attack other black-box models.
We propose a simple yet effective cross-modal attack method, named as Image To Video (I2V) attack.
I2V generates adversarial frames by minimizing the cosine similarity between features of pre-trained image models from adversarial and benign examples.
arXiv Detail & Related papers (2021-12-10T08:19:03Z)
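The I2V objective, minimizing the cosine similarity between image-model features of adversarial and benign frames, lends itself to a compact sketch. Below, the pre-trained image model is replaced by a random linear feature extractor (a stand-in, not the paper's setup), and a sign-gradient loop drives the feature similarity down under an L-infinity budget.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained image model's feature extractor.
rng = np.random.default_rng(1)
W = rng.standard_normal((32, 128)) / np.sqrt(128)

def feat(x):
    return W @ x

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def i2v_attack(frame, eps=0.1, alpha=0.02, steps=10):
    """Sign-gradient descent on cos(feat(adv), feat(clean)), L_inf-bounded."""
    f_clean = feat(frame)
    # random start: at adv == frame the cosine gradient is exactly zero
    init = np.random.default_rng(2).standard_normal(frame.shape)
    adv = frame + alpha * np.sign(init)
    for _ in range(steps):
        f_adv = feat(adv)
        na, nb = np.linalg.norm(f_adv), np.linalg.norm(f_clean)
        # analytic gradient of the cosine similarity (feat is linear)
        g = W.T @ (f_clean / (na * nb)
                   - cosine(f_adv, f_clean) * f_adv / na**2)
        adv = adv - alpha * np.sign(g)     # step toward lower similarity
        adv = frame + np.clip(adv - frame, -eps, eps)
    return adv

frame = rng.standard_normal(128)
adv = i2v_attack(frame)
```

Each frame is perturbed independently against the image model; the transferability claim is that these feature-divergent frames also degrade video models downstream.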
This list is automatically generated from the titles and abstracts of the papers in this site.