SAM Meets UAP: Attacking Segment Anything Model With Universal Adversarial Perturbation
- URL: http://arxiv.org/abs/2310.12431v2
- Date: Tue, 20 Aug 2024 12:10:24 GMT
- Title: SAM Meets UAP: Attacking Segment Anything Model With Universal Adversarial Perturbation
- Authors: Dongshen Han, Chaoning Zhang, Sheng Zheng, Chang Lu, Yang Yang, Heng Tao Shen
- Abstract summary: We investigate whether it is possible to attack the Segment Anything Model (SAM) with an image-agnostic Universal Adversarial Perturbation (UAP).
We propose a novel perturbation-centric framework that results in a UAP generation method based on self-supervised contrastive learning (CL).
The effectiveness of our proposed CL-based UAP generation method is validated by both quantitative and qualitative results.
- Score: 61.732503554088524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the Segment Anything Model (SAM) becomes a popular foundation model in computer vision, its adversarial robustness has become a concern that cannot be ignored. This work investigates whether it is possible to attack SAM with an image-agnostic Universal Adversarial Perturbation (UAP). In other words, we seek a single perturbation that can fool SAM into predicting invalid masks for most (if not all) images. We demonstrate that the conventional image-centric attack framework is effective for image-dependent attacks but fails for the universal adversarial attack. We therefore propose a novel perturbation-centric framework that results in a UAP generation method based on self-supervised contrastive learning (CL), where the UAP serves as the anchor sample and the positive sample is generated by augmenting the UAP. The representations of negative samples are obtained from the image encoder in advance and saved in a memory bank. The effectiveness of our proposed CL-based UAP generation method is validated by both quantitative and qualitative results. Beyond an ablation study of the components of our method, we shed light on the roles of positive and negative samples in making the generated UAP effective for attacking SAM.
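The abstract pins down the moving parts of the perturbation-centric recipe: the UAP is the anchor, the positive is an augmented copy of the UAP, and the negatives are image-encoder features pre-computed into a memory bank. A minimal PyTorch sketch of that training loop follows; the encoder interface, augmentation choice, InfoNCE form, temperature, and perturbation budget are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def train_cl_uap(encoder, memory_bank, steps=1000, eps=10/255, lr=1e-2, tau=0.1):
    """Hedged sketch of a perturbation-centric, contrastive UAP attack.

    encoder:     frozen image encoder (e.g., SAM's ViT backbone), assumed
                 here to map a batch of images to (B, D) feature vectors.
    memory_bank: (N, D) tensor of pre-computed natural-image features,
                 used as the negatives (as described in the abstract).
    """
    device = memory_bank.device
    uap = torch.zeros(1, 3, 224, 224, device=device, requires_grad=True)
    opt = torch.optim.Adam([uap], lr=lr)
    negatives = F.normalize(memory_bank, dim=1)                  # (N, D)

    for _ in range(steps):
        # Anchor: the UAP's own features.
        anchor = F.normalize(encoder(uap), dim=1)                # (1, D)
        # Positive: an augmented copy of the UAP (a horizontal flip here
        # is an assumption; the paper's augmentations may differ).
        positive = F.normalize(encoder(torch.flip(uap, dims=[-1])), dim=1)

        # InfoNCE: pull the anchor toward its augmentation and push it
        # away from natural-image features in the memory bank.
        l_pos = (anchor * positive).sum(dim=1, keepdim=True) / tau    # (1, 1)
        l_neg = anchor @ negatives.t() / tau                          # (1, N)
        logits = torch.cat([l_pos, l_neg], dim=1)
        labels = torch.zeros(1, dtype=torch.long, device=device)      # positive = class 0
        loss = F.cross_entropy(logits, labels)

        opt.zero_grad(); loss.backward(); opt.step()
        with torch.no_grad():
            uap.clamp_(-eps, eps)          # keep the perturbation within budget
    return uap.detach()
```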
Related papers
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the Segment Anything Model (SAM).
To enhance the effectiveness of the adversarial attack against models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
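Taking the blurb at face value, UMI learns a good starting point for the perturbation rather than the final perturbation itself. A Reptile-style reading of that idea (an assumption; the paper's actual meta-update may differ) could look like:

```python
import torch

def umi_init(attack_loss, data_loader, eps=8/255, inner_steps=5,
             inner_lr=1/255, meta_lr=0.1, meta_rounds=100):
    """Hedged sketch: meta-learn a transferable UAP *initialization*.

    attack_loss(images, delta) -> scalar loss on a surrogate model
    (e.g., SAM's encoder); higher means a stronger attack. The
    Reptile-style interpolation below is an assumption.
    """
    meta_delta = torch.zeros(1, 3, 224, 224)
    for _, images in zip(range(meta_rounds), data_loader):
        delta = meta_delta.clone().requires_grad_(True)
        for _ in range(inner_steps):                 # inner adaptation
            loss = attack_loss(images, delta)
            grad, = torch.autograd.grad(loss, delta)
            with torch.no_grad():
                delta += inner_lr * grad.sign()      # ascend the attack loss
                delta.clamp_(-eps, eps)
        # Meta-step: move the initialization toward the adapted perturbation.
        meta_delta += meta_lr * (delta.detach() - meta_delta)
    return meta_delta
```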
arXiv Detail & Related papers (2024-10-26T15:04:04Z)
- DarkSAM: Fooling Segment Anything Model to Segment Nothing [25.67725506581337]
Segment Anything Model (SAM) has recently gained much attention for its outstanding generalization to unseen data and tasks.
We propose DarkSAM, the first prompt-free universal attack framework against SAM, including a semantic decoupling-based spatial attack and a texture distortion-based frequency attack.
Experimental results on four datasets for SAM and its two variant models demonstrate the powerful attack capability and transferability of DarkSAM.
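The frequency branch is the more unusual half of DarkSAM's framework; one plausible reading of "texture distortion" (an assumption, not the paper's formulation) perturbs the amplitude spectrum while leaving the phase intact:

```python
import torch

def texture_distortion(x, delta_amp):
    """Hedged sketch of an amplitude-spectrum attack in the spirit of a
    texture-distortion frequency attack. The amplitude/phase split and the
    multiplicative perturbation are assumptions for illustration.

    x:         (B, 3, H, W) images in [0, 1]
    delta_amp: learnable perturbation on the amplitude spectrum
    """
    spectrum = torch.fft.fft2(x)                       # complex spectrum
    amplitude, phase = spectrum.abs(), spectrum.angle()
    amplitude = amplitude * (1.0 + delta_amp)          # distort texture
    adv = torch.fft.ifft2(torch.polar(amplitude, phase)).real
    return adv.clamp(0.0, 1.0)
```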
arXiv Detail & Related papers (2024-09-26T14:20:14Z)
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
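The detection pipeline is simple enough to sketch end to end: caption the input with the target VLM, re-synthesize an image from that caption with a T2I model, and compare the two images in a shared embedding space. The cosine-similarity test and threshold below are assumptions, and the wrapped models are placeholder interfaces:

```python
import torch.nn.functional as F

def mirrorcheck(image, vlm_caption, t2i_generate, embed, threshold=0.7):
    """Hedged sketch of caption-then-resynthesize detection.

    vlm_caption(image) -> str           caption from the target VLM
    t2i_generate(str)  -> image tensor  text-to-image model
    embed(image)       -> (1, D)        any shared image embedder (e.g., CLIP)
    All three are placeholders, and the threshold value is an assumption.
    """
    caption = vlm_caption(image)
    mirrored = t2i_generate(caption)
    similarity = F.cosine_similarity(embed(image), embed(mirrored)).item()
    # Adversarial inputs steer the caption away from the visual content,
    # so the re-synthesized image no longer matches the original.
    return similarity, similarity < threshold   # (score, flagged_as_adversarial)
```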
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Black-box Targeted Adversarial Attack on Segment Anything (SAM) [24.927514923402775]
This work aims to achieve a targeted adversarial attack (TAA) on the Segment Anything Model (SAM).
Specifically, under a certain prompt, the goal is to make the predicted mask of an adversarial example resemble that of a given target image.
We propose a novel regularization loss to enhance the cross-model transferability by increasing the feature dominance of adversarial images over random natural images.
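Since the attack is black-box with respect to the downstream models, it operates on a surrogate SAM encoder. The feature-matching loop below is a hedged sketch, and the "feature dominance" regularizer is one stand-in reading of the paper's idea, not its actual loss:

```python
import torch
import torch.nn.functional as F

def taa_attack(encoder, x, x_target, nat_images, steps=40, alpha=2/255,
               eps=8/255, lam=0.1):
    """Hedged sketch of a targeted attack on a (frozen) surrogate encoder,
    assumed to map images to (B, D) features: pull the adversarial features
    toward a target image's so the predicted mask resembles the target's.
    """
    with torch.no_grad():
        f_target = encoder(x_target)
        f_nat = F.normalize(encoder(nat_images), dim=1)   # random natural images
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        f_adv = encoder((x + delta).clamp(0, 1))
        target_loss = F.mse_loss(f_adv, f_target)
        # Stand-in "feature dominance" term (assumption): push adversarial
        # features away from random natural images to aid transferability.
        dominance = (F.normalize(f_adv, dim=1) @ f_nat.t()).mean()
        loss = target_loss + lam * dominance
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= alpha * grad.sign()                  # descend the loss
            delta.clamp_(-eps, eps)
    return (x + delta.detach()).clamp(0, 1)
```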
arXiv Detail & Related papers (2023-10-16T02:09:03Z)
- SAM Meets Robotic Surgery: An Empirical Study on Generalization, Robustness and Adaptation [15.995869434429274]
The Segment Anything Model (SAM) serves as a foundation model for semantic segmentation.
We examine SAM's robustness and zero-shot generalizability in the field of robotic surgery.
arXiv Detail & Related papers (2023-08-14T14:09:41Z)
- Attack-SAM: Towards Attacking Segment Anything Model With Adversarial Examples [68.5719552703438]
Segment Anything Model (SAM) has attracted significant attention recently, due to its impressive performance on various downstream tasks.
Deep vision models are widely recognized as vulnerable to adversarial examples, which fool the model into making wrong predictions with imperceptible perturbations.
This work is the first of its kind to conduct a comprehensive investigation on how to attack SAM with adversarial examples.
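A basic per-image version of such an attack can be sketched as PGD that pushes SAM's mask logits below the foreground threshold; the clamped squared loss and the `sam_predict` wrapper are assumptions, not necessarily the paper's exact objective:

```python
import torch

def mask_removal_pgd(sam_predict, x, prompt, steps=10, alpha=2/255, eps=8/255):
    """Hedged sketch: perturb the image so that, under a fixed prompt, SAM
    predicts no valid mask. sam_predict(image, prompt) -> mask logits is a
    placeholder wrapper around SAM's prompted forward pass.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        logits = sam_predict((x + delta).clamp(0, 1), prompt)
        # Penalize only pixels still predicted as foreground (logit > 0),
        # driving the whole mask below the threshold.
        loss = (torch.clamp(logits, min=0.0) ** 2).mean()
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= alpha * grad.sign()
            delta.clamp_(-eps, eps)
    return (x + delta.detach()).clamp(0, 1)
```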
arXiv Detail & Related papers (2023-05-01T15:08:17Z)
- FG-UAP: Feature-Gathering Universal Adversarial Perturbation [15.99512720802142]
We propose to generate a Universal Adversarial Perturbation (UAP) by attacking the layer where Neural Collapse (NC) occurs.
Owing to NC, the proposed attack gathers the features of natural images around the perturbation's own features, hence the name Feature-Gathering UAP (FG-UAP).
We evaluate the effectiveness of the proposed algorithm through extensive experiments, including untargeted and targeted universal attacks, attacks with limited training data, and transfer-based black-box attacks.
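The feature-gathering objective is easy to state in code: maximize the similarity between the attacked-layer features of perturbed images and the features of the UAP itself. The sketch below assumes a cosine objective, Adam, and an L_inf budget; the paper's exact choices may differ.

```python
import torch
import torch.nn.functional as F

def fg_uap(feature_fn, data_loader, eps=10/255, lr=1e-2, epochs=1):
    """Hedged sketch of feature-gathering UAP training.

    feature_fn(images) -> (B, D) features at the attacked layer (e.g., the
    penultimate layer, where Neural Collapse is observed).
    """
    uap = torch.zeros(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([uap], lr=lr)
    for _ in range(epochs):
        for images in data_loader:
            f_adv = feature_fn((images + uap).clamp(0, 1))
            f_uap = feature_fn(uap.clamp(0, 1))
            # Gather image features around the UAP's own feature.
            loss = -F.cosine_similarity(f_adv, f_uap.expand_as(f_adv)).mean()
            opt.zero_grad(); loss.backward(); opt.step()
            with torch.no_grad():
                uap.clamp_(-eps, eps)
    return uap.detach()
```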
arXiv Detail & Related papers (2022-09-27T02:03:42Z)
- CARBEN: Composite Adversarial Robustness Benchmark [70.05004034081377]
This paper demonstrates how a composite adversarial attack (CAA) affects the resulting image.
It provides real-time inference from different models, which helps users configure the attack-level parameters.
A leaderboard to benchmark adversarial robustness against CAA is also introduced.
arXiv Detail & Related papers (2022-07-16T01:08:44Z)
- CD-UAP: Class Discriminative Universal Adversarial Perturbation [83.60161052867534]
A single universal adversarial perturbation (UAP) can be added to all natural images to change most of their predicted class labels.
We propose a new universal attack method that generates a single perturbation which fools a target network into misclassifying only a chosen group of classes.
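The class-discriminative objective splits each batch into a group to attack and a group whose behavior must be preserved; below is a hedged two-term sketch (the exact losses and weighting in the paper may differ):

```python
import torch
import torch.nn.functional as F

def cd_uap_loss(model, images, labels, uap, targeted_classes, lam=1.0):
    """Hedged sketch: fool the model on a chosen group of classes while
    keeping predictions on all other classes intact. Minimize this loss
    w.r.t. uap under an L_inf budget (projection not shown).
    """
    logits = model((images + uap).clamp(0, 1))
    attack = torch.isin(labels, targeted_classes)     # samples to attack
    loss = logits.new_zeros(())
    if attack.any():      # targeted group: push away from the true label
        loss = loss - F.cross_entropy(logits[attack], labels[attack])
    if (~attack).any():   # other classes: keep the original prediction
        loss = loss + lam * F.cross_entropy(logits[~attack], labels[~attack])
    return loss
```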
arXiv Detail & Related papers (2020-10-07T09:26:42Z)