SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
- URL: http://arxiv.org/abs/2508.06127v1
- Date: Fri, 08 Aug 2025 08:47:26 GMT
- Title: SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
- Authors: Yi Qin, Rui Wang, Tao Huang, Tong Xiao, Liping Jing
- Abstract summary: The Segment Anything Model (SAM) transforms interactive segmentation with zero-shot abilities. SAM's inherent vulnerabilities present a single-point risk, potentially leading to the failure of numerous downstream applications. We propose a novel method that leverages only the encoder of SAM for generating transferable adversarial examples.
- Score: 43.17595238353687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the Segment Anything Model (SAM) transforms interactive segmentation with zero-shot abilities, its inherent vulnerabilities present a single-point risk, potentially leading to the failure of numerous downstream applications. Proactively evaluating these transferable vulnerabilities is thus imperative. Prior adversarial attacks on SAM often exhibit limited transferability due to insufficient exploration of common weaknesses across domains. To address this, we propose the Vertex-Refining Simplicial Complex Attack (VeSCA), a novel method that leverages only the encoder of SAM to generate transferable adversarial examples. Specifically, it explicitly characterizes the shared vulnerable regions between SAM and downstream models through a parametric simplicial complex. Our goal is to identify such complexes within adversarially potent regions by iterative vertex-wise refinement. A lightweight domain re-adaptation strategy bridges domain divergence using minimal reference data during initialization of the simplicial complex. Ultimately, VeSCA generates consistently transferable adversarial examples through random simplicial complex sampling. Extensive experiments demonstrate that VeSCA improves performance by 12.7% over state-of-the-art methods across three downstream model categories and five domain-specific datasets. Our findings further highlight the downstream model risks posed by SAM's vulnerabilities and emphasize the urgency of developing more robust foundation models.
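The sampling step the abstract describes (drawing each adversarial example as a random point of a simplicial complex whose vertices were adversarially refined) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the Dirichlet weighting, the L-infinity projection, and all names are ours, not the authors' released implementation.

```python
import numpy as np

def sample_from_simplicial_complex(x, vertices, eps=8 / 255, rng=None):
    """Draw one adversarial example from a parametric simplex of perturbations.

    x        : clean image, array of shape (C, H, W), values in [0, 1]
    vertices : list of k refined perturbation vertices, each shaped like x
    eps      : L-infinity budget the combined perturbation is clipped to
    """
    rng = np.random.default_rng() if rng is None else rng
    vertices = np.stack(vertices)                      # (k, C, H, W)
    weights = rng.dirichlet(np.ones(len(vertices)))    # uniform point on the simplex
    delta = np.tensordot(weights, vertices, axes=1)    # convex combination of vertices
    delta = np.clip(delta, -eps, eps)                  # respect the L-infinity budget
    return np.clip(x + delta, 0.0, 1.0)                # keep a valid image

# Usage: three adversarially refined vertices span a 2-simplex around x.
x = np.random.rand(3, 32, 32).astype(np.float32)
verts = [np.random.uniform(-8 / 255, 8 / 255, x.shape) for _ in range(3)]
x_adv = sample_from_simplicial_complex(x, verts)
```

Sampling convex combinations rather than reusing a single fixed perturbation lets one attack cover a whole region of the loss landscape instead of a point, which is the intuition behind the claimed transferability gains.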
Related papers
- Quantifying the Risk of Transferred Black Box Attacks [0.0]
Neural networks have become pervasive across various applications, including security-related products. This paper investigates the complexities involved in resilience testing against transferred adversarial attacks. We propose a targeted resilience testing framework that employs surrogate models strategically selected based on Centered Kernel Alignment (CKA) similarity.
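CKA itself is a standard representation-similarity measure; a minimal linear-CKA implementation, which is the textbook formula rather than this paper's code, looks like:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between feature matrices X (n, d1) and Y (n, d2),
    where rows are the same n inputs pushed through two different models."""
    X = X - X.mean(axis=0)                          # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2      # cross-covariance energy
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

# Usage: a CKA near 1 suggests the surrogate represents inputs much like the
# suspected target architecture, making it a better proxy for transfer attacks.
feats_a = np.random.randn(256, 512)
feats_b = feats_a @ np.random.randn(512, 128)       # correlated features
print(linear_cka(feats_a, feats_b))
```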
arXiv Detail & Related papers (2025-11-07T09:34:43Z)
- Enhancing Zero-Shot Anomaly Detection: CLIP-SAM Collaboration with Cascaded Prompts [5.225009704851243]
This paper proposes a novel two-stage framework for zero-shot anomaly segmentation tasks in industrial anomaly detection. To mitigate SAM's inclination towards object segmentation, we propose the Co-Feature Point Prompt Generation module. To further optimize SAM's segmentation results, we introduce the Cascaded Prompts for SAM (CPS) module.
arXiv Detail & Related papers (2025-10-13T05:53:49Z)
- ST-SAM: SAM-Driven Self-Training Framework for Semi-Supervised Camouflaged Object Detection [14.06736878203419]
Semi-supervised Camouflaged Object Detection (SSCOD) aims to reduce reliance on costly pixel-level annotations. Existing SSCOD methods suffer from severe prediction bias and error propagation under scarce supervision. We propose ST-SAM, a highly annotation-efficient yet concise framework.
arXiv Detail & Related papers (2025-07-31T07:41:30Z)
- Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between the model's distributed components can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z)
- A Simple DropConnect Approach to Transfer-based Targeted Attack [43.039945949426546]
We study the problem of transfer-based black-box attack, where adversarial samples generated using a single surrogate model are directly applied to target models. We propose MCD, which Mitigates perturbation Co-adaptation by DropConnect, to enhance transferability. In the challenging scenario of transferring from a CNN-based model to Transformer-based models, MCD achieves 13% higher average attack success rates (ASRs) compared with state-of-the-art baselines.
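The abstract's mechanism, preventing the perturbation from co-adapting to one fixed surrogate, admits a simple reading: resample a DropConnect weight mask at every attack iteration. The sketch below encodes that reading; the masking site, drop rate, and step sizes are assumptions, not the paper's exact MCD recipe.

```python
import torch
import torch.nn.functional as F

def mcd_style_step(model, x0, x_adv, target, alpha=2 / 255, eps=8 / 255, p=0.1):
    """One targeted attack step with a fresh DropConnect mask on the surrogate.

    Zeroing a random fraction p of each weight tensor per step stops the
    perturbation from co-adapting to one fixed weight configuration.
    """
    x_adv = x_adv.clone().detach().requires_grad_(True)
    saved = {}
    for name, param in model.named_parameters():
        if param.dim() > 1:                              # conv/linear weights only
            saved[name] = param.data.clone()
            mask = torch.bernoulli(torch.full_like(param.data, 1 - p))
            param.data.mul_(mask / (1 - p))              # inverted-dropout scaling
    loss = F.cross_entropy(model(x_adv), target)         # targeted objective
    loss.backward()
    for name, param in model.named_parameters():         # restore clean weights
        if name in saved:
            param.data.copy_(saved[name])
    with torch.no_grad():
        x_adv = x_adv - alpha * x_adv.grad.sign()        # move toward target class
        x_adv = torch.clamp(x_adv, x0 - eps, x0 + eps).clamp(0.0, 1.0)
    return x_adv.detach()
```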
arXiv Detail & Related papers (2025-04-24T12:29:23Z)
- Boosting Adversarial Transferability with Spatial Adversarial Alignment [56.97809949196889]
Deep neural networks are vulnerable to adversarial examples that exhibit transferability across various models. We propose Spatial Adversarial Alignment (SAA), a technique that employs an alignment loss and leverages a witness model to fine-tune the surrogate model. Experiments on various architectures on ImageNet show that surrogate models aligned with SAA produce more transferable adversarial examples.
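The abstract names the ingredients (an alignment loss, a witness model, surrogate fine-tuning) but not the loss itself; a minimal cosine-based feature-alignment term, purely our assumption, would be:

```python
import torch.nn.functional as F

def alignment_loss(surrogate_feats, witness_feats):
    """Cosine distance between surrogate and (frozen) witness features on the
    same inputs; added to the surrogate's fine-tuning objective so that its
    representations, and hence its gradients, generalize across architectures."""
    s = F.normalize(surrogate_feats.flatten(1), dim=1)
    w = F.normalize(witness_feats.flatten(1), dim=1)
    return (1.0 - (s * w).sum(dim=1)).mean()
```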
arXiv Detail & Related papers (2025-01-02T02:35:47Z)
- Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the segment anything model (SAM). To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
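The abstract names UMI without detail. One common pattern for such warm starts, learning a single shared perturbation that distorts SAM's encoder features across a small reference set and using it to initialize per-image attacks, is sketched below; the feature-distortion loss, update rule, and all hyperparameters are our assumptions.

```python
import torch

def learn_universal_init(encoder, images, eps=8 / 255, lr=1 / 255, epochs=5):
    """Learn one shared perturbation that distorts encoder features across a
    small reference set; later per-image attacks would start from it."""
    delta = torch.zeros_like(images[0]).requires_grad_(True)
    for _ in range(epochs):
        for x in images:
            clean = encoder(x.unsqueeze(0)).detach()
            adv = encoder((x + delta).clamp(0, 1).unsqueeze(0))
            loss = -(adv - clean).pow(2).mean()   # negated feature distortion
            loss.backward()
            with torch.no_grad():
                delta -= lr * delta.grad.sign()   # gradient ascent on distortion
                delta.clamp_(-eps, eps)
            delta.grad.zero_()
    return delta.detach()
```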
arXiv Detail & Related papers (2024-10-26T15:04:04Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
Graph neural networks (GNNs) are vulnerable to model stealing attacks, in which an adversary duplicates the target model via query access. We introduce three model stealing attacks adapted to different real-world scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
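ADT's key move, replacing AT's single worst-case perturbation with a learned perturbation distribution, can be made concrete with a toy inner maximization. The Gaussian family, tanh squashing, and optimizer settings below are illustrative assumptions rather than the paper's exact parameterization:

```python
import torch
import torch.nn.functional as F

def adt_inner_max(model, x, y, eps=8 / 255, steps=7, lr=0.1, samples=4):
    """Fit per-input Gaussian perturbation parameters (mu, log_sigma) to
    maximize the expected classification loss via reparameterized samples."""
    mu = torch.zeros_like(x, requires_grad=True)
    log_sigma = torch.full_like(x, -3.0).requires_grad_(True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)
    for _ in range(steps):
        loss = 0.0
        for _ in range(samples):
            noise = torch.randn_like(x)
            delta = eps * torch.tanh(mu + noise * log_sigma.exp())  # in (-eps, eps)
            loss = loss - F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        opt.zero_grad()
        (loss / samples).backward()   # minimizing -loss ascends the expected loss
        opt.step()
    return mu.detach(), log_sigma.detach()
```

The outer loop would then update the classifier on perturbations sampled from the fitted distribution, so training covers a neighborhood of attacks instead of a single crafted example.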
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.