Task-Agnostic Attacks Against Vision Foundation Models
- URL: http://arxiv.org/abs/2503.03842v1
- Date: Wed, 05 Mar 2025 19:15:14 GMT
- Title: Task-Agnostic Attacks Against Vision Foundation Models
- Authors: Brian Pulfer, Yury Belousov, Vitaliy Kinakh, Teddy Furon, Slava Voloshynovskiy
- Abstract summary: It has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models. The study of attacks on such foundation models and their impact on multiple downstream tasks remains vastly unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models.
- Score: 12.487589700031661
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of security in machine learning mainly focuses on downstream task-specific attacks, where the adversarial example is obtained by optimizing a loss function specific to the downstream task. At the same time, it has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models, effectively sharing a common backbone architecture across a multitude of applications such as classification, segmentation, depth estimation, retrieval, question-answering and more. The study of attacks on such foundation models and their impact on multiple downstream tasks remains vastly unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models. We extensively evaluate the security of the feature representations obtained by popular vision foundation models by measuring the impact of this attack on multiple downstream tasks and its transferability between models.
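The abstract only states that adversarial examples are forged by maximally disrupting the foundation model's feature representation; the exact objective and optimizer are not given in this listing. Below is a minimal sketch of that general idea, assuming a PGD-style loop that maximizes the L2 distance between clean and perturbed embeddings under an L-infinity budget; the `encoder` argument, step sizes, and choice of loss are illustrative assumptions, not the authors' exact method.

```python
import torch

def feature_space_attack(encoder, x, eps=8/255, alpha=2/255, steps=50):
    """Task-agnostic attack sketch: push the encoder's features of x as far
    as possible from the clean features, without using any downstream head.
    NOTE: an illustrative sketch under assumed hyper-parameters, not the
    paper's exact formulation."""
    encoder.eval()
    with torch.no_grad():
        clean_feat = encoder(x)                      # reference (clean) embedding

    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_feat = encoder((x + delta).clamp(0, 1))
        # Maximize the L2 distance between clean and adversarial embeddings.
        loss = (adv_feat - clean_feat).flatten(1).norm(dim=1).mean()
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()       # gradient-ascent step
            delta.clamp_(-eps, eps)                  # stay inside the L-inf budget
            delta.grad = None
    return (x + delta).detach().clamp(0, 1)
```

Because no downstream loss appears in the loop, the same perturbed image can then be fed to any head built on the same backbone (classifier, segmenter, depth estimator, retrieval index) to measure how much each task degrades, which is the kind of multi-task evaluation the abstract describes.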
Related papers
- On Transfer-based Universal Attacks in Pure Black-box Setting [94.92884394009288]
We study the role of prior knowledge of the target model's data and number of classes in attack performance.
We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores.
arXiv Detail & Related papers (2025-04-11T10:41:20Z)
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage. Models may behave unreliably due to poorly explored failure modes. Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- On the Lack of Robustness of Binary Function Similarity Systems [13.842698930725625]
We assess the resiliency of state-of-the-art machine learning models against adversarial attacks. We demonstrate that this attack is successful in compromising all the models, achieving average attack success rates of 57.06% and 95.81% depending on the problem settings.
arXiv Detail & Related papers (2024-12-05T13:54:53Z)
- Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks [9.207022068713867]
We present a comprehensive empirical evaluation of the adversarial robustness of self-supervised vision encoders across multiple downstream tasks.
Our attacks operate in the encoder embedding space and at the downstream task output level (a hedged sketch of the latter appears after this list).
Since the purpose of a foundation model is to cater to multiple applications at once, our findings reveal the need to enhance encoder robustness more broadly.
arXiv Detail & Related papers (2024-07-17T14:12:34Z)
- Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks [15.882687207499373]
No-box adversarial attacks are becoming more practical and challenging for AI systems.
This paper recasts the adversarial attack as a downstream task by introducing foundation models as surrogate models.
arXiv Detail & Related papers (2023-07-13T08:10:48Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- On the Robustness of Deep Clustering Models: Adversarial Attacks and Defenses [14.951655356042947]
Clustering models constitute a class of unsupervised machine learning methods which are used in a number of application pipelines.
We propose a blackbox attack using Generative Adversarial Networks (GANs) where the adversary does not know which deep clustering model is being used.
We analyze our attack against multiple state-of-the-art deep clustering models and real-world datasets, and find that it is highly successful.
arXiv Detail & Related papers (2022-10-04T22:32:02Z)
- On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach [42.81632484264218]
For machine learning models, perturbing a fraction of the training set (poisoning) can seriously undermine accuracy.
This paper aims to implement a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks.
arXiv Detail & Related papers (2022-09-28T11:41:38Z)
- Benchmarking the Robustness of Instance Segmentation Models [3.1287804585804073]
This paper presents a comprehensive evaluation of instance segmentation models with respect to real-world image corruptions and out-of-domain image collections.
The out-of-domain image evaluation shows the generalization capability of models, an essential aspect of real-world applications.
Specifically, this benchmark study covers state-of-the-art network architectures, network backbones, normalization layers, and models trained from scratch or initialized from ImageNet-pretrained networks.
arXiv Detail & Related papers (2021-09-02T17:50:07Z)
- Deep Image Destruction: A Comprehensive Study on Vulnerability of Deep Image-to-Image Models against Adversarial Attacks [104.8737334237993]
We present comprehensive investigations into the vulnerability of deep image-to-image models to adversarial attacks.
For five popular image-to-image tasks, 16 deep models are analyzed from various standpoints.
We show that unlike in image classification tasks, the performance degradation on image-to-image tasks can largely differ depending on various factors.
arXiv Detail & Related papers (2021-04-30T14:20:33Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack against the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
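One of the entries above, Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks, contrasts attacks in the encoder embedding space with attacks at the downstream task output level. As a hedged companion to the embedding-space sketch given earlier, the sketch below shows the downstream-level variant: PGD that maximizes the task loss of a frozen encoder plus linear head. The `encoder`, `head`, labels `y`, and hyper-parameters are placeholders and are not taken from any of the listed papers.

```python
import torch
import torch.nn.functional as F

def downstream_level_attack(encoder, head, x, y, eps=8/255, alpha=2/255, steps=20):
    """Downstream-level attack sketch: perturb x to maximize the cross-entropy
    of a frozen (encoder -> linear head) classifier.  Illustrative only;
    hyper-parameters and the classification setting are assumptions."""
    encoder.eval()
    head.eval()
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        logits = head(encoder((x + delta).clamp(0, 1)))
        loss = F.cross_entropy(logits, y)        # task loss to be maximized
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend the task loss
            delta.clamp_(-eps, eps)              # respect the L-inf budget
            delta.grad = None
    return (x + delta).detach().clamp(0, 1)
```

Unlike the feature-space version, this attack needs labels and a specific head, so it is task-specific by construction; that contrast is what motivates studying task-agnostic attacks on the shared backbone.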
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.