Adversarial Attacks on Foundational Vision Models
- URL: http://arxiv.org/abs/2308.14597v1
- Date: Mon, 28 Aug 2023 14:09:02 GMT
- Title: Adversarial Attacks on Foundational Vision Models
- Authors: Nathan Inkawhich, Gwendolyn McDonald, Ryan Luley
- Abstract summary: Rapid progress is being made in developing large, pretrained, task-agnostic foundational vision models.
These models do not have to be finetuned downstream, and can simply be used in zero-shot or with a lightweight probing head.
The goal of this work is to identify several key adversarial vulnerabilities of these models in an effort to make future designs more robust.
- Score: 6.5530318775587
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Rapid progress is being made in developing large, pretrained, task-agnostic
foundational vision models such as CLIP, ALIGN, DINOv2, etc. In fact, we are
approaching the point where these models do not have to be finetuned
downstream, and can simply be used in zero-shot or with a lightweight probing
head. Critically, given the complexity of working at this scale, there is a
bottleneck where relatively few organizations in the world are executing the
training then sharing the models on centralized platforms such as HuggingFace
and torch.hub. The goal of this work is to identify several key adversarial
vulnerabilities of these models in an effort to make future designs more
robust. Intuitively, our attacks manipulate deep feature representations to
fool an out-of-distribution (OOD) detector which will be required when using
these open-world-aware models to solve closed-set downstream tasks. Our methods
reliably make in-distribution (ID) images (w.r.t. a downstream task) be
predicted as OOD and vice versa while existing in extremely
low-knowledge-assumption threat models. We show our attacks to be potent in
whitebox and blackbox settings, as well as when transferred across foundational
model types (e.g., attack DINOv2 with CLIP)! This work is only just the
beginning of a long journey towards adversarially robust foundational vision
models.
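The abstract describes the attack only at the feature level, so the following is a minimal, hedged sketch of that idea rather than the authors' exact method. It assumes a frozen foundational encoder and a simple prototype-based OOD score (neither specified in the paper) and runs projected gradient descent on the image so that an in-distribution input is flagged as OOD; negating the loss would instead push an OOD input toward being accepted as ID.

```python
import torch
import torch.nn.functional as F

def ood_score(feats, id_prototypes):
    # Cosine similarity to the nearest in-distribution class prototype;
    # a lower maximum similarity means "more OOD", so negate it.
    sims = F.cosine_similarity(feats.unsqueeze(1), id_prototypes.unsqueeze(0), dim=-1)
    return -sims.max(dim=1).values

def pgd_make_ood(encoder, x, id_prototypes, eps=8 / 255, alpha=2 / 255, steps=20):
    """Perturb in-distribution images so a prototype-based OOD detector flags them."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        feats = encoder(x_adv)                        # deep feature representations
        loss = ood_score(feats, id_prototypes).sum()  # ascend the OOD score
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # stay inside the L_inf ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```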
Related papers
- Exploiting Edge Features for Transferable Adversarial Attacks in Distributed Machine Learning [54.26807397329468]
This work explores a previously overlooked vulnerability in distributed deep learning systems. An adversary who intercepts the intermediate features transmitted between distributed components can still pose a serious threat. We propose an exploitation strategy specifically designed for distributed settings.
arXiv Detail & Related papers (2025-07-09T20:09:00Z) - Attacking Attention of Foundation Models Disrupts Downstream Tasks [11.538345159297839]
Foundation models are large models trained on broad data that deliver high accuracy in many downstream tasks. These models are vulnerable to adversarial attacks. This paper studies the vulnerabilities of vision foundation models, focusing specifically on CLIP and ViTs. We introduce a novel attack targeting the structure of transformer-based architectures in a task-agnostic fashion.
arXiv Detail & Related papers (2025-06-03T19:42:48Z) - LoBAM: LoRA-Based Backdoor Attack on Model Merging [27.57659381949931]
Model merging is an emerging technique that integrates multiple models fine-tuned on different tasks to create a versatile model that excels in multiple domains.
Existing works that demonstrate the risk of such attacks assume the adversary has substantial computational resources.
We propose LoBAM, a method that yields high attack success rate with minimal training resources.
arXiv Detail & Related papers (2024-11-23T20:41:24Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been considered a challenging property to encode in neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? [23.660089146157507]
We show that foundation models pre-trained on web-scale vision-language data can serve as a basis for attacking downstream systems.
We propose a simple yet effective adversarial attack strategy termed Patch Representation Misalignment.
Our findings highlight the concerning safety risks introduced by the extensive usage of public foundational models in the development of downstream systems.
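The summary names Patch Representation Misalignment without giving details, so the sketch below is only a hedged guess at the general idea: use the publicly available foundation encoder to push a perturbed image's patch-token features away from their clean values, and rely on downstream systems built on that encoder inheriting the damage. The `patch_features` helper (returning per-patch token embeddings) is an assumption, and the paper's actual loss may differ.

```python
import torch
import torch.nn.functional as F

def misalign_patches(patch_features, x, eps=8 / 255, alpha=2 / 255, steps=20):
    """Drive per-patch features away from their clean values using only the public encoder."""
    with torch.no_grad():
        clean = patch_features(x)                    # [B, num_patches, D] clean token features
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        adv = patch_features(x_adv)
        # Minimize per-patch cosine similarity to the clean features.
        sim = F.cosine_similarity(adv, clean, dim=-1).mean()
        grad, = torch.autograd.grad(sim, x_adv)
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()      # descend the similarity
            x_adv = x + (x_adv - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```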
arXiv Detail & Related papers (2024-03-19T12:51:39Z) - Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models [42.379680603462155]
We propose an unsupervised adversarial fine-tuning scheme to obtain a robust CLIP vision encoder.
We show that stealth-attacks on users of LVLMs by a malicious third party providing manipulated images are no longer possible once one replaces the CLIP model with our robust one.
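The summary describes unsupervised adversarial fine-tuning of the CLIP vision encoder. One plausible form of that idea, not necessarily the paper's exact objective, is to attack the encoder's own embeddings and then train it to map the perturbed images back to the clean embeddings of a frozen reference copy, so no labels are required. The step below is a hedged sketch under those assumptions; `ref_encoder` is assumed to be a frozen copy of the original encoder.

```python
import torch
import torch.nn.functional as F

def adv_finetune_step(encoder, ref_encoder, x, opt, eps=4 / 255, alpha=1 / 255, steps=10):
    """One training step; `ref_encoder` is a frozen copy of the original encoder."""
    with torch.no_grad():
        target = ref_encoder(x)                       # clean embeddings to preserve

    # Inner maximization: find a perturbation that pushes embeddings away from the target.
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        emb = encoder((x + delta).clamp(0, 1))
        dist = F.mse_loss(emb, target)
        grad, = torch.autograd.grad(dist, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)

    # Outer minimization: train the encoder to embed adversarial images like clean ones.
    opt.zero_grad()
    F.mse_loss(encoder((x + delta.detach()).clamp(0, 1)), target).backward()
    opt.step()
```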
arXiv Detail & Related papers (2024-02-19T18:09:48Z) - Towards Scalable and Robust Model Versioning [30.249607205048125]
Malicious incursions aimed at gaining access to deep learning models are on the rise.
We show how to generate multiple versions of a model that possess different attack properties.
We show theoretically that this can be accomplished by incorporating parameterized hidden distributions into the model training data.
arXiv Detail & Related papers (2024-01-17T19:55:49Z) - SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z) - Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks [15.882687207499373]
No-box adversarial attacks are becoming more practical and challenging for AI systems.
This paper recasts adversarial attack as a downstream task by introducing foundational models as surrogate models.
arXiv Detail & Related papers (2023-07-13T08:10:48Z) - Towards Efficient Task-Driven Model Reprogramming with Foundation Models [52.411508216448716]
Vision foundation models exhibit impressive power, benefiting from the extremely large model capacity and broad training data.
However, in practice, downstream scenarios may only support a small model due to the limited computational resources or efficiency considerations.
This brings a critical challenge for the real-world application of foundation models: one has to transfer the knowledge of a foundation model to the downstream task.
arXiv Detail & Related papers (2023-04-05T07:28:33Z) - On the Robustness of Deep Clustering Models: Adversarial Attacks and Defenses [14.951655356042947]
Clustering models constitute a class of unsupervised machine learning methods which are used in a number of application pipelines.
We propose a blackbox attack using Generative Adversarial Networks (GANs) where the adversary does not know which deep clustering model is being used.
We analyze our attack against multiple state-of-the-art deep clustering models and real-world datasets, and find that it is highly successful.
arXiv Detail & Related papers (2022-10-04T22:32:02Z) - "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z) - Orthogonal Deep Models As Defense Against Black-Box Attacks [71.23669614195195]
We study the inherent weakness of deep models in black-box settings where the attacker may develop the attack using a model similar to the targeted model.
We introduce a novel gradient regularization scheme that encourages the internal representation of a deep model to be orthogonal to another.
We verify the effectiveness of our technique on a variety of large-scale models.
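The summary only names the idea, so the sketch below is a hedged guess at one concrete form: while training a second model, add a penalty on the cosine alignment between its penultimate features and those of a frozen first model, pushing the two internal representations toward orthogonality. The `features` and `head` helpers are assumptions, and the paper's actual gradient-regularization scheme may differ in detail.

```python
import torch
import torch.nn.functional as F

def orthogonality_penalty(feat_a, feat_b):
    # Mean squared cosine similarity between paired feature vectors; 0 when orthogonal.
    cos = F.cosine_similarity(feat_a, feat_b, dim=-1)
    return (cos ** 2).mean()

def train_step(model_b, frozen_model_a, x, y, opt, lam=1.0):
    """Standard supervised step plus a representation-orthogonality penalty."""
    with torch.no_grad():
        feat_a = frozen_model_a.features(x)   # assumed helper exposing penultimate features
    feat_b = model_b.features(x)
    logits = model_b.head(feat_b)             # assumed classification head
    loss = F.cross_entropy(logits, y) + lam * orthogonality_penalty(feat_a, feat_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```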
arXiv Detail & Related papers (2020-06-26T08:29:05Z) - Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.