Zero-Shot Robustification of Zero-Shot Models
- URL: http://arxiv.org/abs/2309.04344v2
- Date: Mon, 12 Feb 2024 17:15:52 GMT
- Title: Zero-Shot Robustification of Zero-Shot Models
- Authors: Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala
- Abstract summary: We propose RoboShot, a method that improves the robustness of pretrained model embeddings in a fully zero-shot fashion.
First, we use language models (LMs) to obtain useful insights from task descriptions.
These insights are embedded and used to remove harmful and boost useful components in embeddings -- without any supervision.
- Score: 13.143596481809508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot inference is a powerful paradigm that enables the use of large
pretrained models for downstream classification tasks without further training.
However, these models are vulnerable to inherited biases that can impact their
performance. The traditional solution is fine-tuning, but this undermines the
key advantage of pretrained models, which is their ability to be used
out-of-the-box. We propose RoboShot, a method that improves the robustness of
pretrained model embeddings in a fully zero-shot fashion. First, we use
language models (LMs) to obtain useful insights from task descriptions. These
insights are embedded and used to remove harmful and boost useful components in
embeddings -- without any supervision. Theoretically, we provide a simple and
tractable model for biases in zero-shot embeddings and give a result
characterizing under what conditions our approach can boost performance.
Empirically, we evaluate RoboShot on nine image and NLP classification tasks
and show an average improvement of 15.98% in worst-group accuracy, with only a
trivial decrease in overall accuracy, over several zero-shot baselines. Additionally, we
demonstrate that RoboShot is compatible with a variety of pretrained and
language models and propose a way to further boost performance with a zero-shot
adaptation variant.
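The correction the abstract describes, removing harmful components and boosting useful ones without any supervision, reduces to vector projections in a shared image-text embedding space. Below is a minimal sketch of that idea, assuming CLIP-style embeddings; the function names and the simple project-out/project-in updates are illustrative assumptions, not the paper's exact algorithm, and the insight vectors are assumed to be differences of embedded LM-generated concept pairs.

```python
import numpy as np

def robustify(z, harmful, helpful):
    """Correct a single embedding z (shape (d,)) in a zero-shot fashion:
    project out each harmful insight direction, then amplify each helpful
    one. `harmful` and `helpful` are lists of (d,) vectors, e.g. the
    difference between the text embeddings of a contrasting concept pair
    ("a photo of a water background" vs. "a photo of a land background")."""
    z = z / np.linalg.norm(z)
    for v in harmful:
        v = v / np.linalg.norm(v)
        z = z - (z @ v) * v          # remove the harmful component
    for u in helpful:
        u = u / np.linalg.norm(u)
        z = z + (z @ u) * u          # boost the useful component
    return z / np.linalg.norm(z)

def zero_shot_predict(z, class_embs):
    """Standard zero-shot classification: pick the class whose text
    embedding has the highest cosine similarity with z."""
    sims = [z @ (c / np.linalg.norm(c)) for c in class_embs]
    return int(np.argmax(sims))
```

In this reading, the LM's "insights" are just text strings describing spurious and core features of the task; embedding them with the model's own text encoder is what keeps the pipeline fully zero-shot, since no labels or gradient updates are involved.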
Related papers
- Enabling Small Models for Zero-Shot Classification through Model Label Learning [50.68074833512999]
We introduce a novel paradigm, Model Label Learning (MLL), which bridges the gap between models and their functionalities.
Experiments on seven real-world datasets validate the effectiveness and efficiency of MLL.
arXiv Detail & Related papers (2024-08-21T09:08:26Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
- Understanding Zero-Shot Adversarial Robustness for Large-Scale Models [31.295249927085475]
We identify and explore the problem of adapting large-scale models for zero-shot adversarial robustness.
We propose a text-guided contrastive adversarial training loss, which aligns the text embeddings and the adversarial visual features with contrastive learning (a sketch of this loss appears after this list).
Our approach significantly improves the zero-shot adversarial robustness over CLIP, with an average improvement of over 31 points across ImageNet and 15 zero-shot datasets.
arXiv Detail & Related papers (2022-12-14T04:08:56Z)
- Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework (a sketch appears after this list).
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
- Voting based ensemble improves robustness of defensive models [82.70303474487105]
We study whether it is possible to create an ensemble to further improve robustness.
By ensembling several state-of-the-art pre-trained defense models, our method can achieve a 59.8% robust accuracy.
arXiv Detail & Related papers (2020-11-28T00:08:45Z)
- Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning [134.15174177472807]
We introduce adversarial training into self-supervision to provide general-purpose robust pre-trained models for the first time.
We conduct extensive experiments to demonstrate that the proposed framework achieves large performance margins.
arXiv Detail & Related papers (2020-03-28T18:28:33Z)
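As referenced above, here is a rough sketch of a text-guided contrastive adversarial training loss in the spirit of the zero-shot adversarial robustness entry: adversarial perturbations are crafted in pixel space, and the training loss pulls each adversarial image embedding toward the text embedding of its correct class. The function names, PGD hyperparameters, and the cross-entropy-over-cosine-logits form are assumptions for illustration, not the cited paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def text_guided_contrastive_loss(img_embs, text_embs, labels, tau=0.07):
    """Cross-entropy over cosine-similarity logits: each (adversarial)
    image embedding is pulled toward its class's text embedding and
    pushed away from the other classes' embeddings."""
    img_embs = F.normalize(img_embs, dim=-1)
    text_embs = F.normalize(text_embs, dim=-1)
    logits = img_embs @ text_embs.t() / tau   # (batch, num_classes)
    return F.cross_entropy(logits, labels)

def pgd_perturb(encode_image, images, text_embs, labels, eps=1/255, steps=3):
    """Craft adversarial examples by ascending the loss above; the
    returned images would then be fed to a fine-tuning step that
    minimizes the same loss."""
    delta = torch.zeros_like(images, requires_grad=True)
    for _ in range(steps):
        loss = text_guided_contrastive_loss(
            encode_image(images + delta), text_embs, labels)
        loss.backward()
        with torch.no_grad():
            delta += eps * delta.grad.sign()  # gradient-ascent step
            delta.clamp_(-eps, eps)           # stay inside the eps-ball
        delta.grad = None
    return (images + delta).detach()
```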
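And for the "Language Models in the Loop" entry, a tiny sketch of what prompted labeling functions look like in a weak supervision setup. Here `ask_lm` is a hypothetical stand-in for any LM call that answers yes/no, and the majority-vote aggregator is the simplest possible label model; real weak supervision frameworks instead fit a probabilistic model of each labeling function's accuracy.

```python
from collections import Counter

ABSTAIN = -1

def make_lf(question, ask_lm, label):
    """Wrap a yes/no prompt as a labeling function: vote `label` on
    'yes', abstain otherwise. `ask_lm(text) -> str` is hypothetical."""
    def lf(example):
        answer = ask_lm(f"{question}\n\nText: {example}\nAnswer yes or no:")
        return label if answer.strip().lower().startswith("yes") else ABSTAIN
    return lf

def majority_vote(example, lfs):
    """Aggregate labeling-function votes by simple majority,
    ignoring abstentions."""
    votes = [v for v in (lf(example) for lf in lfs) if v != ABSTAIN]
    return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN
```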