Benchmarking Robustness of Adaptation Methods on Pre-trained
Vision-Language Models
- URL: http://arxiv.org/abs/2306.02080v3
- Date: Sat, 18 Nov 2023 08:51:08 GMT
- Title: Benchmarking Robustness of Adaptation Methods on Pre-trained
Vision-Language Models
- Authors: Shuo Chen, Jindong Gu, Zhen Han, Yunpu Ma, Philip Torr, Volker Tresp
- Abstract summary: We assess the robustness of 11 widely-used adaptation methods across 4 vision-language datasets under multimodal corruptions.
Our analysis reveals that: 1) Adaptation methods are more sensitive to text corruptions than to visual corruptions. 2) Contrary to expectations, increasing the amount of adaptation data and the number of trainable parameters does not guarantee enhanced robustness.
- Score: 49.595973365500775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Various adaptation methods, such as LoRA, prompts, and adapters, have been
proposed to enhance the performance of pre-trained vision-language models in
specific domains. However, the robustness of these adaptation methods against distribution shifts has not been studied. In this study, we assess the
robustness of 11 widely-used adaptation methods across 4 vision-language
datasets under multimodal corruptions. Concretely, we introduce 7 benchmark
datasets, including 96 visual and 87 textual corruptions, to investigate the
robustness of different adaptation methods, the impact of available adaptation
examples, and the influence of trainable parameter size during adaptation. Our
analysis reveals that: 1) Adaptation methods are more sensitive to text corruptions than to visual corruptions. 2) Full fine-tuning does not consistently
provide the highest robustness; instead, adapters can achieve better robustness
with comparable clean performance. 3) Contrary to expectations, our findings indicate that increasing the amount of adaptation data and the number of trainable parameters does not guarantee enhanced robustness; instead, it can result in even lower robustness. We hope this study can benefit future research on the development of robust
multimodal adaptation methods. The benchmark, code, and dataset used in this
study can be accessed at https://adarobustness.github.io.
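
To make the protocol concrete, below is a minimal sketch of corruption-based robustness evaluation: corrupt an otherwise clean input in one modality and compare accuracy before and after. The specific corruptions and severity values here are illustrative assumptions; the benchmark's actual suite of 96 visual and 87 textual corruptions lives in the linked repository.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_noise(image: np.ndarray, severity: int = 3) -> np.ndarray:
    """Visual corruption: additive Gaussian noise (severity map assumed)."""
    sigma = [0.04, 0.06, 0.08, 0.12, 0.18][severity - 1]
    noisy = image.astype(np.float64) / 255.0 + rng.normal(0.0, sigma, image.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255).astype(np.uint8)

def char_swap(text: str, p: float = 0.1) -> str:
    """Textual corruption: randomly swap adjacent characters."""
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < p:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def relative_robustness(clean_acc: float, corrupted_acc: float) -> float:
    """One common robustness score: corrupted accuracy relative to clean."""
    return corrupted_acc / clean_acc
```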
Related papers
- UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation [93.38604803625294]
We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG).
We use Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks.
UncertaintyRAG outperforms baselines by 2.03% on LLaMA-2-7B, achieving state-of-the-art results.
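
The summary gives few details on how span-level uncertainty is computed; the sketch below illustrates one plausible SNR-style score, treating a span's mean token probability as signal and its spread as noise. Both the formulation and the chunk-ranking usage are hypothetical.

```python
import numpy as np

def span_snr(token_probs: np.ndarray) -> float:
    """SNR-style span score (hypothetical form): mean token probability
    treated as signal, its spread across the span treated as noise."""
    return float(np.mean(token_probs) / (np.std(token_probs) + 1e-8))

# Hypothetical use: rank retrieved chunks, preferring stable,
# high-probability spans over erratic ones.
chunks = {"chunk_a": np.array([0.80, 0.75, 0.78]),
          "chunk_b": np.array([0.90, 0.14, 0.67])}
ranked = sorted(chunks, key=lambda c: span_snr(chunks[c]), reverse=True)
print(ranked)  # ['chunk_a', 'chunk_b']
```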
arXiv Detail & Related papers (2024-10-03T17:39:38Z)
- A Lost Opportunity for Vision-Language Models: A Comparative Study of Online Test-Time Adaptation for Vision-Language Models [3.0495235326282186]
In deep learning, maintaining robustness against distribution shifts is critical.
This work explores a broad range of possibilities for adapting vision-language foundation models at test time.
arXiv Detail & Related papers (2024-05-23T18:27:07Z)
- Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models [38.751158173278796]
This work introduces a cross-modal parameter-efficient approach named XMAdapter.
XMAdapter establishes cache models for both text and image modalities.
It then uses retrieval over bimodal visual-language information to gather clues for inference.
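
The exact cache design is not described in the summary; the following is a minimal sketch of a key-value cache model in the Tip-Adapter family, with a hypothetical fusion weight alpha combining image-cache and text-cache evidence.

```python
import numpy as np

def cache_logits(query: np.ndarray, keys: np.ndarray, values: np.ndarray,
                 beta: float = 5.0) -> np.ndarray:
    """Key-value cache lookup in the Tip-Adapter style (details assumed).

    query:  (d,) L2-normalized feature of the test sample
    keys:   (N, d) cached L2-normalized few-shot features
    values: (N, C) one-hot labels of the cached samples
    """
    affinity = np.exp(-beta * (1.0 - keys @ query))  # similarity weights
    return affinity @ values                          # (C,) class evidence

def fused_logits(img_query, txt_query, img_cache, txt_cache, alpha=0.5):
    """Hypothetical bimodal fusion of image-cache and text-cache evidence."""
    return (alpha * cache_logits(img_query, *img_cache)
            + (1.0 - alpha) * cache_logits(txt_query, *txt_cache))
```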
arXiv Detail & Related papers (2024-04-19T02:33:23Z)
- Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models [102.72940700598055]
In reasoning tasks, even a minor error can cascade into inaccurate results.
We develop a method that avoids introducing external resources, relying instead on perturbations to the input.
Our training approach randomly masks certain tokens within the chain of thought, a technique we found to be particularly effective for reasoning tasks.
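
A minimal sketch of the masking idea: perturb the chain-of-thought tokens in the training input so the model cannot lean on every intermediate step. The mask token and masking rate below are assumptions.

```python
import random

MASK_TOKEN = "[MASK]"  # placeholder; the paper's actual mask token may differ

def mask_chain_of_thought(cot_tokens: list[str], p: float = 0.15,
                          seed: int = 0) -> list[str]:
    """Randomly mask a fraction of chain-of-thought tokens (sketch).

    The perturbed sequence is used as training input, so the model must
    recover the reasoning without relying on every intermediate token.
    """
    rng = random.Random(seed)
    return [MASK_TOKEN if rng.random() < p else tok for tok in cot_tokens]

# Example: a masked reasoning trace for a small arithmetic problem.
trace = "12 + 7 = 19 , then 19 * 2 = 38".split()
print(mask_chain_of_thought(trace, p=0.2))
```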
arXiv Detail & Related papers (2024-03-04T16:21:54Z)
- Empirical Analysis of Efficient Fine-Tuning Methods for Large Pre-Trained Language Models [4.096453902709292]
BitFit and adapter modules are compared to standard full model fine-tuning.
The BitFit approach matches full fine-tuning performance across varying amounts of training data.
Adapter modules exhibit high variability, with inconsistent gains over default models.
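
BitFit itself is simple to sketch: freeze every weight and train only the bias terms. A minimal PyTorch version, with the toy model and learning rate as placeholders:

```python
import torch
import torch.nn as nn

def apply_bitfit(model: nn.Module) -> list[nn.Parameter]:
    """Freeze all weights and leave only bias terms trainable (BitFit)."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable.append(param)
    return trainable

# Usage sketch: the optimizer only ever sees the tiny set of bias terms.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.AdamW(apply_bitfit(model), lr=1e-4)
```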
arXiv Detail & Related papers (2024-01-08T17:44:43Z)
- In Search of Lost Online Test-time Adaptation: A Survey [40.68806005826287]
This article presents a comprehensive survey of online test-time adaptation (OTTA).
We classify OTTA techniques into three primary categories and benchmark them using a modern backbone, the Vision Transformer (ViT).
Our findings diverge from existing literature, revealing that transformers demonstrate heightened resilience to diverse domain shifts.
arXiv Detail & Related papers (2023-10-31T05:47:33Z)
- Learning Representations Robust to Group Shifts and Adversarial Examples [18.742222861886148]
We propose an algorithm that combines adversarial training and group distribution robust optimization to improve representation learning.
Experiments on three image benchmark datasets illustrate that the proposed method achieves superior results on robust metrics without sacrificing much on the standard measures.
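
A rough sketch of how the two ingredients might be combined: generate one-step adversarial examples, compute per-group losses on them, and re-weight the groups with an exponentiated-gradient update as in group DRO. The composition and all hyperparameters below are assumptions.

```python
import torch
import torch.nn.functional as F

def adv_group_dro_loss(model, x, y, groups, q, n_groups,
                       eps=8 / 255, eta_q=0.01):
    """Sketch: FGSM adversarial examples combined with group DRO weighting."""
    # 1) One-step (FGSM) adversarial perturbation of the batch.
    #    (In a full training loop, zero the model grads after this backward.)
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    x_adv = (x_adv + eps * x_adv.grad.sign()).detach()

    # 2) Average loss per group on the perturbed inputs.
    per_sample = F.cross_entropy(model(x_adv), y, reduction="none")
    group_loss = torch.zeros(n_groups, device=x.device)
    for g in range(n_groups):
        mask = groups == g
        if mask.any():
            group_loss[g] = per_sample[mask].mean()

    # 3) Exponentiated-gradient update of the group weights q, then the
    #    weighted (worst-group-leaning) training loss.
    q = q * torch.exp(eta_q * group_loss.detach())
    q = q / q.sum()
    return (q * group_loss).sum(), q
```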
arXiv Detail & Related papers (2022-02-18T22:06:25Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test-time adaptation; however, they each introduce additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
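
MEMO's core idea can be sketched compactly: minimize the entropy of the prediction averaged over random augmentations of the single test input, then predict. The hyperparameters below are assumptions.

```python
import torch
import torch.nn.functional as F

def memo_adapt_and_predict(model, x, augment, n_aug=32, lr=1e-4):
    """MEMO-style test-time adaptation (sketch; hyperparameters assumed).

    Minimize the entropy of the marginal prediction, i.e. the softmax
    averaged over random augmentations of the single test input x.
    """
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    views = torch.stack([augment(x) for _ in range(n_aug)])
    marginal = F.softmax(model(views), dim=-1).mean(dim=0)
    entropy = -(marginal * marginal.clamp_min(1e-12).log()).sum()
    opt.zero_grad()
    entropy.backward()
    opt.step()
    return model(x.unsqueeze(0)).argmax(dim=-1)
```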
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Unsupervised Robust Domain Adaptation without Source Data [75.85602424699447]
We study the problem of robust domain adaptation in the setting where neither target labels nor source data are available.
We show a consistent performance improvement of over 10% in accuracy against the tested baselines on four benchmark datasets.
arXiv Detail & Related papers (2021-03-26T16:42:28Z)
- Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem).
AdaRem adjusts the parameter-wise learning rate according to whether the direction of a parameter's past changes is aligned with the direction of the current gradient.
Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
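
A loose sketch of the stated idea (the paper's exact update rule will differ): keep a running memory of each parameter's past gradient direction as a proxy for its past changes, and enlarge the step where the current gradient agrees with that memory, shrink it otherwise. The 0.5 scaling factor is an assumption.

```python
import torch

@torch.no_grad()
def adarem_style_step(params, dir_ema, lr=0.1, beta=0.9):
    """Sketch of an AdaRem-style parameter-wise learning-rate adjustment."""
    for p, m in zip(params, dir_ema):
        if p.grad is None:
            continue
        agree = torch.sign(m) * torch.sign(p.grad)  # +1 aligned, -1 opposed
        scale = 1.0 + 0.5 * agree                   # 0.5 is an assumption
        p -= lr * scale * p.grad
        m.mul_(beta).add_(p.grad, alpha=1.0 - beta)  # update direction memory
```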
arXiv Detail & Related papers (2020-10-21T14:49:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.