Efficient Test-Time Adaptation of Vision-Language Models
- URL: http://arxiv.org/abs/2403.18293v1
- Date: Wed, 27 Mar 2024 06:37:51 GMT
- Title: Efficient Test-Time Adaptation of Vision-Language Models
- Authors: Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, Eric Xing
- Abstract summary: Test-time adaptation with pre-trained vision-language models has attracted increasing attention for tackling distribution shifts at test time.
We design TDA, a training-free dynamic adapter that enables effective and efficient test-time adaptation with vision-language models.
- Score: 58.3646257833533
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Test-time adaptation with pre-trained vision-language models has attracted increasing attention for tackling distribution shifts at test time. Though prior studies have achieved very promising performance, they involve intensive computation, which is severely at odds with the requirements of test-time adaptation. We design TDA, a training-free dynamic adapter that enables effective and efficient test-time adaptation with vision-language models. TDA works with a lightweight key-value cache that maintains a dynamic queue with few-shot pseudo labels as values and the corresponding test-sample features as keys. Leveraging the key-value cache, TDA adapts to test data gradually via progressive pseudo-label refinement, which is highly efficient and incurs no backpropagation. In addition, we introduce negative pseudo labeling, which alleviates the adverse impact of pseudo-label noise by assigning pseudo labels to certain negative classes when the model is uncertain about its predictions. Extensive experiments over two benchmarks demonstrate TDA's superior effectiveness and efficiency compared with the state of the art. The code has been released at https://kdiaaa.github.io/tda/.
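The two mechanisms described in the abstract are compact enough to sketch. Below is a minimal NumPy illustration of a training-free key-value cache and of negative pseudo labeling; the class and function names (KVCache, tda_step), the shot capacity, entropy band, thresholds, and fusion weights are assumptions made for illustration, not the released TDA implementation.

```python
import numpy as np

def norm_entropy(p):
    """Prediction entropy, normalised to [0, 1]."""
    return float(-(p * np.log(p + 1e-12)).sum() / np.log(len(p)))

class KVCache:
    """Key-value cache: keys are L2-normalised test features, values are
    pseudo-label vectors. Per class, only the few most confident
    (lowest-entropy) entries are kept, so the queue stays lightweight."""

    def __init__(self, num_classes, shot_capacity=3):
        self.num_classes = num_classes
        self.shot_capacity = shot_capacity
        self.buckets = {c: [] for c in range(num_classes)}  # class -> [(key, value, H)]

    def update(self, key, value, pseudo_class, ent):
        bucket = self.buckets[pseudo_class]
        bucket.append((key, value, ent))
        bucket.sort(key=lambda t: t[2])      # most confident first
        del bucket[self.shot_capacity:]      # progressive refinement by eviction

    def predict(self, query):
        """Similarity-weighted vote over cached values -- no backprop."""
        entries = [t for b in self.buckets.values() for t in b]
        if not entries:
            return np.zeros(self.num_classes)
        keys = np.stack([t[0] for t in entries])      # (n, d)
        values = np.stack([t[1] for t in entries])    # (n, C)
        affinity = np.exp(keys @ query - 1.0)         # cosine-affinity kernel
        return affinity @ values

def tda_step(zero_shot_probs, feature, pos_cache, neg_cache,
             neg_band=(0.2, 0.5), neg_thresh=0.03, w_pos=2.0, w_neg=0.1):
    """One training-free adaptation step for a single test sample."""
    c = int(np.argmax(zero_shot_probs))
    h = norm_entropy(zero_shot_probs)

    # Positive pseudo label: one-hot on the top class, cached with its entropy.
    one_hot = np.zeros_like(zero_shot_probs)
    one_hot[c] = 1.0
    pos_cache.update(feature, one_hot, c, h)

    # Negative pseudo labels: when the prediction is uncertain, record which
    # classes are unlikely rather than committing to a single positive class.
    if neg_band[0] < h < neg_band[1]:
        neg_cache.update(feature, (zero_shot_probs < neg_thresh).astype(float), c, h)

    # Fuse zero-shot logits with positive and negative cache votes.
    return (np.log(zero_shot_probs + 1e-12)
            + w_pos * pos_cache.predict(feature)
            - w_neg * neg_cache.predict(feature))

# Toy usage with random stand-ins for CLIP features and probabilities:
rng = np.random.default_rng(0)
pos, neg = KVCache(5), KVCache(5)
for _ in range(10):
    f = rng.normal(size=64)
    f /= np.linalg.norm(f)
    logits = tda_step(rng.dirichlet(np.ones(5)), f, pos, neg)
```

The design point is that adaptation reduces to cache reads and writes: no gradients flow, so the per-sample cost stays close to plain zero-shot inference.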
Related papers
- Test-time Alignment-Enhanced Adapter for Vision-Language Models [6.549059375031384]
Test-time adaptation with pre-trained vision-language models (VLMs) has attracted increasing attention for tackling the issue of distribution shift during the test phase.
We introduce a new approach called Test-time Alignment-Enhanced Adapter (TAEA), which trains an adapter with test samples to adjust text features during the test phase.
arXiv Detail & Related papers (2024-11-24T06:43:38Z)
- DOTA: Distributional Test-Time Adaptation of Vision-Language Models [52.98590762456236]
The training-free test-time dynamic adapter (TDA) is a promising approach to addressing this issue.
We propose a simple yet effective method for DistributiOnal Test-time Adaptation (Dota).
Dota continually estimates the distributions of test samples, allowing the model to continually adapt to the deployment environment.
arXiv Detail & Related papers (2024-09-28T15:03:28Z)
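The Dota entry above rests on one idea: continually estimating the distributions of test samples. Below is a minimal sketch of that idea, assuming class-conditional Gaussians with a shared diagonal covariance updated online from pseudo-labeled features; the class name, update rule, and priors are illustrative, not the paper's exact formulation.

```python
import numpy as np

class OnlineGaussians:
    """Running class-conditional Gaussian estimates (shared diagonal
    covariance), updated from pseudo-labeled test features."""

    def __init__(self, num_classes, dim):
        self.counts = np.zeros(num_classes)
        self.means = np.zeros((num_classes, dim))
        self.var = np.ones(dim)      # shared diagonal covariance
        self.seen = 0

    def update(self, feature, pseudo_class):
        self.counts[pseudo_class] += 1
        delta = feature - self.means[pseudo_class]
        self.means[pseudo_class] += delta / self.counts[pseudo_class]
        self.seen += 1
        # Welford-style running update of the pooled diagonal variance.
        self.var += (delta * (feature - self.means[pseudo_class]) - self.var) / self.seen

    def posterior(self, feature):
        """Gaussian-discriminant posterior over classes for one feature."""
        log_lik = -0.5 * (((feature - self.means) ** 2) / self.var).sum(axis=1)
        log_prior = np.log((self.counts + 1.0) / (self.counts.sum() + len(self.counts)))
        logits = log_lik + log_prior
        p = np.exp(logits - logits.max())
        return p / p.sum()

# Each incoming test sample both gets a prediction and refines the estimates,
# so the classifier tracks the deployment distribution as it drifts.
rng = np.random.default_rng(0)
model = OnlineGaussians(num_classes=3, dim=8)
for _ in range(100):
    x = rng.normal(size=8)
    c = int(np.argmax(model.posterior(x)))   # pseudo label from current estimate
    model.update(x, c)
```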
- Few Clicks Suffice: Active Test-Time Adaptation for Semantic Segmentation [14.112999441288615]
Test-time adaptation (TTA) adapts pre-trained models during inference using unlabeled test data.
A significant performance gap remains between TTA approaches and their supervised counterparts.
We propose the ATASeg framework, which consists of two parts: a model adapter and a label annotator.
arXiv Detail & Related papers (2023-12-04T12:16:02Z)
- Noise-Tolerant Few-Shot Unsupervised Adapter for Vision-Language Models [8.59772105902647]
We design NtUA, a Noise-tolerant Unsupervised Adapter that enables learning effective target models with only a few unlabelled target samples.
NtUA works as a key-value cache that formulates visual features and predicted pseudo-labels of the few unlabelled target samples as key-value pairs.
NtUA achieves superior performance consistently across multiple widely adopted benchmarks.
arXiv Detail & Related papers (2023-09-26T13:35:31Z)
- Rethinking Precision of Pseudo Label: Test-Time Adaptation via Complementary Learning [10.396596055773012]
We propose a novel complementary learning approach to enhance test-time adaptation.
In test-time adaptation tasks, information from the source domain is typically unavailable.
We highlight that the risk function of complementary labels agrees with their vanilla loss formula.
arXiv Detail & Related papers (2023-01-15T03:36:33Z)
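The complementary-learning idea summarized above, supervising with labels that say which class a sample is *not*, has a compact loss form. Here is a hedged sketch, with the standard -log(1 - p_k) complementary loss standing in for the paper's exact risk estimator:

```python
import numpy as np

def complementary_loss(probs, not_class):
    """Complementary-label loss: drive probability mass away from a class
    the sample is believed NOT to belong to, i.e. minimise -log(1 - p_k)."""
    return -np.log(1.0 - probs[not_class] + 1e-12)

# Even a noisy predictor tends to rank truly wrong classes low, so the
# lowest-probability class is a far safer label than the argmax is.
probs = np.array([0.50, 0.30, 0.15, 0.05])
k = int(np.argmin(probs))                 # complementary label: "not class k"
print(complementary_loss(probs, k))       # ~0.051: mass is already away from k
```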
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Contrastive Test-Time Adaptation [83.73506803142693]
We propose a novel way to leverage self-supervised contrastive learning to facilitate target feature learning.
We produce pseudo labels online and refine them via soft voting among their nearest neighbors in the target feature space.
Our method, AdaContrast, achieves state-of-the-art performance on major benchmarks.
arXiv Detail & Related papers (2022-04-21T19:17:22Z)
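The refinement step summarized above, soft voting among nearest neighbours in the target feature space, is straightforward to sketch; the function name, the value of k, and cosine similarity as the metric are assumptions:

```python
import numpy as np

def refine_pseudo_labels(feats, probs, k=5):
    """Soft voting: each sample's pseudo label becomes the average predicted
    distribution of its k nearest neighbours in the target feature space."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T                           # (n, n) cosine similarities
    np.fill_diagonal(sim, -np.inf)          # a sample never votes for itself
    nn = np.argsort(-sim, axis=1)[:, :k]    # indices of the k nearest neighbours
    soft = probs[nn].mean(axis=1)           # (n, C) neighbourhood-averaged labels
    return soft.argmax(axis=1), soft

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 32))            # target features (stand-in)
probs = rng.dirichlet(np.ones(10), size=200)  # raw online predictions
hard, soft = refine_pseudo_labels(feats, probs)
```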
- Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z)
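Both components named above can be sketched compactly: an entropy-plus-redundancy filter for active sample selection, and a Fisher-weighted penalty that anchors important parameters. The thresholds, the redundancy test, and the function names are illustrative; the paper's actual criteria (e.g., how it tracks past predictions) may differ.

```python
import numpy as np

def is_reliable_and_novel(probs, recent_probs, ent_thresh=0.4, sim_thresh=0.95):
    """Active sample selection: adapt only on samples whose prediction is
    confident (low normalised entropy) and dissimilar from recent ones."""
    ent = -(probs * np.log(probs + 1e-12)).sum() / np.log(len(probs))
    if ent > ent_thresh:
        return False                        # unreliable: skip this sample
    for q in recent_probs:
        cos = probs @ q / (np.linalg.norm(probs) * np.linalg.norm(q) + 1e-12)
        if cos > sim_thresh:
            return False                    # redundant: adds little new signal
    return True

def fisher_penalty(params, anchors, fisher, lam=2000.0):
    """Anti-forgetting regulariser: parameters that were important on the
    original task (high Fisher value) are anchored to their initial values."""
    return lam * sum(float((f * (p - a) ** 2).sum())
                     for p, a, f in zip(params, anchors, fisher))

print(is_reliable_and_novel(np.array([0.9, 0.05, 0.05]), recent_probs=[]))  # True
p = [np.array([1.0, 2.0])]
a = [np.array([1.1, 1.9])]
f = [np.array([5.0, 0.1])]
print(fisher_penalty(p, a, f))   # drift in the high-Fisher weight dominates
```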
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose Dash, a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Dash adaptively selects which unlabeled examples to train on via a dynamically adjusted threshold.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
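Dash's dynamic thresholding can be sketched in a few lines; the decay schedule, constants, and function name here are illustrative rather than the paper's exact rule.

```python
import numpy as np

def dash_select(losses, step, rho0=1.0, gamma=2.0):
    """Dynamic thresholding: keep unlabeled examples whose current loss
    falls below a threshold that decays over training, so selection starts
    permissive and becomes stricter as the model improves."""
    rho = rho0 * gamma ** (-step)       # decaying threshold (illustrative schedule)
    return losses < rho

losses = np.array([0.05, 0.40, 1.20, 0.80])
for t in range(4):
    print(t, dash_select(losses, t))    # fewer examples pass as t grows
```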