A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models
- URL: http://arxiv.org/abs/2312.12730v2
- Date: Mon, 25 Mar 2024 18:49:52 GMT
- Title: A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models
- Authors: Julio Silva-RodrÃguez, Sina Hajimiri, Ismail Ben Ayed, Jose Dolz,
- Abstract summary: We show that state-of-the-artETL approaches exhibit strong performance only in narrowly-defined experimental setups.
We propose a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method.
- Score: 19.17722702457403
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient transfer learning (ETL) is receiving increasing attention to adapt large pre-trained language-vision models on downstream tasks with a few labeled samples. While significant progress has been made, we reveal that state-of-the-art ETL approaches exhibit strong performance only in narrowly-defined experimental setups, and with a careful adjustment of hyperparameters based on a large corpus of labeled samples. In particular, we make two interesting, and surprising empirical observations. First, to outperform a simple Linear Probing baseline, these methods require to optimize their hyper-parameters on each target task. And second, they typically underperform -- sometimes dramatically -- standard zero-shot predictions in the presence of distributional drifts. Motivated by the unrealistic assumptions made in the existing literature, i.e., access to a large validation set and case-specific grid-search for optimal hyperparameters, we propose a novel approach that meets the requirements of real-world scenarios. More concretely, we introduce a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method tailored to this context. We comprehensively evaluate CLAP on a broad span of datasets and scenarios, demonstrating that it consistently outperforms SoTA approaches, while yet being a much more efficient alternative.
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z) - AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning [22.950914612765494]
Fine-tuning large language models (LLMs) has achieved remarkable performance across various natural language processing tasks.
Memory-efficient Zeroth-order (MeZO) methods attempt to fine-tune LLMs using only forward passes, thereby avoiding the need for a backpropagation graph.
We propose the Adaptive Zeroth-order-Train Adaption (AdaZeta) framework, specifically designed to improve the performance and convergence of the ZO methods.
arXiv Detail & Related papers (2024-06-26T04:33:13Z) - Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios.
We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z) - Optimizing Hyperparameters with Conformal Quantile Regression [7.316604052864345]
We propose to leverage conformalized quantile regression which makes minimal assumptions about the observation noise.
This translates to quicker HPO convergence on empirical benchmarks.
arXiv Detail & Related papers (2023-05-05T15:33:39Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation.
Our FRPT with fewer learnable parameters achieves the state-of-the-art performance on three widely-used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z) - Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates and re-balancing strategies prefer tail categories.
We propose an Adaptive Fine-Grained Predicates Learning (FGPL-A) which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts performance of benchmark models on VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Parameter-free Online Test-time Adaptation [19.279048049267388]
We show how test-time adaptation methods fare for a number of pre-trained models on a variety of real-world scenarios.
We propose a particularly "conservative" approach, which addresses the problem with a Laplacian Adjusted Maximum Estimation (LAME)
Our approach exhibits a much higher average accuracy across scenarios than existing methods, while being notably faster and have a much lower memory footprint.
arXiv Detail & Related papers (2022-01-15T00:29:16Z) - Improving Hyperparameter Optimization by Planning Ahead [3.8673630752805432]
We propose a novel transfer learning approach, defined within the context of model-based reinforcement learning.
We propose a new variant of model predictive control which employs a simple look-ahead strategy as a policy.
Our experiments on three meta-datasets comparing to state-of-the-art HPO algorithms show that the proposed method can outperform all baselines.
arXiv Detail & Related papers (2021-10-15T11:46:14Z) - Bayesian Few-Shot Classification with One-vs-Each P\'olya-Gamma
Augmented Gaussian Processes [7.6146285961466]
Few-shot classification (FSC) is an important step on the path toward human-like machine learning.
We propose a novel combination of P'olya-Gamma augmentation and the one-vs-each softmax approximation that allows us to efficiently marginalize over functions rather than model parameters.
We demonstrate improved accuracy and uncertainty quantification on both standard few-shot classification benchmarks and few-shot domain transfer tasks.
arXiv Detail & Related papers (2020-07-20T19:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.