Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning
- URL: http://arxiv.org/abs/2203.04904v1
- Date: Wed, 9 Mar 2022 17:26:53 GMT
- Title: Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning
- Authors: Zhenhailong Wang, Hang Yu, Manling Li, Han Zhao, Heng Ji
- Abstract summary: We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
- Score: 59.38343286807997
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite achieving state-of-the-art zero-shot performance, existing
vision-language models, e.g., CLIP, still fall short on domain-specific
classification tasks, e.g., Fungi Classification. In the context of few-shot
transfer learning, traditional fine-tuning fails to prevent a highly expressive
model from exploiting spurious correlations in the training data. On the other
hand, although model-agnostic meta-learning (MAML) is a natural
alternative for transfer learning, the expensive computation due to implicit
second-order optimization limits its use in large-scale models and datasets. In
this work we aim to further improve the generalization of existing
vision-language models on unseen tasks via a simple yet efficient fine-tuning
strategy based on uniform task sampling. We term our method Model-Agnostic
Multitask Fine-tuning (MAMF). Compared with MAML, MAMF discards the bi-level
optimization and uses only first-order gradients, which makes it easily
scalable and computationally efficient. Due to the uniform task sampling
procedure, MAMF consistently outperforms the classical fine-tuning method for
few-shot transfer learning on five benchmark datasets. Empirically, we further
discover that the effectiveness of first-order MAML is highly dependent on the
zero-shot performance of the pretrained model, and our simple algorithm can
outperform first-order MAML on more challenging datasets with low zero-shot
performance.
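As a concrete illustration of the procedure the abstract describes, here is a minimal PyTorch-style sketch (an assumption-laden reading of the abstract, not the authors' released code): each step uniformly samples one few-shot task and applies an ordinary first-order gradient update to the pretrained vision-language model, with no MAML-style inner/outer bi-level optimization. The task format, loss, optimizer, and hyperparameters are illustrative placeholders.

```python
import random
import torch
import torch.nn.functional as F

def mamf_finetune(model, tasks, steps=100, lr=1e-5, device="cuda"):
    """Hypothetical MAMF loop sketched from the abstract.

    model: a pretrained vision-language classifier (e.g., CLIP with a task head).
    tasks: a list of few-shot tasks, each yielding (images, labels) batches.
    """
    model.to(device).train()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    for _ in range(steps):
        task = random.choice(tasks)            # uniform task sampling
        images, labels = next(iter(task))      # one few-shot batch from that task
        images, labels = images.to(device), labels.to(device)

        loss = F.cross_entropy(model(images), labels)

        optimizer.zero_grad()
        loss.backward()                        # first-order gradients only
        optimizer.step()                       # single-level update; no outer loop
    return model
```

By contrast, MAML would backpropagate through per-task inner-loop adaptation (implicit second-order derivatives), which the abstract identifies as the main cost MAMF avoids; first-order MAML drops the second-order terms but still keeps the bi-level structure.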
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z) - MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic for merging GPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs.
arXiv Detail & Related papers (2024-06-17T10:12:45Z) - FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z) - Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning [13.964106147449051]
Existing solutions concentrate on fine-tuning the pre-trained models on conventional image datasets.
We propose a novel and effective framework based on learning Visual Prompts (VPT) in pre-trained Vision Transformers (ViT).
We demonstrate that our new approximations with semantic information provide superior representational capability.
arXiv Detail & Related papers (2024-02-04T04:42:05Z) - AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate the resulting interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Robust MAML: Prioritization task buffer with adaptive learning process
for model-agnostic meta-learning [15.894925018423665]
Model-agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm.
This paper proposes a more robust MAML based on an adaptive learning scheme and a prioritization task buffer.
Experimental results on meta reinforcement learning environments demonstrate a substantial performance gain.
arXiv Detail & Related papers (2021-03-15T09:34:34Z) - B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL on classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z) - Transfer Learning without Knowing: Reprogramming Black-box Machine
Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z) - BI-MAML: Balanced Incremental Approach for Meta Learning [9.245355087256314]
We present a novel Balanced Incremental Model Agnostic Meta Learning system (BI-MAML) for learning multiple tasks.
Our method implements a meta-update rule to incrementally adapt its model to new tasks without forgetting old tasks.
Our system performs these meta-updates using only a few shots and can successfully accomplish them.
arXiv Detail & Related papers (2020-06-12T18:28:48Z)