A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
- URL: http://arxiv.org/abs/2410.11686v1
- Date: Tue, 15 Oct 2024 15:22:30 GMT
- Title: A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
- Authors: Kun Ding, Ying Wang, Gaofeng Meng, Shiming Xiang,
- Abstract summary: Key challenge to address under the condition of limited training data is how to fine-tune pre-trained vision-language models in a parameter-efficient manner.
This paper proposes a unified computational framework to integrate existing methods together, identify their nature and support in-depth comparison.
As a demonstration, we extend existing methods by modeling inter-class correlation between representers in reproducing kernel Hilbert space (RKHS)
- Score: 38.84662767814454
- License:
- Abstract: The advent of pre-trained vision-language foundation models has revolutionized the field of zero/few-shot (i.e., low-shot) image recognition. The key challenge to address under the condition of limited training data is how to fine-tune pre-trained vision-language models in a parameter-efficient manner. Previously, numerous approaches tackling this challenge have been proposed. Meantime, a few survey papers are also published to summarize these works. However, there still lacks a unified computational framework to integrate existing methods together, identify their nature and support in-depth comparison. As such, this survey paper first proposes a unified computational framework from the perspective of Representer Theorem and then derives many of the existing methods by specializing this framework. Thereafter, a comparative analysis is conducted to uncover the differences and relationships between existing methods. Based on the analyses, some possible variants to improve the existing works are presented. As a demonstration, we extend existing methods by modeling inter-class correlation between representers in reproducing kernel Hilbert space (RKHS), which is implemented by exploiting the closed-form solution of kernel ridge regression. Extensive experiments on 11 datasets are conducted to validate the effectiveness of this method. Toward the end of this paper, we discuss the limitations and provide further research directions.
Related papers
- Constrained Reinforcement Learning with Average Reward Objective: Model-Based and Model-Free Algorithms [34.593772931446125]
monograph focuses on the exploration of various model-based and model-free approaches for Constrained within the context of average reward Markov Decision Processes (MDPs)
The primal-dual policy gradient-based algorithm is explored as a solution for constrained MDPs.
arXiv Detail & Related papers (2024-06-17T12:46:02Z) - Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution [1.3654846342364308]
State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce.
We propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task.
We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification and depression-relates symptoms detection.
arXiv Detail & Related papers (2024-05-09T12:03:38Z) - Differentiable Retrieval Augmentation via Generative Language Modeling
for E-commerce Query Intent Classification [8.59563091603226]
We propose Differentiable Retrieval Augmentation via Generative lANguage modeling(Dragan) to address this problem by a novel differentiable reformulation.
We demonstrate the effectiveness of our proposed method on a challenging NLP task in e-commerce search, namely query intent classification.
arXiv Detail & Related papers (2023-08-18T05:05:35Z) - Deep Generative Models for Decision-Making and Control [4.238809918521607]
The dual purpose of this thesis is to study the reasons for these shortcomings and to propose solutions for the uncovered problems.
We highlight how inference techniques from the contemporary generative modeling toolbox, including beam search, can be reinterpreted as viable planning strategies for reinforcement learning problems.
arXiv Detail & Related papers (2023-06-15T01:54:30Z) - CREST: A Joint Framework for Rationalization and Counterfactual Text
Generation [5.606679908174783]
We introduce CREST (ContRastive Edits with Sparse raTionalization), a framework for selective rationalization and counterfactual text generation.
CREST generates valid counterfactuals that are more natural than those produced by previous methods.
New loss function that leverages CREST counterfactuals to regularize selective rationales improves both model robustness and rationale quality.
arXiv Detail & Related papers (2023-05-26T16:34:58Z) - Diffusion Action Segmentation [63.061058214427085]
We propose a novel framework via denoising diffusion models, which shares the same inherent spirit of such iterative refinement.
In this framework, action predictions are iteratively generated from random noise with input video features as conditions.
arXiv Detail & Related papers (2023-03-31T10:53:24Z) - IRGen: Generative Modeling for Image Retrieval [82.62022344988993]
In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling.
We develop our model, dubbed IRGen, to address the technical challenge of converting an image into a concise sequence of semantic units.
Our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks and two million-scale datasets.
arXiv Detail & Related papers (2023-03-17T17:07:36Z) - Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our proposal obtains at least competitive and often better performance than state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z) - How Far are We from Effective Context Modeling? An Exploratory Study on
Semantic Parsing in Context [59.13515950353125]
We present a grammar-based decoding semantic parsing and adapt typical context modeling methods on top of it.
We evaluate 13 context modeling methods on two large cross-domain datasets, and our best model achieves state-of-the-art performances.
arXiv Detail & Related papers (2020-02-03T11:28:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.