Domain Prompt Learning with Quaternion Networks
- URL: http://arxiv.org/abs/2312.08878v1
- Date: Tue, 12 Dec 2023 08:49:39 GMT
- Title: Domain Prompt Learning with Quaternion Networks
- Authors: Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang
- Abstract summary: We propose to leverage knowledge from domain-specific foundation models, via quaternion networks, to transfer the robust recognition ability of Vision-Language Models (VLMs) to specialized domains.
We present a hierarchical approach that generates vision prompt features by analyzing intermodal relationships between hierarchical language prompt features and domain-specific vision features.
Our proposed method achieves new state-of-the-art results in prompt learning.
- Score: 49.45309818782329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt learning has emerged as an effective and data-efficient technique in
large Vision-Language Models (VLMs). However, when adapting VLMs to specialized
domains such as remote sensing and medical imaging, domain prompt learning
remains underexplored. While large-scale domain-specific foundation models can
help tackle this challenge, their concentration on a single vision level makes
it challenging to prompt both vision and language modalities. To overcome this,
we propose to leverage domain-specific knowledge from domain-specific
foundation models to transfer the robust recognition ability of VLMs from
generalized to specialized domains, using quaternion networks. Specifically,
the proposed method involves using domain-specific vision features from
domain-specific foundation models to guide the transformation of generalized
contextual embeddings from the language branch into a specialized space within
the quaternion networks. Moreover, we present a hierarchical approach that
generates vision prompt features by analyzing intermodal relationships between
hierarchical language prompt features and domain-specific vision features. In
this way, quaternion networks can effectively mine the intermodal relationships
in the specific domain, facilitating domain-specific vision-language
contrastive learning. Extensive experiments on domain-specific datasets show
that our proposed method achieves new state-of-the-art results in prompt
learning.
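
The abstract's central operation is the Hamilton product, the quaternion analogue of complex multiplication, which lets one feature vector act as a structured rotation on another. As a rough illustration of how a quaternion layer could let a pooled domain-specific vision feature steer generalized language context embeddings into a specialized space, here is a minimal PyTorch sketch; the module name `QuaternionDomainProjector`, the single linear conditioning map, and the pooled-feature interface are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


def hamilton_product(q1: torch.Tensor, q2: torch.Tensor) -> torch.Tensor:
    """Hamilton product of quaternion tensors whose last dimension packs
    the four components (r, i, j, k) as equal-sized chunks."""
    r1, i1, j1, k1 = q1.chunk(4, dim=-1)
    r2, i2, j2, k2 = q2.chunk(4, dim=-1)
    return torch.cat(
        (
            r1 * r2 - i1 * i2 - j1 * j2 - k1 * k2,  # real part
            r1 * i2 + i1 * r2 + j1 * k2 - k1 * j2,  # i part
            r1 * j2 - i1 * k2 + j1 * r2 + k1 * i2,  # j part
            r1 * k2 + i1 * j2 - j1 * i2 + k1 * r2,  # k part
        ),
        dim=-1,
    )


class QuaternionDomainProjector(nn.Module):
    """Hypothetical sketch: rotate generalized language context embeddings
    into a specialized domain space, conditioned on a pooled
    domain-specific vision feature. Names and wiring are assumptions."""

    def __init__(self, dim: int):
        super().__init__()
        assert dim % 4 == 0, "quaternion features need dim divisible by 4"
        # Maps the vision feature to the quaternion that acts on the
        # language context tokens.
        self.vision_to_quat = nn.Linear(dim, dim)

    def forward(self, ctx: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        # ctx: (n_ctx, dim) learnable language context embeddings
        # vis: (dim,)      pooled feature from a domain-specific backbone
        q_vis = self.vision_to_quat(vis)     # vision-conditioned quaternion
        return hamilton_product(ctx, q_vis)  # context rotated into the domain


# Toy usage: 4 context tokens of width 512, guided by one vision feature.
proj = QuaternionDomainProjector(dim=512)
specialized_ctx = proj(torch.randn(4, 512), torch.randn(512))
print(specialized_ctx.shape)  # torch.Size([4, 512])
```

Packing the four quaternion components as equal chunks of the feature dimension is a common convention in quaternion networks; the multiplicative coupling it induces between the vision and language channels is what lets the vision feature guide the transformation, as the abstract describes.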
Related papers
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study multi-source domain generalization for text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research [44.87702042041601]
We present the Generalized Domain Prompt Learning (GDPL) framework for large-scale Vision-Language Models (VLMs).
GDPL facilitates the transfer of VLMs' robust recognition capabilities from natural vision to specialized domains, without the need for extensive data or resources.
Our framework paves the way for sustainable and inclusive VLM research, transcending the barriers between academia and industry.
arXiv Detail & Related papers (2024-05-14T14:51:12Z)
- VLLaVO: Mitigating Visual Gap through LLMs [7.352822795984628]
Cross-domain learning aims at extracting domain-invariant knowledge to reduce the domain shift between training and testing data.
We propose VLLaVO, combining Vision language models and Large Language models as Visual cross-dOmain learners.
arXiv Detail & Related papers (2024-01-06T16:33:39Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Prompt Ensemble Self-training for Open-Vocabulary Domain Adaptation [45.02052030837188]
We study open-vocabulary domain adaptation (OVDA), a new unsupervised domain adaptation framework.
We design a Prompt Ensemble Self-training (PEST) technique that exploits the synergy between vision and language.
PEST outperforms the state-of-the-art consistently across 10 image recognition tasks.
arXiv Detail & Related papers (2023-06-29T03:39:35Z)
- INDIGO: Intrinsic Multimodality for Domain Generalization [26.344372409315177]
We study how multimodal information can be leveraged in an "intrinsic" way to make systems generalize under unseen domains.
We propose IntriNsic multimodality for DomaIn GeneralizatiOn (INDIGO).
arXiv Detail & Related papers (2022-06-13T05:41:09Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Domain Conditioned Adaptation Network [90.63261870610211]
We propose a Domain Conditioned Adaptation Network (DCAN) to excite distinct convolutional channels with a domain conditioned channel attention mechanism.
This is the first work to explore the domain-wise convolutional channel activation for deep DA networks.
arXiv Detail & Related papers (2020-05-14T04:23:24Z)