Transitive Vision-Language Prompt Learning for Domain Generalization
- URL: http://arxiv.org/abs/2404.18758v1
- Date: Mon, 29 Apr 2024 14:56:11 GMT
- Title: Transitive Vision-Language Prompt Learning for Domain Generalization
- Authors: Liyuan Wang, Yan Jin, Zhen Chen, Jinlin Wu, Mengke Li, Yang Lu, Hanzi Wang,
- Abstract summary: The vision-language pre-training has enabled deep models to make a huge step forward in generalizing across unseen domains.
However, there are still some issues that an advancement still suffers from trading-off between domain invariance and class separability.
- Score: 41.484858946789664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vision-language pre-training has enabled deep models to make a huge step forward in generalizing across unseen domains. The recent learning method based on the vision-language pre-training model is a great tool for domain generalization and can solve this problem to a large extent. However, there are still some issues that an advancement still suffers from trading-off between domain invariance and class separability, which are crucial in current DG problems. However, there are still some issues that an advancement still suffers from trading-off between domain invariance and class separability, which are crucial in current DG problems. In this paper, we introduce a novel prompt learning strategy that leverages deep vision prompts to address domain invariance while utilizing language prompts to ensure class separability, coupled with adaptive weighting mechanisms to balance domain invariance and class separability. Extensive experiments demonstrate that deep vision prompts effectively extract domain-invariant features, significantly improving the generalization ability of deep models and achieving state-of-the-art performance on three datasets.
Related papers
- FEED: Fairness-Enhanced Meta-Learning for Domain Generalization [13.757379847454372]
Generalizing to out-of-distribution data while aware of model fairness is a significant and challenging problem in meta-learning.
This paper introduces an approach to fairness-aware meta-learning that significantly enhances domain generalization capabilities.
arXiv Detail & Related papers (2024-11-02T17:34:33Z) - Robust Domain Generalization for Multi-modal Object Recognition [14.128747255526012]
In multi-label classification, machine learning encounters the challenge of domain generalization when handling tasks with differing distributions from the training data.
Recent advancements in vision-language pre-training leverage supervision from extensive visual-language pairs, enabling learning across diverse domains.
This paper proposes solutions by inferring the actual loss, broadening evaluations to larger vision-language backbones, and introducing Mixup-CLIPood.
arXiv Detail & Related papers (2024-08-11T17:13:21Z) - Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving [49.03947018718156]
We propose a unified domain generalization framework to be utilized during the training and inference stages of collaborative perception.
We also introduce an intra-system domain alignment mechanism to reduce or potentially eliminate the domain discrepancy among connected and autonomous vehicles.
arXiv Detail & Related papers (2023-11-28T12:52:49Z) - Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization [5.124256074746721]
We argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scaled representations of the network.
We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales.
We show that our model is able to surpass the performance of previous DG methods and consistently produce competitive and state-of-the-art results in all datasets.
arXiv Detail & Related papers (2023-08-28T08:54:27Z) - Trade-off between reconstruction loss and feature alignment for domain
generalization [30.459247038765568]
Domain generalization (DG) is a branch of transfer learning that aims to train the learning models on several seen domains and subsequently apply these pre-trained models to other unseen (unknown but related) domains.
To deal with challenging settings in DG where both data and label of the unseen domain are not available at training time, the most common approach is to design the classifiers based on the domain-invariant representation features.
Contrary to popular belief, we show that designing classifiers based on invariant representation features alone is necessary but insufficient in DG.
arXiv Detail & Related papers (2022-10-26T19:40:25Z) - A Style and Semantic Memory Mechanism for Domain Generalization [108.98041306507372]
Intra-domain style invariance is of pivotal importance in improving the efficiency of domain generalization.
We propose a novel "jury" mechanism, which is particularly effective in learning useful semantic feature commonalities among domains.
Our proposed framework surpasses the state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2021-12-14T16:23:24Z) - Improving Transferability of Domain Adaptation Networks Through Domain
Alignment Layers [1.3766148734487902]
Multi-source unsupervised domain adaptation (MSDA) aims at learning a predictor for an unlabeled domain by assigning weak knowledge from a bag of source models.
We propose to embed Multi-Source version of DomaIn Alignment Layers (MS-DIAL) at different levels of the predictor.
Our approach can improve state-of-the-art MSDA methods, yielding relative gains of up to +30.64% on their classification accuracies.
arXiv Detail & Related papers (2021-09-06T18:41:19Z) - Coarse to Fine: Domain Adaptive Crowd Counting via Adversarial Scoring
Network [58.05473757538834]
This paper proposes a novel adversarial scoring network (ASNet) to bridge the gap across domains from coarse to fine granularity.
Three sets of migration experiments show that the proposed methods achieve state-of-the-art counting performance.
arXiv Detail & Related papers (2021-07-27T14:47:24Z) - A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z) - Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation [66.74638960925854]
Partial domain adaptation (PDA) deals with a realistic and challenging problem when the source domain label space substitutes the target domain.
We propose an Adaptively-Accumulated Knowledge Transfer framework (A$2$KT) to align the relevant categories across two domains.
arXiv Detail & Related papers (2020-08-27T00:53:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.