Dual-Path Stable Soft Prompt Generation for Domain Generalization
- URL: http://arxiv.org/abs/2505.18770v1
- Date: Sat, 24 May 2025 16:20:06 GMT
- Title: Dual-Path Stable Soft Prompt Generation for Domain Generalization
- Authors: Yuedi Zhang, Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Badong Chen
- Abstract summary: Domain generalization (DG) aims to learn a model using data from one or multiple related but distinct source domains. Prompt tuning has emerged as an effective generalization strategy, but it often struggles to capture domain-specific features due to its reliance on manually designed or fixed prompt inputs. Recent prompt generation methods address this limitation by dynamically generating instance-specific and domain-specific prompts.
- Score: 13.957351735394683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain generalization (DG) aims to learn a model using data from one or multiple related but distinct source domains that can generalize well to unseen out-of-distribution target domains. Inspired by the success of large pre-trained vision-language models (VLMs), prompt tuning has emerged as an effective generalization strategy. However, it often struggles to capture domain-specific features due to its reliance on manually designed or fixed prompt inputs. Recently, some prompt generation methods have addressed this limitation by dynamically generating instance-specific and domain-specific prompts for each input, enriching domain information and demonstrating potential for enhanced generalization. Through further investigation, we identify a notable issue in existing prompt generation methods: the same input often yields significantly different and suboptimal prompts across different random seeds, a phenomenon we term Prompt Variability. To address this, we introduce negative learning into the prompt generation process and propose Dual-Path Stable Soft Prompt Generation (DPSPG), a transformer-based framework designed to improve both the stability and generalization of prompts. Specifically, DPSPG incorporates a complementary prompt generator to produce negative prompts, thereby reducing the risk of introducing misleading information. Both theoretical and empirical analyses demonstrate that negative learning leads to more robust and effective prompts by increasing the effective margin and reducing the upper bound of the gradient norm. Extensive experiments on five DG benchmark datasets show that DPSPG consistently outperforms state-of-the-art methods while maintaining prompt stability.
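The dual-path design described in the abstract lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch illustration of a transformer-based prompt generator with a positive path and a complementary negative path, together with a standard negative-learning loss of the form -log(1 - p_y). All module names, dimensions, and the exact loss are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a dual-path soft prompt generator in the spirit of DPSPG.
# Module names, dimensions, and the loss are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PathGenerator(nn.Module):
    """One transformer path mapping an image feature to soft prompt tokens."""

    def __init__(self, feat_dim=512, n_ctx=4, ctx_dim=512, n_layers=2, n_heads=8):
        super().__init__()
        # Learnable query tokens, one per context vector of the soft prompt.
        self.queries = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        self.in_proj = nn.Linear(feat_dim, ctx_dim)
        layer = nn.TransformerDecoderLayer(d_model=ctx_dim, nhead=n_heads,
                                           batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)

    def forward(self, image_feat):                              # (B, feat_dim)
        memory = self.in_proj(image_feat).unsqueeze(1)          # (B, 1, ctx_dim)
        queries = self.queries.unsqueeze(0).expand(image_feat.size(0), -1, -1)
        return self.decoder(queries, memory)                    # (B, n_ctx, ctx_dim)


class DualPathPromptGenerator(nn.Module):
    """Positive path: prompt used with class names for classification.
    Negative path: complementary prompt trained via negative learning."""

    def __init__(self, **kw):
        super().__init__()
        self.positive = PathGenerator(**kw)
        self.negative = PathGenerator(**kw)

    def forward(self, image_feat):
        return self.positive(image_feat), self.negative(image_feat)


def negative_learning_loss(neg_logits, target):
    """Discourage the negative prompt from matching the true class:
    L = -log(1 - p_y), a standard negative-learning objective."""
    p_y = F.softmax(neg_logits, dim=-1).gather(1, target.unsqueeze(1)).squeeze(1)
    return -torch.log1p(-p_y.clamp(max=1.0 - 1e-6)).mean()
```

In a CLIP-style setup, the generated context vectors would be prepended to each class-name embedding before the text encoder; the positive prompt is trained with the usual cross-entropy, while the negative prompt's loss shrinks the probability assigned to the true class, which is the mechanism the abstract credits with enlarging the effective margin.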
Related papers
- Text-Driven Causal Representation Learning for Source-Free Domain Generalization [82.75041792888274]
We propose TDCRL, the first method to integrate causal inference into the source-free domain generalization setting. Our approach offers a clear and effective mechanism for learning domain-invariant features, ensuring robust generalization.
arXiv Detail & Related papers (2025-07-14T06:20:42Z)
- Generative Classifier for Domain Generalization [84.92088101715116]
Domain generalization aims to improve the generalizability of computer vision models under distribution shifts. We propose Generative Classifier-driven Domain Generalization (GCDG), which consists of three key modules: Heterogeneity Learning (HLC), Spurious Correlation (SCB), and Diverse Component Balancing (DCB).
arXiv Detail & Related papers (2025-04-03T04:38:33Z)
- CAT: Class Aware Adaptive Thresholding for Semi-Supervised Domain Generalization [0.989976359821412]
Domain generalization seeks to transfer knowledge from source domains to unseen target domains, even in the presence of domain shifts. We propose a novel method, CAT, which leverages semi-supervised learning with limited labeled data to achieve competitive generalization performance under domain shifts. Our approach uses flexible thresholding to generate high-quality pseudo-labels with higher class diversity while refining noisy pseudo-labels to improve their reliability (a minimal thresholding sketch appears after this list).
arXiv Detail & Related papers (2024-12-11T15:47:01Z)
- Soft Prompt Generation for Domain Generalization [13.957351735394683]
Large pre-trained vision-language models (VLMs) have shown impressive zero-shot ability on downstream tasks with manually designed prompts.
To further adapt VLMs to downstream tasks, soft prompts have been proposed to replace manually designed prompts.
arXiv Detail & Related papers (2024-04-30T06:33:07Z)
- Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation [15.907643838530655]
In this paper, we introduce a novel model referred to as Contrastive Uncertainty Domain Generalisation Network (CUDGNet).
The key idea is to augment the source capacity in both input and label spaces through the fictitious domain generator.
Our method also provides efficient uncertainty estimation at inference time from a single forward pass through the generator subnetwork.
arXiv Detail & Related papers (2024-03-12T10:47:45Z)
- Memory-Efficient Prompt Tuning for Incremental Histopathology Classification [69.46798702300042]
We present a memory-efficient prompt tuning framework to cultivate the model's generalization potential at an economical memory cost.
We have extensively evaluated our framework with two histopathology tasks, i.e., breast cancer metastasis classification and epithelium-stroma tissue classification.
arXiv Detail & Related papers (2024-01-22T03:24:45Z)
- HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization [69.33162366130887]
Domain Generalization (DG) endeavors to create machine learning models that excel in unseen scenarios by learning invariant features.
We introduce a novel method designed to supplement the model with domain-level and task-specific characteristics.
This approach aims to guide the model in more effectively separating invariant features from specific characteristics, thereby boosting generalization.
arXiv Detail & Related papers (2024-01-18T04:23:21Z)
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
- Randomized Adversarial Style Perturbations for Domain Generalization [49.888364462991234]
We propose a novel domain generalization technique, referred to as Randomized Adversarial Style Perturbation (RASP).
The proposed algorithm perturbs the style of a feature in an adversarial direction toward a randomly selected class, so the model learns not to be misled by unexpected styles observed in unseen target domains (a minimal perturbation sketch appears after this list).
We evaluate the proposed algorithm via extensive experiments on various benchmarks and show that our approach improves domain generalization performance, especially in large-scale benchmarks.
arXiv Detail & Related papers (2023-04-04T17:07:06Z)
- Learning Domain Invariant Prompt for Vision-Language Models [31.581652862478965]
We propose MetaPrompt, a novel prompt learning paradigm that directly generates a domain-invariant prompt generalizable to unseen domains.
Our method consistently and significantly outperforms existing methods.
arXiv Detail & Related papers (2022-12-08T11:23:24Z)
- Learning Domain Invariant Representations for Generalizable Person Re-Identification [71.35292121563491]
Generalizable person Re-Identification (ReID) has attracted growing attention in the computer vision community.
We introduce causality into person ReID and propose a novel generalizable framework, named Domain Invariant Representations for generalizable person Re-Identification (DIR-ReID).
arXiv Detail & Related papers (2021-03-29T18:59:48Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
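For the CAT entry above, here is a small, hypothetical sketch of class-aware adaptive thresholding for pseudo-labels. The EMA update and the FlexMatch-style threshold scaling are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of class-aware adaptive thresholding for pseudo-labels,
# in the spirit of the CAT entry above. The EMA update and FlexMatch-style
# scaling are illustrative assumptions, not the paper's exact rule.
import torch
import torch.nn.functional as F


def pseudo_label(logits_u, class_status, base_thresh=0.95, momentum=0.999):
    """logits_u: (B, C) logits on unlabeled data; class_status: (C,) EMA of
    per-class confidence. Returns pseudo-labels, an acceptance mask, and the
    updated status."""
    probs = F.softmax(logits_u, dim=-1)
    conf, pred = probs.max(dim=-1)
    # Track how confidently each class is currently being learned.
    for c in pred.unique():
        m = pred == c
        class_status[c] = momentum * class_status[c] + (1 - momentum) * conf[m].mean()
    # Lower the bar for slow-learning classes to keep pseudo-labels class-diverse.
    thresh = base_thresh * class_status[pred] / class_status.max()
    mask = conf >= thresh
    return pred, mask, class_status
```

In use, `class_status` would start as `torch.full((C,), 1.0 / C)` and be threaded through successive training steps so the per-class thresholds adapt as learning progresses.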
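And for the RASP entry, a hypothetical single-step sketch of perturbing AdaIN-style channel statistics toward a randomly chosen class. The one-step FGSM-style attack, the step size, and the `head` callable (standing in for the rest of the network) are all assumptions.

```python
# Hypothetical sketch of randomized adversarial style perturbation, in the
# spirit of the RASP entry above. A single FGSM-style step on AdaIN channel
# statistics; `head`, the step size, and the one-step attack are assumptions.
import torch
import torch.nn.functional as F


def rasp_perturb(feat, head, num_classes, eps=0.1):
    """feat: (B, C, H, W) intermediate features; head: the remainder of the
    network, mapping features to (B, num_classes) logits."""
    feat = feat.detach()
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sigma = feat.std(dim=(2, 3), keepdim=True) + 1e-6
    normalized = (feat - mu) / sigma
    mu_p = mu.clone().requires_grad_(True)
    sigma_p = sigma.clone().requires_grad_(True)
    rand_cls = torch.randint(0, num_classes, (feat.size(0),), device=feat.device)
    # Take one gradient step that moves the style statistics toward the
    # randomly selected class (descent on its cross-entropy).
    loss = F.cross_entropy(head(normalized * sigma_p + mu_p), rand_cls)
    g_mu, g_sigma = torch.autograd.grad(loss, [mu_p, sigma_p])
    with torch.no_grad():
        mu_adv = mu_p - eps * g_mu.sign()
        sigma_adv = (sigma_p - eps * g_sigma.sign()).clamp_min(1e-6)
    return (normalized * sigma_adv + mu_adv).detach()
```

Training on both the original and the restyled features then teaches the model not to be misled by unexpected styles in unseen target domains.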
This list is automatically generated from the titles and abstracts of the papers in this site.