Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization
- URL: http://arxiv.org/abs/2404.00710v1
- Date: Sun, 31 Mar 2024 15:03:31 GMT
- Title: Unknown Prompt, the only Lacuna: Unveiling CLIP's Potential for Open Domain Generalization
- Authors: Mainak Singha, Ankit Jha, Shirsha Bose, Ashwin Nair, Moloud Abdar, Biplab Banerjee,
- Abstract summary: We introduce ODG-CLIP, harnessing the semantic prowess of the vision-language model, CLIP.
We conceptualize ODG as a multi-class classification challenge encompassing both known and novel categories.
We infuse images with class-discriminative knowledge derived from the prompt space to augment the fidelity of CLIP's visual embeddings.
- Score: 12.126495847808803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We delve into Open Domain Generalization (ODG), marked by domain and category shifts between training's labeled source and testing's unlabeled target domains. Existing solutions to ODG face limitations due to constrained generalizations of traditional CNN backbones and errors in detecting target open samples in the absence of prior knowledge. Addressing these pitfalls, we introduce ODG-CLIP, harnessing the semantic prowess of the vision-language model, CLIP. Our framework brings forth three primary innovations: Firstly, distinct from prevailing paradigms, we conceptualize ODG as a multi-class classification challenge encompassing both known and novel categories. Central to our approach is modeling a unique prompt tailored for detecting unknown class samples, and to train this, we employ a readily accessible stable diffusion model, elegantly generating proxy images for the open class. Secondly, aiming for domain-tailored classification (prompt) weights while ensuring a balance of precision and simplicity, we devise a novel visual stylecentric prompt learning mechanism. Finally, we infuse images with class-discriminative knowledge derived from the prompt space to augment the fidelity of CLIP's visual embeddings. We introduce a novel objective to safeguard the continuity of this infused semantic intel across domains, especially for the shared classes. Through rigorous testing on diverse datasets, covering closed and open-set DG contexts, ODG-CLIP demonstrates clear supremacy, consistently outpacing peers with performance boosts between 8%-16%. Code will be available at https://github.com/mainaksingha01/ODG-CLIP.
Related papers
- In the Era of Prompt Learning with Vision-Language Models [1.060608983034705]
We introduce textscStyLIP, a novel domain-agnostic prompt learning strategy for Domain Generalization (DG)
StyLIP disentangles visual style and content in CLIPs vision encoder by using style projectors to learn domain-specific prompt tokens.
We also propose AD-CLIP for unsupervised domain adaptation (DA), leveraging CLIPs frozen vision backbone.
arXiv Detail & Related papers (2024-11-07T17:31:21Z) - CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery [9.505699498746976]
Generalized Category Discovery (GCD) is a tool to cluster unlabeled samples of known and novel classes.
We present Across Domain Generalized Category Discovery (AD-GCD) and bring forth CDAD-NET as a remedy.
CDAD-NET is architected to synchronize potential known class samples across both the labeled (source) and unlabeled (target) datasets.
Experimentally, CDAD-NET eclipses existing literature with a performance increment of 8-15% on three AD-GCD benchmarks we present.
arXiv Detail & Related papers (2024-04-08T10:05:24Z) - Learning Class and Domain Augmentations for Single-Source Open-Domain
Generalization [15.338029608652777]
Single-source open-domain generalization (SS-ODG) addresses the challenge of labeled source domains with supervision during training and unlabeled novel target domains during testing.
We propose a novel framework called SODG-Net that simultaneously synthesizes novel domains and generates pseudo-open samples.
Our approach enhances generalization by diversifying the styles of known class samples using a novel metric criterion.
arXiv Detail & Related papers (2023-11-05T08:53:07Z) - Activate and Reject: Towards Safe Domain Generalization under Category
Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS)
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of unknown'' during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z) - Global Knowledge Calibration for Fast Open-Vocabulary Segmentation [124.74256749281625]
We introduce a text diversification strategy that generates a set of synonyms for each training category.
We also employ a text-guided knowledge distillation method to preserve the generalizable knowledge of CLIP.
Our proposed model achieves robust generalization performance across various datasets.
arXiv Detail & Related papers (2023-03-16T09:51:41Z) - Upcycling Models under Domain and Category Shift [95.22147885947732]
We introduce an innovative global and local clustering learning technique (GLC)
We design a novel, adaptive one-vs-all global clustering algorithm to achieve the distinction across different target classes.
Remarkably, in the most challenging open-partial-set DA scenario, GLC outperforms UMAD by 14.8% on the VisDA benchmark.
arXiv Detail & Related papers (2023-03-13T13:44:04Z) - Self-Paced Learning for Open-Set Domain Adaptation [50.620824701934]
Traditional domain adaptation methods presume that the classes in the source and target domains are identical.
Open-set domain adaptation (OSDA) addresses this limitation by allowing previously unseen classes in the target domain.
We propose a novel framework based on self-paced learning to distinguish common and unknown class samples.
arXiv Detail & Related papers (2023-03-10T14:11:09Z) - CLIP the Gap: A Single Domain Generalization Approach for Object
Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z) - Open Set Domain Recognition via Attention-Based GCN and Semantic
Matching Optimization [8.831857715361624]
This work presents an end-to-end model based on attention-based GCN and semantic matching optimization.
Experimental results validate that the proposed model not only has superiority on recognizing the images of known and unknown classes, but also can adapt to various openness of the target domain.
arXiv Detail & Related papers (2021-05-11T12:05:36Z) - Exploring Category-Agnostic Clusters for Open-Set Domain Adaptation [138.29273453811945]
We present Self-Ensembling with Category-agnostic Clusters (SE-CC) -- a novel architecture that steers domain adaptation with category-agnostic clusters in target domain.
clustering is performed over all the unlabeled target samples to obtain the category-agnostic clusters, which reveal the underlying data space structure peculiar to target domain.
arXiv Detail & Related papers (2020-06-11T16:19:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.