Prompting Diffusion Representations for Cross-Domain Semantic
Segmentation
- URL: http://arxiv.org/abs/2307.02138v1
- Date: Wed, 5 Jul 2023 09:28:25 GMT
- Title: Prompting Diffusion Representations for Cross-Domain Semantic
Segmentation
- Authors: Rui Gong, Martin Danelljan, Han Sun, Julio Delgado Mangas, Luc Van
Gool
- Abstract summary: Diffusion-pretraining achieves extraordinary domain generalization results for semantic segmentation.
We introduce a scene prompt and a prompt randomization strategy to help further disentangle the domain-invariant information when training the segmentation head.
- Score: 101.04326113360342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While originally designed for image generation, diffusion models have
recently been shown to provide excellent pretrained feature representations for
semantic segmentation. Intrigued by this result, we set out to explore how well
diffusion-pretrained representations generalize to new domains, a crucial
ability for any representation. We find that diffusion-pretraining achieves
extraordinary domain generalization results for semantic segmentation,
outperforming both supervised and self-supervised backbone networks. Motivated
by this, we investigate how to utilize the model's unique ability of taking an
input prompt, in order to further enhance its cross-domain performance. We
introduce a scene prompt and a prompt randomization strategy to help further
disentangle the domain-invariant information when training the segmentation
head. Moreover, we propose a simple but highly effective approach for test-time
domain adaptation, based on learning a scene prompt on the target domain in an
unsupervised manner. Extensive experiments conducted on four synthetic-to-real
and clear-to-adverse weather benchmarks demonstrate the effectiveness of our
approaches. Without resorting to any complex techniques, such as image
translation, augmentation, or rare-class sampling, we set a new
state-of-the-art on all benchmarks. Our implementation will be publicly
available at \url{https://github.com/ETHRuiGong/PTDiffSeg}.
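The abstract's test-time adaptation learns a scene prompt on the target domain without labels. The exact objective is not stated in this summary, so the sketch below substitutes entropy minimization, a common unsupervised test-time objective: only a prompt vector is optimized, while the (here, toy linear) segmentation head stays frozen. All shapes and names (`W_feat`, `W_prompt`, `feats`) are hypothetical stand-ins, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen head weights and unlabeled target-domain features.
n_classes, feat_dim, prompt_dim = 5, 16, 8
W_feat = rng.normal(size=(n_classes, feat_dim))      # frozen
W_prompt = rng.normal(size=(n_classes, prompt_dim))  # frozen
feats = rng.normal(size=(32, feat_dim))              # target features, no labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(prompt):
    p = softmax(feats @ W_feat.T + prompt @ W_prompt.T)
    return -(p * np.log(p + 1e-12)).sum(axis=1).mean()

prompt = np.zeros(prompt_dim)  # the only learnable parameter
h0 = mean_entropy(prompt)
for _ in range(300):
    p = softmax(feats @ W_feat.T + prompt @ W_prompt.T)
    h = -(p * np.log(p + 1e-12)).sum(axis=1)          # per-sample entropy
    dz = -p * (np.log(p + 1e-12) + h[:, None])        # dH/d(logits), closed form
    prompt -= 0.2 * (dz @ W_prompt).mean(axis=0)      # descend on the prompt only
h1 = mean_entropy(prompt)
```

Descending on the prompt alone sharpens the head's predictions on target data (`h1 < h0`) without touching any pretrained weight, mirroring the paper's idea that the prompt channel is the natural place to absorb domain shift.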
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
- StylePrompter: Enhancing Domain Generalization with Test-Time Style Priors [39.695604434738186]
In real-world applications, the sample distribution at the inference stage often differs from the one at the training stage.
This paper introduces a style prompt in the language modality to dynamically adapt the trained model.
In particular, we train a style prompter to extract style information of the current image into an embedding in the token embedding space.
Our open space partition of the style token embedding space and the hand-crafted style regularization enable the trained style prompter to handle data from unknown domains effectively.
arXiv Detail & Related papers (2024-08-17T08:35:43Z)
- Diffusion Features to Bridge Domain Gap for Semantic Segmentation [2.8616666231199424]
This paper investigates an approach that leverages sampling and fusion techniques to harness the features of diffusion models efficiently.
By leveraging the strength of the text-to-image generation capability, we introduce a new training framework that implicitly learns posterior knowledge from it.
arXiv Detail & Related papers (2024-06-02T15:33:46Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose Domain-Controlled Prompt Learning for specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
- Learning Domain Invariant Prompt for Vision-Language Models [31.581652862478965]
We propose MetaPrompt, a novel prompt learning paradigm that directly generates a domain-invariant prompt that generalizes to unseen domains.
Our method consistently and significantly outperforms existing methods.
arXiv Detail & Related papers (2022-12-08T11:23:24Z)
- Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
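The generalized Rayleigh quotient mentioned above has a standard closed-form treatment. A minimal sketch, assuming the region-based objective reduces to two symmetric matrices $A$ and $B$ built from the region of interest (the paper's exact construction is not given in this summary): the latent direction $\mathbf{v}$ is found by maximizing

```latex
\max_{\mathbf{v} \neq \mathbf{0}} \; R(\mathbf{v})
  = \frac{\mathbf{v}^{\top} A \, \mathbf{v}}{\mathbf{v}^{\top} B \, \mathbf{v}},
```

whose maximizers are the top eigenvectors of the generalized eigenproblem $A\mathbf{v} = \lambda B\mathbf{v}$. Solving an eigenproblem requires neither annotations nor training, which is why the factorization comes for free.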
arXiv Detail & Related papers (2022-02-19T17:46:02Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Towards Adaptive Semantic Segmentation by Progressive Feature Refinement [16.40758125170239]
We propose an innovative progressive feature refinement framework, along with domain adversarial learning to boost the transferability of segmentation networks.
As a result, the segmentation models trained with source domain images can be transferred to a target domain without significant performance degradation.
arXiv Detail & Related papers (2020-09-30T04:17:48Z)
- Generalizable Model-agnostic Semantic Segmentation via Target-specific Normalization [24.14272032117714]
We propose a novel domain generalization framework for the generalizable semantic segmentation task.
We exploit model-agnostic learning to simulate the domain shift problem.
Considering the data-distribution discrepancy between seen source and unseen target domains, we develop a target-specific normalization scheme.
arXiv Detail & Related papers (2020-03-27T09:25:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.