Related papers: Diffusion Features to Bridge Domain Gap for Semantic Segmentation

Diffusion Features to Bridge Domain Gap for Semantic Segmentation

URL: http://arxiv.org/abs/2406.00777v2
Date: Thu, 21 Nov 2024 09:10:23 GMT
Title: Diffusion Features to Bridge Domain Gap for Semantic Segmentation
Authors: Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu,
Abstract summary: This paper investigates the approach that leverages the sampling and fusion techniques to harness the features of diffusion models efficiently. By leveraging the strength of text-to-image generation capability, we introduce a new training framework designed to implicitly learn posterior knowledge from it.
Score: 2.8616666231199424
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Pre-trained diffusion models have demonstrated remarkable proficiency in synthesizing images across a wide range of scenarios with customizable prompts, indicating their effective capacity to capture universal features. Motivated by this, our study delves into the utilization of the implicit knowledge embedded within diffusion models to address challenges in cross-domain semantic segmentation. This paper investigates the approach that leverages the sampling and fusion techniques to harness the features of diffusion models efficiently. We propose DIffusion Feature Fusion (DIFF) as a backbone use for extracting and integrating effective semantic representations through the diffusion process. By leveraging the strength of text-to-image generation capability, we introduce a new training framework designed to implicitly learn posterior knowledge from it. Through rigorous evaluation in the contexts of domain generalization semantic segmentation, we establish that our methodology surpasses preceding approaches in mitigating discrepancies across distinct domains and attains the state-of-the-art (SOTA) benchmark.

Related papers

Boundless Across Domains: A New Paradigm of Adaptive Feature and Cross-Attention for Domain Generalization in Medical Image Segmentation [1.93061220186624]
Domain-invariant representation learning is a powerful method for domain generalization. Previous approaches face challenges such as high computational demands, training instability, and limited effectiveness with high-dimensional data. We propose an Adaptive Feature Blending (AFB) method that generates out-of-distribution samples while exploring the in-distribution space.
arXiv Detail & Related papers (2024-11-22T12:06:24Z)
Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation [56.87049651707208]
Few-shot Semantic has evolved into In-context tasks, morphing into a crucial element in assessing generalist segmentation models. Our initial focus lies in understanding how to facilitate interaction between the query image and the support image, resulting in the proposal of a KV fusion method within the self-attention framework. Based on our analysis, we establish a simple and effective framework named DiffewS, maximally retaining the original Latent Diffusion Model's generative framework.
arXiv Detail & Related papers (2024-10-03T10:33:49Z)
Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors. Comprehensive investigations unveil potential characteristics of Vermouth, such as varying granularity of perception concealed in latent variables at distinct time steps and various U-net stages. The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z)
Unified Domain Adaptive Semantic Segmentation [96.74199626935294]
Unsupervised Adaptive Domain Semantic (UDA-SS) aims to transfer the supervision from a labeled source domain to an unlabeled target domain. We propose a Quad-directional Mixup (QuadMix) method, characterized by tackling distinct point attributes and feature inconsistencies. Our method outperforms the state-of-the-art works by large margins on four challenging UDA-SS benchmarks.
arXiv Detail & Related papers (2023-11-22T09:18:49Z)
Prompting Diffusion Representations for Cross-Domain Semantic Segmentation [101.04326113360342]
diffusion-pretraining achieves extraordinary domain generalization results for semantic segmentation. We introduce a scene prompt and a prompt randomization strategy to help further disentangle the domain-invariant information when training the segmentation head.
arXiv Detail & Related papers (2023-07-05T09:28:25Z)
Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement [53.2171981279647]
We present a framework that encapsulates both the VP- and variance-exploding (VE)-based diffusion methods. To improve performance and ease model training, we analyze the common difficulties encountered in diffusion models. We evaluate our model against several methods using a public benchmark to showcase the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-14T14:22:22Z)
Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images. In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner. We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
Focus on Semantic Consistency for Cross-domain Crowd Understanding [34.560447389853614]
Some domain adaptation algorithms try to liberate it by training models with synthetic data. We found that a mass of estimation errors in the background areas impede the performance of the existing methods. In this paper, we propose a domain adaptation method to eliminate it.
arXiv Detail & Related papers (2020-02-20T08:51:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.