Denoising Diffusion Semantic Segmentation with Mask Prior Modeling
- URL: http://arxiv.org/abs/2306.01721v2
- Date: Thu, 22 Jun 2023 10:23:36 GMT
- Title: Denoising Diffusion Semantic Segmentation with Mask Prior Modeling
- Authors: Zeqiang Lai, Yuchen Duan, Jifeng Dai, Ziheng Li, Ying Fu, Hongsheng
Li, Yu Qiao, Wenhai Wang
- Abstract summary: We propose to ameliorate the semantic segmentation quality of existing discriminative approaches with a mask prior modeled by a denoising diffusion generative model.
We evaluate the proposed prior modeling with several off-the-shelf segmentors, and our experimental results on ADE20K and Cityscapes demonstrate that our approach could achieve competitively quantitative performance.
- Score: 61.73352242029671
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The evolution of semantic segmentation has long been dominated by learning
more discriminative image representations for classifying each pixel. Despite
the prominent advancements, the priors of segmentation masks themselves, e.g.,
geometric and semantic constraints, are still under-explored. In this paper, we
propose to ameliorate the semantic segmentation quality of existing
discriminative approaches with a mask prior modeled by a recently-developed
denoising diffusion generative model. Beginning with a unified architecture
that adapts diffusion models for mask prior modeling, we focus this work on a
specific instantiation with discrete diffusion and identify a variety of key
design choices for its successful application. Our exploratory analysis
revealed several important findings, including: (1) a simple integration of
diffusion models into semantic segmentation is not sufficient, and a
poorly-designed diffusion process might lead to degradation in segmentation
performance; (2) during the training, the object to which noise is added is
more important than the type of noise; (3) during the inference, the strict
diffusion denoising scheme may not be essential and can be relaxed to a simpler
scheme that even works better. We evaluate the proposed prior modeling with
several off-the-shelf segmentors, and our experimental results on ADE20K and
Cityscapes demonstrate that our approach could achieve competitively
quantitative performance and more appealing visual quality.
Related papers
- Unleashing the Potential of the Diffusion Model in Few-shot Semantic Segmentation [56.87049651707208]
Few-shot Semantic has evolved into In-context tasks, morphing into a crucial element in assessing generalist segmentation models.
Our initial focus lies in understanding how to facilitate interaction between the query image and the support image, resulting in the proposal of a KV fusion method within the self-attention framework.
Based on our analysis, we establish a simple and effective framework named DiffewS, maximally retaining the original Latent Diffusion Model's generative framework.
arXiv Detail & Related papers (2024-10-03T10:33:49Z) - DiffVein: A Unified Diffusion Network for Finger Vein Segmentation and
Authentication [50.017055360261665]
We introduce DiffVein, a unified diffusion model-based framework which simultaneously addresses vein segmentation and authentication tasks.
For better feature interaction between these two branches, we introduce two specialized modules.
In this way, our framework allows for a dynamic interplay between diffusion and segmentation embeddings.
arXiv Detail & Related papers (2024-02-03T06:49:42Z) - Bridging Generative and Discriminative Models for Unified Visual
Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil potential characteristics of Vermouth, such as varying granularity of perception concealed in latent variables at distinct time steps and various U-net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z) - ProCNS: Progressive Prototype Calibration and Noise Suppression for
Weakly-Supervised Medical Image Segmentation [0.0]
Weakly-supervised segmentation (WSS) has emerged as a solution to mitigate conflict between annotation cost and model performance.
We propose a novel WSS approach, named ProCNS, encompassing two synergistic modules devised with the principles of progressive prototype calibration and noise suppression.
arXiv Detail & Related papers (2024-01-25T10:52:36Z) - A Recycling Training Strategy for Medical Image Segmentation with
Diffusion Denoising Models [8.649603931882227]
Denoising diffusion models have found applications in image segmentation by generating segmented masks conditioned on images.
In this work, we focus on improving the training strategy and propose a novel recycling method.
We show that, under a fair comparison with the same network architectures and computing budget, the proposed recycling-based diffusion models achieved on-par performance with non-diffusion-based supervised training.
arXiv Detail & Related papers (2023-08-30T23:03:49Z) - Masked Diffusion as Self-supervised Representation Learner [5.449210269462304]
Masked diffusion model (MDM) is a scalable self-supervised representation learner for semantic segmentation.
This paper decomposes the interrelation between the generative capability and representation learning ability inherent in diffusion models.
arXiv Detail & Related papers (2023-08-10T16:57:14Z) - Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving.
Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees.
In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z) - CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion
Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z) - Label-Efficient Semantic Segmentation with Diffusion Models [27.01899943738203]
We demonstrate that diffusion models can also serve as an instrument for semantic segmentation.
In particular, for several pretrained diffusion models, we investigate the intermediate activations from the networks that perform the Markov step of the reverse diffusion process.
We show that these activations effectively capture the semantic information from an input image and appear to be excellent pixel-level representations for the segmentation problem.
arXiv Detail & Related papers (2021-12-06T15:55:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.