MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model
for Few-Shot Instance Segmentation
- URL: http://arxiv.org/abs/2303.05105v2
- Date: Sun, 21 Jan 2024 23:04:32 GMT
- Title: MaskDiff: Modeling Mask Distribution with Diffusion Probabilistic Model
for Few-Shot Instance Segmentation
- Authors: Minh-Quan Le, Tam V. Nguyen, Trung-Nghia Le, Thanh-Toan Do, Minh N.
Do, Minh-Triet Tran
- Abstract summary: Few-shot instance segmentation extends the few-shot learning paradigm to the instance segmentation task.
Conventional approaches have attempted to address the task via prototype learning, known as point estimation.
We propose a novel approach, dubbed MaskDiff, which models the underlying conditional distribution of a binary mask.
- Score: 31.648523213206595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot instance segmentation extends the few-shot learning paradigm to the
instance segmentation task, which tries to segment instance objects from a
query image with a few annotated examples of novel categories. Conventional
approaches have attempted to address the task via prototype learning, known as
point estimation. However, this mechanism depends on prototypes (\eg mean of
$K-$shot) for prediction, leading to performance instability. To overcome the
disadvantage of the point estimation mechanism, we propose a novel approach,
dubbed MaskDiff, which models the underlying conditional distribution of a
binary mask, which is conditioned on an object region and $K-$shot information.
Inspired by augmentation approaches that perturb data with Gaussian noise for
populating low data density regions, we model the mask distribution with a
diffusion probabilistic model. We also propose to utilize classifier-free
guided mask sampling to integrate category information into the binary mask
generation process. Without bells and whistles, our proposed method
consistently outperforms state-of-the-art methods on both base and novel
classes of the COCO dataset while simultaneously being more stable than
existing methods. The source code is available at:
https://github.com/minhquanlecs/MaskDiff.
Related papers
- Bridge the Points: Graph-based Few-shot Segment Anything Semantically [79.1519244940518]
Recent advancements in pre-training techniques have enhanced the capabilities of vision foundation models.
Recent studies extend the SAM to Few-shot Semantic segmentation (FSS)
We propose a simple yet effective approach based on graph analysis.
arXiv Detail & Related papers (2024-10-09T15:02:28Z) - Hybrid diffusion models: combining supervised and generative pretraining for label-efficient fine-tuning of segmentation models [55.2480439325792]
We propose a new pretext task, which is to perform simultaneously image denoising and mask prediction on the first domain.
We show that fine-tuning a model pretrained using this approach leads to better results than fine-tuning a similar model trained using either supervised or unsupervised pretraining.
arXiv Detail & Related papers (2024-08-06T20:19:06Z) - MaskUno: Switch-Split Block For Enhancing Instance Segmentation [0.0]
We propose replacing mask prediction with a Switch-Split block that processes refined ROIs, classifies them, and assigns them to specialized mask predictors.
An increase in the mean Average Precision (mAP) of 2.03% was observed for the high-performing DetectoRS when trained on 80 classes.
arXiv Detail & Related papers (2024-07-31T10:12:14Z) - ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation [0.8213829427624407]
Domain adaptive semantic segmentation aims to generate accurate and dense predictions for an unlabeled target domain.
We propose the ProtoGMM model, which incorporates the GMM into contrastive losses to perform guided contrastive learning.
To achieve increased intra-class semantic similarity, decreased inter-class similarity, and domain alignment between the source and target domains, we employ multi-prototype contrastive learning.
arXiv Detail & Related papers (2024-06-27T14:50:50Z) - SegRefiner: Towards Model-Agnostic Segmentation Refinement with Discrete
Diffusion Process [102.18226145874007]
We propose a model-agnostic solution called SegRefiner to enhance the quality of object masks produced by different segmentation models.
SegRefiner takes coarse masks as inputs and refines them using a discrete diffusion process.
It consistently improves both the segmentation metrics and boundary metrics across different types of coarse masks.
arXiv Detail & Related papers (2023-12-19T18:53:47Z) - Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models [68.73086826874733]
We introduce a novel Referring Diffusional segmentor (Ref-Diff) for referring image segmentation.
We demonstrate that without a proposal generator, a generative model alone can achieve comparable performance to existing SOTA weakly-supervised models.
This indicates that generative models are also beneficial for this task and can complement discriminative models for better referring segmentation.
arXiv Detail & Related papers (2023-08-31T14:55:30Z) - DFormer: Diffusion-guided Transformer for Universal Image Segmentation [86.73405604947459]
The proposed DFormer views universal image segmentation task as a denoising process using a diffusion model.
At inference, our DFormer directly predicts the masks and corresponding categories from a set of randomly-generated masks.
Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on MS COCO val 2017 set.
arXiv Detail & Related papers (2023-06-06T06:33:32Z) - Few-shot semantic segmentation via mask aggregation [5.886986014593717]
Few-shot semantic segmentation aims to recognize novel classes with only very few labelled data.
Previous works have typically regarded it as a pixel-wise classification problem.
We introduce a mask-based classification method for addressing this problem.
arXiv Detail & Related papers (2022-02-15T07:13:09Z) - Meta Mask Correction for Nuclei Segmentation in Histopathological Image [5.36728433027615]
We propose a novel meta-learning-based nuclei segmentation method to leverage data with noisy masks.
Specifically, we design a fully conventional meta-model that can correct noisy masks using a small amount of clean meta-data.
We show that our method achieves the state-of-the-art result.
arXiv Detail & Related papers (2021-11-24T13:53:35Z) - Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our proposal obtains at least competitive and often better performance than state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z) - Mask-guided sample selection for Semi-Supervised Instance Segmentation [13.091166009687058]
We propose a sample selection approach to decide which samples to annotate for semi-supervised instance segmentation.
Our method consists in first predicting pseudo-masks for the unlabeled pool of samples, together with a score predicting the quality of the mask.
We study which samples are better to annotate given the quality score, and show how our approach outperforms a random selection.
arXiv Detail & Related papers (2020-08-25T14:44:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.