Amodal Segmentation Based on Visible Region Segmentation and Shape Prior
- URL: http://arxiv.org/abs/2012.05598v2
- Date: Sat, 19 Dec 2020 13:24:36 GMT
- Title: Amodal Segmentation Based on Visible Region Segmentation and Shape Prior
- Authors: Yuting Xiao, Yanyu Xu, Ziming Zhong, Weixin Luo, Jiawei Li, Shenghua Gao
- Abstract summary: We propose a framework that mimics human amodal perception and resolves the ambiguity in learning.
Our model infers the amodal mask by concentrating on the visible region and utilizing the shape prior in the memory.
Experiments show that our proposed model outperforms existing state-of-the-art methods.
- Score: 43.40655235118393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Almost all existing amodal segmentation methods infer occluded regions from features of the whole image. This runs counter to human amodal perception, in which a person uses the visible part of the target and prior knowledge of its shape to infer the occluded region. To mimic this behavior and resolve the ambiguity in learning, we propose a framework that first estimates a coarse visible mask and a coarse amodal mask. Based on these coarse predictions, our model then infers the amodal mask by concentrating on the visible region and utilizing the shape priors stored in a memory. In this way, features corresponding to the background and the occluder are suppressed during amodal mask estimation, so the predicted amodal mask is unaffected by the identity of the occluder given the same visible region. Leveraging shape priors makes the amodal mask estimation more robust and reasonable. We evaluate our model on three datasets, and experiments show that it outperforms existing state-of-the-art methods. Visualizations of the shape priors indicate that the category-specific features in the codebook have a degree of interpretability.
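The abstract describes two mechanisms: gating features by the coarse visible mask so background and occluder features are suppressed, and retrieving a category-specific shape prior from a memory codebook. The following is a minimal numpy sketch of those two ideas, not the authors' implementation; the function names, the pooled-descriptor query, and the nearest-neighbour retrieval are illustrative assumptions.

```python
import numpy as np

def gate_by_visible_mask(features, visible_mask):
    """Suppress features outside the (coarse) visible region.

    features: (C, H, W) feature map; visible_mask: (H, W) in [0, 1].
    Broadcasting multiplies every channel by the mask.
    """
    return features * visible_mask[None, :, :]

def retrieve_shape_prior(query, codebook):
    """Nearest-neighbour lookup of a shape-prior entry in a memory codebook.

    query: (D,) pooled descriptor of the visible region;
    codebook: (K, D) learned category-specific entries.
    Returns the index of the closest codebook entry.
    """
    dists = np.linalg.norm(codebook - query[None, :], axis=1)
    return int(np.argmin(dists))

# Toy example: an 8-channel 4x4 feature map whose left half is visible.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
vis = np.zeros((4, 4))
vis[:, :2] = 1.0                              # left half visible
gated = gate_by_visible_mask(feats, vis)      # occluder/background zeroed
query = gated.mean(axis=(1, 2))               # pooled descriptor
codebook = np.stack([np.ones(8), -np.ones(8)])  # tiny 2-entry "memory"
prior_idx = retrieve_shape_prior(query, codebook)
```

In the paper's pipeline the retrieved prior would condition the refinement of the coarse amodal mask; here the retrieval step is shown only up to selecting the codebook entry.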
Related papers
- Amodal Instance Segmentation with Diffusion Shape Prior Estimation [10.064183379778388]
Amodal Instance (AIS) presents an intriguing challenge, including the segmentation prediction of both visible and occluded parts of objects within images.
Previous methods have often relied on shape prior information gleaned from training data to enhance amodal segmentation.
Recent advancements highlight the potential of conditioned diffusion models, pretrained on extensive datasets, to generate images from latent space.
arXiv Detail & Related papers (2024-09-26T19:59:12Z)
- MaskInversion: Localized Embeddings via Optimization of Explainability Maps [49.50785637749757]
MaskInversion generates a context-aware embedding for a query image region specified by a mask at test time.
It can be used for a broad range of tasks, including open-vocabulary class retrieval, referring expression comprehension, as well as for localized captioning and image generation.
arXiv Detail & Related papers (2024-07-29T14:21:07Z)
- Sequential Amodal Segmentation via Cumulative Occlusion Learning [15.729212571002906]
A visual system must be able to segment both the visible and occluded regions of objects, while discerning their occlusion order.
We introduce a diffusion model with cumulative occlusion learning designed for sequential amodal segmentation of objects with uncertain categories.
This model iteratively refines the prediction using the cumulative mask strategy during diffusion, effectively capturing the uncertainty of invisible regions.
It is akin to the human capability for amodal perception, i.e., to decipher the spatial ordering among objects and accurately predict complete contours for occluded objects in densely layered visual scenes.
arXiv Detail & Related papers (2024-05-09T14:17:26Z)
- ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation [11.51684042494713]
We introduce ShapeFormer, a Transformer-based model with a visible-to-amodal transition.
It models the explicit relationship between the output segmentations and avoids the need for amodal-to-visible transitions.
ShapeFormer comprises three key modules: (i) a Visible-Occluding Mask Head for predicting visible segmentation with occlusion awareness, (ii) a Shape-Prior Amodal Mask Head for predicting amodal and occluded masks, and (iii) a Category-Specific Shape Prior Retriever to provide shape prior knowledge.
arXiv Detail & Related papers (2024-03-18T00:03:48Z)
- BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion [10.57956193654977]
Box-level supervised amodal segmentation sidesteps costly amodal annotation by relying solely on ground-truth bounding boxes and instance classes as supervision.
We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks.
Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
arXiv Detail & Related papers (2024-01-03T09:37:03Z)
- Amodal Ground Truth and Completion in the Wild [84.54972153436466]
We use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
This pipeline is used to construct an amodal completion evaluation benchmark, MP3D-Amodal, consisting of a variety of object categories and labels.
arXiv Detail & Related papers (2023-12-28T18:59:41Z)
- Denoising Diffusion Semantic Segmentation with Mask Prior Modeling [61.73352242029671]
We propose to ameliorate the semantic segmentation quality of existing discriminative approaches with a mask prior modeled by a denoising diffusion generative model.
We evaluate the proposed prior modeling with several off-the-shelf segmentors, and our experimental results on ADE20K and Cityscapes demonstrate that our approach achieves competitive quantitative performance.
arXiv Detail & Related papers (2023-06-02T17:47:01Z)
- Towards Improved Input Masking for Convolutional Neural Networks [66.99060157800403]
We propose a new masking method for CNNs, which we call layer masking.
We show that our method is able to eliminate or minimize the influence of the mask shape or color on the output of the model.
We also demonstrate how the shape of the mask may leak information about the class, thus affecting estimates of model reliance on class-relevant features.
arXiv Detail & Related papers (2022-11-26T19:31:49Z)
- What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z)
- A Weakly Supervised Amodal Segmenter with Boundary Uncertainty Estimation [35.103437828235826]
This paper addresses weakly supervised amodal instance segmentation.
The goal is to segment both visible and occluded (amodal) object parts, while training provides only ground-truth visible (modal) segmentations.
arXiv Detail & Related papers (2021-08-23T02:27:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.