BLADE: Box-Level Supervised Amodal Segmentation through Directed
Expansion
- URL: http://arxiv.org/abs/2401.01642v3
- Date: Sun, 25 Feb 2024 09:13:18 GMT
- Title: BLADE: Box-Level Supervised Amodal Segmentation through Directed
Expansion
- Authors: Zhaochen Liu, Zhixuan Li, Tingting Jiang
- Abstract summary: Box-level supervised amodal segmentation addresses this challenge by relying solely on ground truth bounding boxes and instance classes as supervision.
We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks.
Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
- Score: 10.57956193654977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perceiving the complete shape of occluded objects is essential for human and
machine intelligence. While the amodal segmentation task is to predict the
complete mask of partially occluded objects, it is time-consuming and
labor-intensive to annotate the pixel-level ground truth amodal masks.
Box-level supervised amodal segmentation addresses this challenge by relying
solely on ground truth bounding boxes and instance classes as supervision,
thereby alleviating the need for exhaustive pixel-level annotations.
Nevertheless, current box-level methodologies encounter limitations in
generating low-resolution masks and imprecise boundaries, failing to meet the
demands of practical real-world applications. We present a novel solution to
tackle this problem by introducing a directed expansion approach from visible
masks to corresponding amodal masks. Our approach involves a hybrid end-to-end
network based on the overlapping region - the area where different instances
intersect. Diverse segmentation strategies are applied for overlapping regions
and non-overlapping regions according to distinct characteristics. To guide the
expansion of visible masks, we introduce an elaborately-designed connectivity
loss for overlapping regions, which leverages correlations with visible masks
and facilitates accurate amodal segmentation. Experiments are conducted on
several challenging datasets and the results show that our proposed method can
outperform existing state-of-the-art methods by large margins.
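The abstract does not spell out the form of the connectivity loss, only that it guides the expansion of visible masks across overlapping regions by leveraging their correlation with the visible mask. As a purely illustrative sketch (not the authors' implementation), one way to realize such a penalty is to weight predicted amodal foreground inside the overlap region by its distance, in dilation steps, from the visible mask, so that expansion attached to visible pixels costs less than disconnected blobs. The function and weighting scheme below are assumptions for illustration:

```python
import numpy as np

def _dilate(mask):
    """One step of 4-connected binary dilation (pure NumPy)."""
    p = np.pad(mask, 1)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def connectivity_loss(amodal_prob, visible_mask, overlap_mask):
    """Toy connectivity-style loss (illustrative sketch only).

    Weights predicted amodal foreground inside the overlap region by its
    distance (in dilation steps) from the visible mask, so predictions
    that grow outward from visible pixels are penalized less than
    foreground predicted far away from them.
    """
    visible = visible_mask.astype(bool)
    dist = np.zeros(amodal_prob.shape, dtype=float)
    frontier = visible.copy()
    remaining = ~frontier
    step = 0
    while remaining.any() and step < amodal_prob.size:
        step += 1
        grown = _dilate(frontier)
        dist[grown & remaining] = step   # record first step reaching pixel
        remaining &= ~grown
        frontier = grown
    weighted = amodal_prob * dist * overlap_mask
    return float(weighted.sum() / (overlap_mask.sum() + 1e-6))
```

Under this toy formulation, a predicted pixel adjacent to the visible mask contributes a distance weight of 1, while a detached pixel three dilation steps away contributes 3, so gradient pressure favors contiguous expansion.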
Related papers
- Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning [50.88504784466931]
Multi-task dense prediction involves semantic segmentation, depth estimation, and surface normal estimation.
Existing solutions typically rely on learning global image representations for global cross-task image matching.
Our proposal involves modeling region-wise representations using Gaussian Distributions.
arXiv Detail & Related papers (2024-03-15T12:41:30Z) - Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision [87.15580604023555]
Unpair-Seg is a novel weakly-supervised open-vocabulary segmentation framework.
It learns from unpaired image-mask and image-text pairs, which can be independently and efficiently collected.
It achieves 14.6% and 19.5% mIoU on the ADE-847 and PASCAL Context-459 datasets.
arXiv Detail & Related papers (2024-02-14T06:01:44Z) - Generalizable Entity Grounding via Assistance of Large Language Model [77.07759442298666]
We propose a novel approach to densely ground visual entities from a long caption.
We leverage a large multimodal model to extract semantic nouns, a class-agnostic segmentation model to generate entity-level segmentation, and a multi-modal feature fusion module to associate each semantic noun with its corresponding segmentation mask.
arXiv Detail & Related papers (2024-02-04T16:06:05Z) - Mask2Anomaly: Mask Transformer for Universal Open-set Segmentation [29.43462426812185]
We propose a paradigm change by shifting from a per-pixel classification to a mask classification.
Our mask-based method, Mask2Anomaly, demonstrates the feasibility of integrating a mask-classification architecture.
By comprehensive quantitative and qualitative evaluation, we show Mask2Anomaly achieves new state-of-the-art results.
arXiv Detail & Related papers (2023-09-08T20:07:18Z) - Exploiting Shape Cues for Weakly Supervised Semantic Segmentation [15.791415215216029]
Weakly supervised semantic segmentation (WSSS) aims to produce pixel-wise class predictions with only image-level labels for training.
We propose to exploit shape information to supplement the texture-biased property of convolutional neural networks (CNNs).
We further refine the predictions in an online fashion with a novel refinement method that takes into account both the class and the color affinities.
arXiv Detail & Related papers (2022-08-08T17:25:31Z) - Perceiving the Invisible: Proposal-Free Amodal Panoptic Segmentation [13.23676270963484]
Amodal panoptic segmentation aims to connect the perception of the world to its cognitive understanding.
We formulate a proposal-free framework that tackles this task as a multi-label and multi-class problem.
We propose a network architecture that incorporates a shared backbone and an asymmetrical dual-decoder.
arXiv Detail & Related papers (2022-05-29T12:05:07Z) - Semantic Attention and Scale Complementary Network for Instance
Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of interest instances on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z) - The Devil is in the Boundary: Exploiting Boundary Representation for
Basis-based Instance Segmentation [85.153426159438]
We propose Basis-based Instance segmentation (B2Inst) to learn a global boundary representation that can complement existing global-mask-based methods.
Our B2Inst leads to consistent improvements and accurately parses out the instance boundaries in a scene.
arXiv Detail & Related papers (2020-11-26T11:26:06Z) - Self-Supervised Scene De-occlusion [186.89979151728636]
This paper investigates the problem of scene de-occlusion, which aims to recover the underlying occlusion ordering and complete the invisible parts of occluded objects.
We make the first attempt to address the problem through a novel and unified framework that recovers hidden scene structures without using ordering or amodal annotations as supervision.
Based on PCNet-M and PCNet-C, we devise a novel inference scheme to accomplish scene de-occlusion, via progressive ordering recovery, amodal completion and content completion.
arXiv Detail & Related papers (2020-04-06T16:31:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.