Perceiving the Invisible: Proposal-Free Amodal Panoptic Segmentation
- URL: http://arxiv.org/abs/2205.14637v1
- Date: Sun, 29 May 2022 12:05:07 GMT
- Title: Perceiving the Invisible: Proposal-Free Amodal Panoptic Segmentation
- Authors: Rohit Mohan and Abhinav Valada
- Abstract summary: Amodal panoptic segmentation aims to connect the perception of the world to its cognitive understanding.
We formulate a proposal-free framework that tackles this task as a multi-label and multi-class problem.
We propose a network architecture that incorporates a shared backbone and an asymmetrical dual-decoder.
- Score: 13.23676270963484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Amodal panoptic segmentation aims to connect the perception of the world to
its cognitive understanding. It entails simultaneously predicting the semantic
labels of visible scene regions and the entire shape of traffic participant
instances, including regions that may be occluded. In this work, we formulate a
proposal-free framework that tackles this task as a multi-label and multi-class
problem by first assigning the amodal masks to different layers according to
their relative occlusion order and then employing amodal instance regression on
each layer independently while learning background semantics. We propose a
network architecture that incorporates a shared backbone and an asymmetrical
dual-decoder consisting of several modules to facilitate within-scale and
cross-scale feature aggregations, bilateral feature propagation between
decoders, and integration of global instance-level and local pixel-level
occlusion reasoning. Further, we propose the amodal mask refiner that resolves
the ambiguity in complex occlusion scenarios by explicitly leveraging the
embedding of unoccluded instance masks. Extensive evaluations on the BDD100K-APS
and KITTI-360-APS datasets demonstrate that our approach sets the new
state of the art on both benchmarks.
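The layering step described in the abstract can be sketched in code. The helper below is a hypothetical illustration, not the authors' implementation: the function name, the mask representation, and the exact rule (an instance occluded only by already-assigned instances joins the next layer) are all assumptions made for clarity.

```python
import numpy as np

def assign_occlusion_layers(amodal_masks, visible_masks):
    """Group instances into layers by relative occlusion order.

    An instance whose amodal mask is fully visible gets layer 0; an
    instance occluded only by layer-0 instances gets layer 1, and so on.
    Masks are boolean arrays of identical shape. Instances caught in a
    mutual-occlusion cycle keep layer -1 in this simplified sketch.
    """
    n = len(amodal_masks)
    occluders = [set() for _ in range(n)]
    for i in range(n):
        # Pixels of i's amodal extent that are hidden from view.
        hidden = amodal_masks[i] & ~visible_masks[i]
        for j in range(n):
            if i != j and (hidden & visible_masks[j]).any():
                occluders[i].add(j)  # j covers part of i's amodal extent

    layers = [-1] * n
    assigned = set()
    level = 0
    current = {i for i in range(n) if not occluders[i]}
    while current:
        for i in current:
            layers[i] = level
        assigned |= current
        level += 1
        current = {i for i in range(n)
                   if layers[i] < 0 and occluders[i] <= assigned}
    return layers
```

Once every instance carries a layer index, amodal instance regression can be run on each layer independently, since masks within a layer no longer occlude one another.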
Related papers
- N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields [112.02885337510716]
Nested Neural Feature Fields (N2F2) is a novel approach that employs hierarchical supervision to learn a single feature field.
We leverage a 2D class-agnostic segmentation model to provide semantically meaningful pixel groupings at arbitrary scales in the image space.
Our approach outperforms the state-of-the-art feature field distillation methods on tasks such as open-vocabulary 3D segmentation and localization.
arXiv: 2024-03-16
- Generalizable Entity Grounding via Assistance of Large Language Model [77.07759442298666]
We propose a novel approach to densely ground visual entities from a long caption.
We leverage a large multimodal model to extract semantic nouns, a class-agnostic segmentation model to generate entity-level segmentation, and a multimodal feature fusion module to associate each semantic noun with its corresponding segmentation mask.
arXiv: 2024-02-04
- BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion [10.57956193654977]
Box-level supervised amodal segmentation addresses the cost of dense amodal annotation by relying solely on ground-truth bounding boxes and instance classes as supervision.
We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks.
Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
arXiv: 2024-01-03
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach that mines cross-modal semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv: 2023-05-17
- Amodal Intra-class Instance Segmentation: Synthetic Datasets and Benchmark [17.6780586288079]
This paper introduces two new amodal datasets for image amodal completion tasks.
We also present a point-supervised scheme with layer priors for amodal instance segmentation.
Experiments show that our weakly supervised approach outperforms state-of-the-art fully supervised methods.
arXiv: 2023-03-12
- Beyond the Prototype: Divide-and-conquer Proxies for Few-shot Segmentation [63.910211095033596]
Few-shot segmentation aims to segment unseen-class objects given only a handful of densely labeled samples.
We propose a simple yet versatile framework in the spirit of divide-and-conquer.
Our proposed approach, named divide-and-conquer proxies (DCP), develops appropriate and reliable guidance for the segmentation process.
arXiv: 2022-04-21
- Amodal Panoptic Segmentation [13.23676270963484]
We formulate and propose a novel task that we name amodal panoptic segmentation.
The goal of this task is to simultaneously predict the pixel-wise semantic segmentation labels of the visible regions of stuff classes and the amodal instance masks of thing classes, including their occluded regions.
We propose the novel amodal panoptic segmentation network (APSNet) as a first step towards addressing this task.
arXiv: 2022-02-23
- Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv: 2021-07-25
- The Devil is in the Boundary: Exploiting Boundary Representation for Basis-based Instance Segmentation [85.153426159438]
We propose Basis-based Instance segmentation (B2Inst) to learn a global boundary representation that can complement existing global-mask-based methods.
B2Inst yields consistent improvements and accurately parses out instance boundaries in a scene.
arXiv: 2020-11-26
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.