Amodal Panoptic Segmentation
- URL: http://arxiv.org/abs/2202.11542v1
- Date: Wed, 23 Feb 2022 14:41:59 GMT
- Title: Amodal Panoptic Segmentation
- Authors: Rohit Mohan, Abhinav Valada
- Abstract summary: We formulate and propose a novel task that we name amodal panoptic segmentation.
The goal of this task is to simultaneously predict the pixel-wise semantic segmentation labels of the visible regions of stuff classes and the instance segmentation labels of both the visible and occluded regions of thing classes.
We propose the novel amodal panoptic segmentation network (APSNet) as a first step towards addressing this task.
- Score: 13.23676270963484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans have the remarkable ability to perceive objects as a whole, even when
parts of them are occluded. This ability of amodal perception forms the basis
of our perceptual and cognitive understanding of our world. To enable robots to
reason with this capability, we formulate and propose a novel task that we name
amodal panoptic segmentation. The goal of this task is to simultaneously
predict the pixel-wise semantic segmentation labels of the visible regions of
stuff classes and the instance segmentation labels of both the visible and
occluded regions of thing classes. To facilitate research on this new task, we
extend two established benchmark datasets with pixel-level amodal panoptic
segmentation labels that we make publicly available as KITTI-360-APS and
BDD100K-APS. We present several strong baselines, along with the amodal
panoptic quality (APQ) and amodal parsing coverage (APC) metrics to quantify
the performance in an interpretable manner. Furthermore, we propose the novel
amodal panoptic segmentation network (APSNet), as a first step towards
addressing this task by explicitly modeling the complex relationships between
the occluders and occludees. Extensive experimental evaluations demonstrate that
APSNet achieves state-of-the-art performance on both benchmarks and more
importantly exemplifies the utility of amodal recognition. The benchmarks are
available at http://amodal-panoptic.cs.uni-freiburg.de.
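The task definition above implies a distinctive output structure: stuff classes get a single visible-region semantic map, while thing instances get amodal masks that include occluded pixels, so masks of different instances may overlap. A minimal sketch of this structure (class names and the pixel-set encoding are illustrative assumptions, not the actual KITTI-360-APS/BDD100K-APS format):

```python
# Toy 4x6 scene encoded as pixel coordinate sets.
H, W = 4, 6

# Visible-region semantic labels for stuff classes ("sky" on the top row,
# "road" elsewhere) -- one label per pixel, no overlap.
stuff_semantics = {(r, c): ("sky" if r == 0 else "road")
                   for r in range(H) for c in range(W)}

# Amodal instance masks for thing classes cover visible AND occluded
# pixels, so masks of different instances may overlap one another.
car = {(r, c) for r in range(2, 4) for c in range(1, 5)}         # full amodal extent of a car
pedestrian = {(r, c) for r in range(1, 4) for c in range(3, 5)}  # partially occludes the car

# The mask overlap is exactly the occluded portion of the car.
occluded_car = car & pedestrian
print(len(occluded_car))  # -> 4 occluded car pixels
```

In an ordinary panoptic label the four overlapping pixels would belong to the pedestrian only; the amodal formulation keeps them in both masks, which is what APSNet must learn to predict.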
Related papers
- SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition [71.90536979421093]
We propose a Split-and-Synthesize Prompting with Gated Alignments (SSPA) framework to amplify the potential of Vision-Language Models (VLMs).
We develop an in-context learning approach to draw on the inherent knowledge of LLMs.
Then we propose a novel Split-and-Synthesize Prompting (SSP) strategy to first model the generic knowledge and downstream label semantics individually.
arXiv Detail & Related papers (2024-07-30T15:58:25Z) - Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model [61.389233691596004]
We introduce the DiffPNG framework, which capitalizes on the diffusion architecture for segmentation by decomposing the process into a sequence of localization, segmentation, and refinement steps.
Our experiments on the PNG dataset demonstrate that DiffPNG achieves strong performance in the zero-shot PNG task setting.
arXiv Detail & Related papers (2024-07-07T13:06:34Z) - Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
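The alternation that the FEC summary describes can be sketched as a k-means-style loop over pixel features: group features around current representatives, then update each representative from its group. This is a simplified stand-in under assumed details (Euclidean distance, mean updates), not FEC's actual implementation:

```python
import random

def cluster_features(features, k=2, iters=5, seed=0):
    """Alternate between (a) grouping feature vectors around current
    representatives and (b) updating each representative to the mean of
    its group -- a simplified sketch of FEC's two alternating steps."""
    rng = random.Random(seed)
    reps = rng.sample(features, k)  # initial representatives
    for _ in range(iters):
        # (a) assign every pixel feature to its nearest representative
        groups = [[] for _ in range(k)]
        for f in features:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(f, reps[i])))
            groups[j].append(f)
        # (b) update representatives from the grouped features
        for j, g in enumerate(groups):
            if g:
                reps[j] = tuple(sum(vals) / len(g) for vals in zip(*g))
    return reps

# Two well-separated pairs of 2-D "pixel features"; the representatives
# settle near the two group centres.
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
reps = cluster_features(feats, k=2)
print(sorted(reps))
```

In FEC itself the second step updates the deep features of the pixels rather than just the centroids, but the grouping/updating rhythm is the same.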
arXiv Detail & Related papers (2024-03-26T06:04:50Z) - Lidar Panoptic Segmentation and Tracking without Bells and Whistles [48.078270195629415]
We propose a detection-centric network for lidar segmentation and tracking.
One of the core components of our network is the object instance detection branch.
We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models.
arXiv Detail & Related papers (2023-10-19T04:44:43Z) - Panoptic Out-of-Distribution Segmentation [11.388678390784195]
We propose Panoptic Out-of-Distribution Segmentation for joint pixel-level semantic in-distribution and out-of-distribution classification with instance prediction.
We make the dataset, code, and trained models publicly available at http://pods.cs.uni-freiburg.de.
arXiv Detail & Related papers (2023-10-18T08:38:31Z) - Few-Shot Panoptic Segmentation With Foundation Models [23.231014713335664]
We propose to leverage task-agnostic image features to enable few-shot panoptic segmentation by presenting Segmenting Panoptic Information with Nearly 0 labels (SPINO).
In detail, our method combines a DINOv2 backbone with lightweight network heads for semantic segmentation and boundary estimation.
We show that our approach, albeit being trained with only ten annotated images, predicts high-quality pseudo-labels that can be used with any existing panoptic segmentation method.
arXiv Detail & Related papers (2023-09-19T16:09:01Z) - Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We train a more effective cross-modal model that adaptively incorporates key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z) - Perceiving the Invisible: Proposal-Free Amodal Panoptic Segmentation [13.23676270963484]
Amodal panoptic segmentation aims to connect the perception of the world to its cognitive understanding.
We formulate a proposal-free framework that tackles this task as a multi-label and multi-class problem.
We propose a network architecture that incorporates a shared backbone and an asymmetrical dual-decoder.
arXiv Detail & Related papers (2022-05-29T12:05:07Z) - Exemplar-Based Open-Set Panoptic Segmentation Network [79.99748041746592]
We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation (OPS) task.
We investigate the practical challenges of the task and construct a benchmark on top of an existing dataset, COCO.
We propose a novel exemplar-based open-set panoptic segmentation network (EOPSN) inspired by exemplar theory.
arXiv Detail & Related papers (2021-05-18T07:59:21Z) - MOPT: Multi-Object Panoptic Tracking [33.77171216778909]
We introduce a novel perception task denoted as multi-object panoptic tracking (MOPT).
MOPT allows for exploiting pixel-level semantic information of 'thing' and 'stuff' classes, temporal coherence, and pixel-level associations over time.
We present extensive quantitative and qualitative evaluations of both vision-based and LiDAR-based MOPT that demonstrate encouraging results.
arXiv Detail & Related papers (2020-04-17T11:45:28Z) - EfficientPS: Efficient Panoptic Segmentation [13.23676270963484]
We introduce the Efficient Panoptic (EfficientPS) architecture that efficiently encodes and fuses semantically rich multi-scale features.
We incorporate a semantic head that aggregates fine and contextual features coherently and a new variant of Mask R-CNN as the instance head.
We also introduce the KITTI panoptic segmentation dataset that contains panoptic annotations for the popular and challenging KITTI benchmark.
arXiv Detail & Related papers (2020-04-05T20:15:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.