JPPF: Multi-task Fusion for Consistent Panoptic-Part Segmentation
- URL: http://arxiv.org/abs/2311.18618v1
- Date: Thu, 30 Nov 2023 15:17:46 GMT
- Title: JPPF: Multi-task Fusion for Consistent Panoptic-Part Segmentation
- Authors: Shishir Muralidhara, Sravan Kumar Jagadeesh, René Schuster, Didier Stricker
- Abstract summary: Part-aware panoptic segmentation is a computer vision problem that aims to provide a semantic understanding of the scene at multiple levels of granularity.
We present our Joint Panoptic Part Fusion (JPPF) that combines the three individual segmentations effectively to obtain a panoptic-part segmentation.
- Score: 12.19926973291957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Part-aware panoptic segmentation is a problem of computer vision that aims to
provide a semantic understanding of the scene at multiple levels of
granularity. More precisely, semantic areas, object instances, and semantic
parts are predicted simultaneously. In this paper, we present our Joint
Panoptic Part Fusion (JPPF) that combines the three individual segmentations
effectively to obtain a panoptic-part segmentation. Two aspects are of utmost
importance for this: First, a unified model for the three problems is desired
that allows for mutually improved and consistent representation learning.
Second, the combination must be balanced so that all individual results receive
equal importance during fusion. Our proposed JPPF is parameter-free and
dynamically balances its input. The method is evaluated and compared on the
Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets in
terms of PartPQ and Part-Whole Quality (PWQ). In extensive experiments, we
verify the importance of our fair fusion, highlight its most significant impact
for areas that can be further segmented into parts, and demonstrate the
generalization capabilities of our design without fine-tuning on 5 additional
datasets.
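The parameter-free, dynamically balanced fusion described in the abstract can be illustrated with a minimal sketch. The function below is a simplified stand-in, not the paper's exact operator: it assumes each head outputs per-pixel class probabilities over a shared label space, re-normalizes each head so no branch dominates, and averages them with equal weight before taking the per-pixel argmax.

```python
import numpy as np

def fuse_equal_weight(*prob_maps):
    """Fuse per-pixel class probabilities from several heads with equal weight.

    Each element of prob_maps is an (H, W, C) array of per-pixel class
    probabilities from one prediction head. Every head is re-normalized
    per pixel, then all heads are averaged so none can dominate the
    result, and the fused label map is the per-pixel argmax.
    """
    fused = np.zeros_like(np.asarray(prob_maps[0], dtype=np.float64))
    for p in prob_maps:
        p = np.asarray(p, dtype=np.float64)
        p = p / p.sum(axis=-1, keepdims=True)  # re-normalize per pixel
        fused += p / len(prob_maps)            # equal weight per head
    return fused.argmax(axis=-1)               # (H, W) label map
```

For example, if two heads favor one class and a third strongly favors another, the equal-weight average still lets the majority decide rather than the most confident single head.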
Related papers
- OV-PARTS: Towards Open-Vocabulary Part Segmentation [31.136262413989858]
Segmenting and recognizing diverse object parts is a crucial ability in applications spanning various computer vision and robotic tasks.
We propose an Open-Vocabulary Part (OV-PARTS) benchmark to investigate and tackle these challenges.
OV-PARTS includes refined versions of two publicly available datasets: Pascal-Part-116 and ADE20K-Part-234. It covers three specific tasks: Generalized Zero-Shot Part Segmentation, Cross-Dataset Part Segmentation, and Few-Shot Part Segmentation.
arXiv Detail & Related papers (2023-10-08T10:28:42Z) - Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation [66.15246197473897]
Multi-modality image fusion and segmentation play a vital role in autonomous driving and robotic operation.
We propose a Multi-interactive Feature learning architecture for image fusion and Segmentation.
arXiv Detail & Related papers (2023-08-04T01:03:58Z) - Multi-Grained Multimodal Interaction Network for Entity Linking [65.30260033700338]
Multimodal entity linking task aims at resolving ambiguous mentions to a multimodal knowledge graph.
We propose a novel Multi-GraIned Multimodal InteraCtion Network (MIMIC) framework for solving the MEL task.
arXiv Detail & Related papers (2023-07-19T02:11:19Z) - Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation [49.56131393810713]
We present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner.
Our method excels in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs.
arXiv Detail & Related papers (2023-06-08T22:55:32Z) - PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation [152.66484428389364]
Panoptic Part Segmentation (PPS) unifies panoptic and part segmentation into one task.
We design the first end-to-end unified framework, Panoptic-PartFormer.
Our models can serve as a strong baseline and aid future research in PPS.
arXiv Detail & Related papers (2023-01-03T05:30:56Z) - Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on three fine-grained object recognition benchmarks.
arXiv Detail & Related papers (2022-12-28T03:45:56Z) - Multi-task Fusion for Efficient Panoptic-Part Segmentation [12.650574326251023]
We introduce a novel network that generates semantic, instance, and part segmentation using a shared encoder.
To fuse the predictions of all three heads efficiently, we introduce a parameter-free joint fusion module.
Our method is evaluated on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets.
arXiv Detail & Related papers (2022-12-15T09:04:45Z) - Panoptic-PartFormer: Learning a Unified Model for Panoptic Part Segmentation [76.9420522112248]
Panoptic Part Segmentation (PPS) aims to unify panoptic segmentation and part segmentation into one task.
We design the first end-to-end unified method named Panoptic-PartFormer.
Our Panoptic-PartFormer achieves the new state-of-the-art results on both Cityscapes PPS and Pascal Context PPS datasets.
arXiv Detail & Related papers (2022-04-10T11:16:45Z) - Part-aware Panoptic Segmentation [3.342126234995932]
Part-aware Panoptic Segmentation (PPS) aims to understand a scene at multiple levels of abstraction.
We provide consistent annotations on two commonly used datasets: Cityscapes and Pascal VOC.
We present a single metric to evaluate PPS, called Part-aware Panoptic Quality (PartPQ).
arXiv Detail & Related papers (2021-06-11T12:48:07Z)
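The PartPQ metric referenced above extends the standard Panoptic Quality (PQ) by using part-level IoU for classes that decompose into parts. As a rough sketch of the underlying PQ computation for a single class (the part-aware extension is omitted here), assuming that matched prediction/ground-truth pairs with IoU above 0.5 count as true positives:

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Compute PQ for one class from pre-matched segments.

    matched_ious: IoU of each matched (prediction, ground truth) pair
    with IoU > 0.5, i.e. the true positives.
    num_fp / num_fn: counts of unmatched predictions / ground truths.
    PQ = sum(IoU over TP) / (|TP| + 0.5 * |FP| + 0.5 * |FN|).
    """
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        return 0.0  # no segments of this class at all
    return sum(matched_ious) / denom
```

In the part-aware variant, the IoU term for classes with parts is replaced by a part-level segmentation quality, so that errors inside an object's parts lower the score even when the object itself is matched correctly.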
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.