GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation
in the Wild
- URL: http://arxiv.org/abs/2007.09073v1
- Date: Fri, 17 Jul 2020 15:53:40 GMT
- Title: GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation
in the Wild
- Authors: Umberto Michieli, Edoardo Borsato, Luca Rossi, Pietro Zanuttigh
- Abstract summary: We propose a framework combining higher object-level context conditioning and part-level spatial relationships to address the task.
To tackle object-level ambiguity, a class-conditioning module is introduced to retain class-level semantics.
We also propose a novel adjacency graph-based module that aims at matching the relative spatial relationships between ground truth and predicted parts.
- Score: 23.29789882934198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The semantic segmentation of parts of objects in the wild is a challenging
task in which multiple instances of objects and multiple parts within those
objects must be detected in the scene. Despite its fundamental importance for
detailed object understanding, this problem remains only marginally explored.
In this work, we propose a novel framework combining higher
object-level context conditioning and part-level spatial relationships to
address the task. To tackle object-level ambiguity, a class-conditioning module
is introduced to retain class-level semantics when learning part-level
semantics. In this way, mid-level features also carry this information prior to
the decoding stage. To tackle part-level ambiguity and localization, we propose
a novel adjacency graph-based module that aims at matching the relative spatial
relationships between ground truth and predicted parts. The experimental
evaluation on the Pascal-Part dataset shows that we achieve state-of-the-art
results on this task.
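The adjacency-graph matching idea can be illustrated with a minimal sketch (an assumption-laden toy, not the authors' implementation): build a normalized adjacency matrix over binary part masks, where each entry measures shared-boundary contact between two parts, and penalize the discrepancy between the ground-truth and predicted graphs.

```python
import numpy as np

def part_adjacency(masks):
    """Normalized adjacency matrix over part masks.

    masks: (P, H, W) boolean array, one binary mask per part.
    Two parts are considered adjacent in proportion to how much the
    1-pixel dilation of one overlaps the other (a rough proxy for
    shared boundary length).
    """
    P = masks.shape[0]
    # 1-pixel dilation via shifts in the four cardinal directions
    dil = masks.copy()
    dil[:, 1:, :] |= masks[:, :-1, :]
    dil[:, :-1, :] |= masks[:, 1:, :]
    dil[:, :, 1:] |= masks[:, :, :-1]
    dil[:, :, :-1] |= masks[:, :, 1:]
    A = np.zeros((P, P))
    for i in range(P):
        for j in range(P):
            if i != j:
                A[i, j] = np.logical_and(dil[i], masks[j]).sum()
    total = A.sum()
    return A / total if total > 0 else A

def graph_matching_loss(gt_masks, pred_masks):
    """L1 distance between ground-truth and predicted adjacency graphs."""
    return np.abs(part_adjacency(gt_masks) - part_adjacency(pred_masks)).sum()
```

For example, if the ground truth has two parts sharing a border but the prediction places them apart, the graphs disagree and the loss is positive; identical layouts give zero loss.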
Related papers
- A Bottom-Up Approach to Class-Agnostic Image Segmentation [4.086366531569003]
We present a novel bottom-up formulation for addressing the class-agnostic segmentation problem.
We supervise our network directly on the projective sphere of its feature space.
Our bottom-up formulation exhibits exceptional generalization capability, even when trained on datasets designed for class-based segmentation.
arXiv Detail & Related papers (2024-09-20T17:56:02Z)
- From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse.
We also introduce an efficient hierarchical segmentation model MiniSeg, that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
- Position-Aware Contrastive Alignment for Referring Image Segmentation [65.16214741785633]
We present a position-aware contrastive alignment network (PCAN) to enhance the alignment of multi-modal features.
Our PCAN consists of two modules: 1) Position Aware Module (PAM), which provides position information of all objects related to natural language descriptions, and 2) Contrastive Language Understanding Module (CLUM), which enhances multi-modal alignment.
arXiv Detail & Related papers (2022-12-27T09:13:19Z)
- Self-Supervised Learning of Object Parts for Semantic Segmentation [7.99536002595393]
We argue that self-supervised learning of object parts is a solution to this issue.
Our method surpasses the state-of-the-art on three semantic segmentation benchmarks by margins of 3% to 17%.
arXiv Detail & Related papers (2022-04-27T17:55:17Z)
- TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Hierarchical Pyramid Representations for Semantic Segmentation [0.0]
We learn the structures of objects and the hierarchy among objects because context is based on these intrinsic properties.
In this study, we design novel hierarchical, contextual, and multiscale pyramidal representations to capture the properties from an input image.
Our proposed method achieves state-of-the-art performance in PASCAL Context.
arXiv Detail & Related papers (2021-04-05T06:39:12Z)
- ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation [47.7867284770227]
Text-based video segmentation is a challenging task that segments out the natural language referred objects in videos.
We introduce a novel top-down approach that imitates how humans segment an object under language guidance.
Our method outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-03-19T09:31:08Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the low and high frequency of the image, respectively.
We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries.
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and inter-modular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
- Panoptic Feature Fusion Net: A Novel Instance Segmentation Paradigm for Biomedical and Biological Images [91.41909587856104]
We present a Panoptic Feature Fusion Net (PFFNet) that unifies the semantic and instance features.
Our proposed PFFNet contains a residual attention feature fusion mechanism to incorporate the instance prediction with the semantic features.
It outperforms several state-of-the-art methods on various biomedical and biological datasets.
arXiv Detail & Related papers (2020-02-15T09:19:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.