Hierarchical Pyramid Representations for Semantic Segmentation
- URL: http://arxiv.org/abs/2104.01792v1
- Date: Mon, 5 Apr 2021 06:39:12 GMT
- Title: Hierarchical Pyramid Representations for Semantic Segmentation
- Authors: Hiroaki Aizawa, Yukihiro Domae, Kunihito Kato
- Abstract summary: We learn the structures of objects and the hierarchy among objects because context is based on these intrinsic properties.
In this study, we design novel hierarchical, contextual, and multiscale pyramidal representations to capture the properties from an input image.
Our proposed method achieves state-of-the-art performance in PASCAL Context.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the context of complex and cluttered scenes is a challenging
problem for semantic segmentation. However, it is difficult to model the
context without prior and additional supervision because the scene's factors,
such as the scale, shape, and appearance of objects, vary considerably in these
scenes. To solve this, we propose to learn the structures of objects and the
hierarchy among objects because context is based on these intrinsic properties.
In this study, we design novel hierarchical, contextual, and multiscale
pyramidal representations to capture the properties from an input image. Our
key idea is the recursive segmentation in different hierarchical regions based
on a predefined number of regions and the aggregation of the context in these
regions. The aggregated contexts are used to predict the contextual
relationship between the regions and partition the regions in the following
hierarchical level. Finally, by constructing the pyramid representations from
the recursively aggregated context, multiscale and hierarchical properties are
attained. In the experiments, we confirmed that our proposed method achieves
state-of-the-art performance in PASCAL Context.
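The recursive partition-and-aggregate idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes NumPy is available, and the function name `build_pyramid`, its parameters, and the random projection that stands in for a learned region predictor are all hypothetical choices made for illustration. It only shows how soft region masks can be split recursively into a predefined number of sub-regions, with per-region context aggregated at each level into a pyramid.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilise the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def build_pyramid(features, num_regions=2, num_levels=3, seed=0):
    """Toy recursive soft partitioning with per-region context aggregation.

    features: (N, C) array of per-pixel features (N = H * W).
    Returns (pyramid, masks): one (N, C) context map per level, plus the
    soft region masks produced by the last split.
    """
    rng = np.random.default_rng(seed)
    n, c = features.shape
    # Level 0: a single region covering the whole image.
    masks = np.ones((n, 1))
    pyramid = []
    for _ in range(num_levels):
        # Aggregate context: mask-weighted mean feature of each region, (R, C).
        weights = masks / (masks.sum(axis=0, keepdims=True) + 1e-8)
        contexts = weights.T @ features
        # Broadcast each region's context back to its pixels, (N, C).
        pyramid.append(masks @ contexts)
        # Hypothetical splitter: a random projection stands in for a learned
        # layer that would predict sub-region logits from pixel and context.
        proj = rng.standard_normal((2 * c, num_regions))
        child_masks = []
        for r in range(masks.shape[1]):
            joint = np.concatenate(
                [features, np.broadcast_to(contexts[r], (n, c))], axis=1
            )
            split = softmax(joint @ proj, axis=1)  # (N, K), rows sum to 1
            # A child's mask is its parent's mask times the split probability.
            child_masks.append(masks[:, r:r + 1] * split)
        masks = np.concatenate(child_masks, axis=1)  # (N, R * K)
    return pyramid, masks


# Usage: a flattened 4x4 feature map with 4 channels (random stand-in data).
feats = np.random.default_rng(1).standard_normal((16, 4))
pyramid, masks = build_pyramid(feats)
stacked = np.concatenate(pyramid, axis=1)  # per-pixel multiscale descriptor
```

In this sketch each pixel's soft memberships always sum to one, so deeper levels refine the coarser ones rather than replace them; concatenating the per-level context maps gives a coarse-to-fine descriptor per pixel.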
Related papers
- SPIN: Hierarchical Segmentation with Subpart Granularity in Natural Images [17.98848062686217]
We introduce the first hierarchical semantic segmentation dataset with subpart annotations for natural images.
We also introduce two novel evaluation metrics to evaluate how well algorithms capture spatial and semantic relationships across hierarchical levels.
arXiv Detail & Related papers (2024-07-12T21:08:00Z)
- Flattening the Parent Bias: Hierarchical Semantic Segmentation in the Poincaré Ball [39.76366192826905]
We show that a flat (non-hierarchical) segmentation network, in which the parents are inferred from the children, has superior segmentation accuracy to the hierarchical approach across the board.
We also study a more principled approach to hierarchical segmentation using the Poincaré ball model.
arXiv Detail & Related papers (2024-04-04T19:50:57Z)
- From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse.
We also introduce an efficient hierarchical segmentation model, MiniSeg, that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
- Neural Constraint Satisfaction: Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement [75.9289887536165]
We present a hierarchical abstraction approach to uncover underlying entities.
We show how to learn a correspondence between intervening on states of entities in the agent's model and acting on objects in the environment.
We use this correspondence to develop a method for control that generalizes to different numbers and configurations of objects.
arXiv Detail & Related papers (2023-03-20T18:19:36Z)
- Framework-agnostic Semantically-aware Global Reasoning for Segmentation [29.69187816377079]
We propose a component that learns to project image features into latent representations and reason between them.
Our design encourages the latent regions to represent semantic concepts by ensuring that the activated regions are spatially disjoint.
Our latent tokens are semantically interpretable and diverse and provide a rich set of features that can be transferred to downstream tasks.
arXiv Detail & Related papers (2022-12-06T21:42:05Z)
- Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization [98.46318529630109]
We take inspiration from traditional spectral segmentation methods by reframing image decomposition as a graph partitioning problem.
We find that these eigenvectors already decompose an image into meaningful segments, and can be readily used to localize objects in a scene.
By clustering the features associated with these segments across a dataset, we can obtain well-delineated, nameable regions.
arXiv Detail & Related papers (2022-05-16T17:47:44Z)
- Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning [92.07643510310766]
Temporal grounding in videos aims to localize one target video segment that semantically corresponds to a given query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We empirically find that existing state-of-the-art methods fail to generalize to queries with novel combinations of seen words.
We propose a variational cross-graph reasoning framework that explicitly decomposes video and language into multiple structured hierarchies.
arXiv Detail & Related papers (2022-03-24T12:55:23Z)
- Exploring Set Similarity for Dense Self-supervised Representation Learning [96.35286140203407]
We propose to explore set similarity (SetSim) for dense self-supervised representation learning.
We generalize pixel-wise similarity learning to a set-wise one to improve robustness, because sets contain more semantic and structural information.
Specifically, by resorting to attentional features of views, we establish corresponding sets, thus filtering out noisy backgrounds that may cause incorrect correspondences.
arXiv Detail & Related papers (2021-07-19T09:38:27Z)
- PhraseCut: Language-based Image Segmentation in the Wild [62.643450401286]
We consider the problem of segmenting image regions given a natural language phrase.
Our dataset is collected on top of the Visual Genome dataset.
Our experiments show that the scale and diversity of concepts in our dataset pose significant challenges to the existing state-of-the-art.
arXiv Detail & Related papers (2020-08-03T20:58:53Z)
- GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild [23.29789882934198]
We propose a framework combining higher object-level context conditioning and part-level spatial relationships to address the task.
To tackle object-level ambiguity, a class-conditioning module is introduced to retain class-level semantics.
We also propose a novel adjacency graph-based module that aims at matching the relative spatial relationships between ground truth and predicted parts.
arXiv Detail & Related papers (2020-07-17T15:53:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.