Related papers: Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

URL: http://arxiv.org/abs/2003.10211v1
Date: Mon, 23 Mar 2020 12:28:07 GMT
Title: Spatial Pyramid Based Graph Reasoning for Semantic Segmentation
Authors: Xia Li, Yibo Yang, Qijie Zhao, Tiancheng Shen, Zhouchen Lin, Hong Liu
Abstract summary: We apply graph convolution into the semantic segmentation task and propose an improved Laplacian. The graph reasoning is directly performed in the original feature space organized as a spatial pyramid. We achieve comparable performance with advantages in computational and memory overhead.
Score: 67.47159595239798
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The convolution operation suffers from a limited receptive filed, while global modeling is fundamental to dense prediction tasks, such as semantic segmentation. In this paper, we apply graph convolution into the semantic segmentation task and propose an improved Laplacian. The graph reasoning is directly performed in the original feature space organized as a spatial pyramid. Different from existing methods, our Laplacian is data-dependent and we introduce an attention diagonal matrix to learn a better distance metric. It gets rid of projecting and re-projecting processes, which makes our proposed method a light-weight module that can be easily plugged into current computer vision architectures. More importantly, performing graph reasoning directly in the feature space retains spatial relationships and makes spatial pyramid possible to explore multiple long-range contextual patterns from different scales. Experiments on Cityscapes, COCO Stuff, PASCAL Context and PASCAL VOC demonstrate the effectiveness of our proposed methods on semantic segmentation. We achieve comparable performance with advantages in computational and memory overhead.

Related papers

Open-Vocabulary Octree-Graph for 3D Scene Understanding [54.11828083068082]
Octree-Graph is a novel scene representation for open-vocabulary 3D scene understanding. An adaptive-octree structure is developed that stores semantics and depicts the occupancy of an object adjustably according to its shape.
arXiv Detail & Related papers (2024-11-25T10:14:10Z)
Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning. We introduce a novel approach to disentangled representation learning based on quadratic optimal transport. We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z)
Improving embedding of graphs with missing data by soft manifolds [51.425411400683565]
The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure. We introduce a new class of manifold, named soft manifold, that can solve this situation. Using soft manifold for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets.
arXiv Detail & Related papers (2023-11-29T12:48:33Z)
AMES: A Differentiable Embedding Space Selection Framework for Latent Graph Inference [6.115315198322837]
We introduce the Attentional Multi-Embedding Selection (AMES) framework, a differentiable method for selecting the best embedding space for latent graph inference. Our framework consistently achieves comparable or superior results compared to previous methods for latent graph inference.
arXiv Detail & Related papers (2023-11-20T16:24:23Z)
Edge-aware Plug-and-play Scheme for Semantic Segmentation [4.297988192695948]
The proposed method can be seamlessly integrated into any state-of-the-art (SOTA) models with zero modification. The experimental results indicate that the proposed method can be seamlessly integrated into any state-of-the-art (SOTA) models with zero modification.
arXiv Detail & Related papers (2023-03-18T02:17:37Z)
BI-GCN: Boundary-Aware Input-Dependent Graph Convolution Network for Biomedical Image Segmentation [21.912509900254364]
We apply graph convolution into the segmentation task and propose an improved textitLaplacian. Our method outperforms the state-of-the-art approaches on the segmentation of polyps in colonoscopy images and of the optic disc and optic cup in colour fundus images.
arXiv Detail & Related papers (2021-10-27T21:12:27Z)
BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes. Inspired by dilation and erosion from morphological image processing techniques, we treat the pixel level segmentation problems as squeezing object boundary. Our method yields large gains on COCO, Cityscapes, for both instance and semantic segmentation and outperforms previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
Graph Networks with Spectral Message Passing [1.0742675209112622]
We introduce the Spectral Graph Network, which applies message passing to both the spatial and spectral domains. Our results show that the Spectral GN promotes efficient training, reaching high performance with fewer training iterations despite having more parameters.
arXiv Detail & Related papers (2020-12-31T21:33:17Z)
Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning. Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector. We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.