Residual Graph Convolutional Network for Bird's-Eye-View Semantic
Segmentation
- URL: http://arxiv.org/abs/2312.04044v1
- Date: Thu, 7 Dec 2023 05:04:41 GMT
- Title: Residual Graph Convolutional Network for Bird's-Eye-View Semantic
Segmentation
- Authors: Qiuxiao Chen and Xiaojun Qi
- Abstract summary: We propose to incorporate a novel Residual Graph Convolutional (RGC) module in deep CNNs.
The RGC module efficiently projects the complete Bird's-Eye-View (BEV) information into graph space.
The RGC network outperforms four state-of-the-art networks and its four variants in terms of IoU and mIoU.
- Score: 3.8073142980733
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Retrieving spatial information and understanding the semantic information of
the surroundings are important for Bird's-Eye-View (BEV) semantic segmentation.
In the application of autonomous driving, autonomous vehicles need to be aware
of their surroundings to drive safely. However, current BEV semantic
segmentation techniques, deep Convolutional Neural Networks (CNNs) and
transformers, have difficulties in obtaining the global semantic relationships
of the surroundings at the early layers of the network. In this paper, we
propose to incorporate a novel Residual Graph Convolutional (RGC) module in
deep CNNs to acquire both the global information and the region-level semantic
relationship in the multi-view image domain. Specifically, the RGC module
employs a non-overlapping graph space projection to efficiently project the
complete BEV information into graph space. It then builds interconnected
spatial and channel graphs to extract spatial information between each node and
channel information within each node (i.e., extract contextual relationships of
the global features). Furthermore, it uses a downsample residual process to
enhance the coordinate feature reuse to maintain the global information. The
segmentation data augmentation and alignment module helps to simultaneously
augment and align BEV features and ground truth to geometrically preserve their
alignment to achieve better segmentation results. Our experimental results on
the nuScenes benchmark dataset demonstrate that the RGC network outperforms
four state-of-the-art networks and its four variants in terms of IoU and mIoU.
The proposed RGC network achieves a 3.1% higher mIoU than the best
state-of-the-art network, BEVFusion. Code and models will be released.
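To make the ideas in the abstract more concrete (non-overlapping graph-space projection, interconnected spatial and channel graphs, and a downsample residual path), the following is a minimal PyTorch sketch. It is an illustration under assumptions: the module name, the patch-grid projection, and the similarity-based adjacency are placeholders, not the authors' released implementation.

```python
# Minimal, illustrative sketch of an RGC-style module (assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RGCModule(nn.Module):
    """1) Non-overlapping graph-space projection: pool the BEV map into a grid of patches,
          each patch becoming one graph node.
       2) Spatial graph: message passing between nodes via a similarity adjacency.
       3) Channel graph: reasoning across channels within each node.
       4) Residual: reproject to the BEV grid and reuse the input features."""

    def __init__(self, channels: int, grid: int = 8):
        super().__init__()
        self.grid = grid                                        # grid x grid nodes
        self.node_proj = nn.Linear(channels, channels)
        self.spatial_gc = nn.Linear(channels, channels)         # spatial graph-conv weights
        self.channel_adj = nn.Parameter(torch.eye(channels))    # learnable C x C channel graph
        self.out_proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, bev: torch.Tensor) -> torch.Tensor:       # bev: (B, C, H, W)
        b, c, h, w = bev.shape
        # 1) non-overlapping projection: average each patch into a node feature
        nodes = F.adaptive_avg_pool2d(bev, self.grid)            # (B, C, g, g)
        nodes = self.node_proj(nodes.flatten(2).transpose(1, 2)) # (B, N, C), N = g*g
        # 2) spatial graph: similarity-based adjacency, one graph-conv step
        adj = torch.softmax(nodes @ nodes.transpose(1, 2) / c ** 0.5, dim=-1)
        nodes = F.relu(self.spatial_gc(adj @ nodes))
        # 3) channel graph: propagate information across channels of each node
        nodes = nodes @ torch.softmax(self.channel_adj, dim=-1)
        # 4) reproject to the BEV grid and add the residual input
        graph_map = nodes.transpose(1, 2).reshape(b, c, self.grid, self.grid)
        graph_map = F.interpolate(graph_map, size=(h, w), mode="bilinear", align_corners=False)
        return self.out_proj(graph_map) + bev
```

Because the output keeps the input shape (e.g. RGCModule(256)(torch.randn(2, 256, 128, 128))), such a module could be dropped into a BEV encoder without changing the surrounding layers.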
Related papers
- SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks [0.0]
This research proposes an encoder-decoder architecture with a unique efficient residual network, Efficient-ResNet.
Attention-boosting gates (AbGs) and attention-boosting modules (AbMs) are deployed to fuse the equivariant and feature-based semantic information with global context outputs of equivalent size.
Our network is tested on the challenging CamVid and Cityscapes datasets, and the proposed methods reveal significant improvements on the residual networks.
arXiv Detail & Related papers (2024-01-28T19:58:19Z)
- DGNN: Decoupled Graph Neural Networks with Structural Consistency between Attribute and Graph Embedding Representations [62.04558318166396]
Graph neural networks (GNNs) demonstrate a robust capability for representation learning on graphs with complex structures.
A novel GNN framework, dubbed Decoupled Graph Neural Networks (DGNN), is introduced to obtain a more comprehensive embedding representation of nodes.
Experimental results on several graph benchmark datasets verify DGNN's superiority in the node classification task.
arXiv Detail & Related papers (2024-01-28T06:43:13Z)
- Graph Information Bottleneck for Remote Sensing Segmentation [8.879224757610368]
This paper treats images as graph structures and introduces a simple contrastive vision GNN architecture for remote sensing segmentation.
Specifically, we construct a node-masked and edge-masked graph view to obtain an optimal graph structure representation.
We replace the convolutional module in UNet with the SC-ViG module to complete the segmentation and classification tasks.
arXiv Detail & Related papers (2023-12-05T07:23:22Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- RSI-Net: Two-Stream Deep Neural Network Integrating GCN and Atrous CNN for Semantic Segmentation of High-resolution Remote Sensing Images [3.468780866037609]
A two-stream deep neural network for semantic segmentation of remote sensing images (RSI-Net) is proposed in this paper.
Experiments are implemented on the Vaihingen, Potsdam and Gaofen RSI datasets.
Results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient when compared with six state-of-the-art RSI semantic segmentation methods.
arXiv Detail & Related papers (2021-09-19T15:57:20Z)
- S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation.
It adopts an encoder-decoder backbone that consists of a Sparse Intra-channel Attention Module (SIntraAM) and a Sparse Inter-channel Attention Module (SInterAM).
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
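As a rough illustration of the squeeze idea summarized above (reasoning on a channel-wise global vector instead of the full spatial map), here is a minimal SE-style PyTorch sketch; the class name, reduction ratio, and sigmoid gating are assumptions rather than the paper's exact design.

```python
# Sketch of a plug-in "squeeze" block: pool to a channel vector, reason on it,
# then modulate the feature map. Illustrative only, not the paper's architecture.
import torch
import torch.nn as nn

class SqueezeReasoningBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = channels // reduction
        self.reason = nn.Sequential(             # lightweight reasoning on the global vector
            nn.Linear(channels, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        v = x.mean(dim=(2, 3))                   # squeeze: (B, C) channel-wise global vector
        w = self.reason(v)[:, :, None, None]     # per-channel modulation weights
        return x * w + x                          # end-to-end trainable, plug-in block
```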
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance of spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long-range dependencies among image regions based on the expressive representations produced at the local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
- Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv can infer the sampling offsets of the convolution kernel under the guidance of the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called the Spatial information Guided convolutional Network (SGNet).
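A hedged sketch of how such spatially guided sampling offsets could look, using torchvision's deformable convolution as a stand-in; predicting offsets from a single-channel depth map is my assumption, not necessarily how S-Conv is actually implemented.

```python
# Spatially guided convolution sketch: kernel offsets are predicted from 3D/depth
# guidance instead of being fixed. Layer names and shapes are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class SpatialGuidedConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(torch.empty(out_ch, in_ch, k, k))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        # 2 offsets (dy, dx) per kernel tap, inferred from the spatial/depth input
        self.offset_pred = nn.Conv2d(1, 2 * k * k, kernel_size=k, padding=k // 2)

    def forward(self, rgb_feat: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # rgb_feat: (B, C, H, W) RGB features; depth: (B, 1, H, W) spatial guidance
        offset = self.offset_pred(depth)
        return deform_conv2d(rgb_feat, offset, self.weight, self.bias, padding=self.k // 2)
```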
arXiv Detail & Related papers (2020-04-09T13:38:05Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by fully exploiting hierarchical features from all the convolutional layers.
Technically, we propose an efficient and effective CNN framework, i.e., the Fast Dense Residual Network (FDRN), for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
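To make the dense-residual idea in the FDRN entry above more tangible, here is a small PyTorch sketch in which every layer consumes the features of all earlier layers and a residual skip preserves the block input; the layer count and growth rate are illustrative assumptions, not the FDRN architecture.

```python
# Dense-residual block sketch: dense feature reuse across layers plus a residual skip.
import torch
import torch.nn as nn

class DenseResidualBlock(nn.Module):
    def __init__(self, channels: int, growth: int = 32, layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            in_ch += growth                            # inputs accumulate densely
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # reuse all earlier features
        return self.fuse(torch.cat(feats, dim=1)) + x      # residual on top of dense flow
```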