Spatial-information Guided Adaptive Context-aware Network for Efficient
RGB-D Semantic Segmentation
- URL: http://arxiv.org/abs/2308.06024v1
- Date: Fri, 11 Aug 2023 09:02:03 GMT
- Title: Spatial-information Guided Adaptive Context-aware Network for Efficient
RGB-D Semantic Segmentation
- Authors: Yang Zhang, Chenyun Xiong, Junjie Liu, Xuhui Ye, Guodong Sun
- Abstract summary: We propose an efficient lightweight encoder-decoder network that reduces computational cost and parameter count while guaranteeing the robustness of the algorithm.
Experimental results on NYUv2, SUN RGB-D, and Cityscapes datasets show that our method achieves a better trade-off among segmentation accuracy, inference time, and parameters than state-of-the-art methods.
- Score: 9.198120596225968
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Efficient RGB-D semantic segmentation has received considerable attention in
mobile robotics, where it plays a vital role in analyzing and recognizing
environmental information. According to previous studies, depth information can
provide geometric relationships for objects and scenes, but actual depth
measurements are usually noisy. To avoid unfavorable effects on
segmentation accuracy and computation, it is necessary to design an efficient
framework to leverage cross-modal correlations and complementary cues. In this
paper, we propose an efficient lightweight encoder-decoder network that reduces
computational cost and parameter count while guaranteeing the robustness of the
algorithm.
Working with channel and spatial fusion attention modules, our network
effectively captures multi-level RGB-D features. A globally guided local
affinity context module is proposed to obtain sufficient high-level context
information. The decoder utilizes a lightweight residual unit that combines
short- and long-distance information with few redundant computations.
Experimental results on NYUv2, SUN RGB-D, and Cityscapes datasets show that our
method achieves a better trade-off among segmentation accuracy, inference time,
and parameters than state-of-the-art methods. The source code will be available at
https://github.com/MVME-HBUT/SGACNet
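As a rough illustration of the fusion step described in the abstract, the PyTorch sketch below combines a channel gate and a spatial gate to fuse RGB and depth features. This is a minimal sketch assuming squeeze-and-excitation-style gating; the module name, reduction ratio, and wiring are illustrative guesses, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ChannelSpatialFusion(nn.Module):
    """Illustrative channel + spatial attention fusion for RGB-D features.

    Hypothetical sketch: reweights depth features by channel importance,
    then gates the fused map spatially before returning it.
    """
    def __init__(self, channels, reduction=4):
        super().__init__()
        # Channel attention: squeeze-and-excitation style gating.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: 1-channel mask from pooled statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, rgb, depth):
        # Suppress noisy depth channels before fusing.
        depth = depth * self.channel_gate(depth)
        fused = rgb + depth
        # Spatial mask from channel-wise mean and max of the fused map.
        stats = torch.cat(
            [fused.mean(dim=1, keepdim=True),
             fused.max(dim=1, keepdim=True).values], dim=1)
        return fused * self.spatial_gate(stats)

# Usage: fuse 64-channel encoder features at one stage.
fusion = ChannelSpatialFusion(64)
out = fusion(torch.randn(1, 64, 60, 80), torch.randn(1, 64, 60, 80))
```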
Related papers
- CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes [0.0]
Multimodal semantic segmentation methods suffer from high computational complexity and low inference speed.
We propose the Cosine Similarity Fusion Network (CSFNet) as a real-time RGB-X semantic segmentation model.
CSFNet has competitive accuracy with state-of-the-art methods while being state-of-the-art in terms of speed.
arXiv Detail & Related papers (2024-07-01T14:34:32Z)
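The cosine-similarity fusion named in the CSFNet summary above suggests weighting the auxiliary modality by its per-pixel agreement with the RGB stream. Below is a minimal sketch of that idea, assuming a simple similarity-weighted sum rather than CSFNet's actual architecture.

```python
import torch
import torch.nn.functional as F

def cosine_similarity_fusion(rgb_feat, x_feat):
    """Hypothetical fusion: weight the auxiliary modality by its per-pixel
    cosine similarity to the RGB features before summing the two streams."""
    # Per-pixel cosine similarity across channels: shape (B, 1, H, W).
    sim = F.cosine_similarity(rgb_feat, x_feat, dim=1).unsqueeze(1)
    # Map similarity from [-1, 1] to [0, 1] and use it as a fusion weight.
    weight = 0.5 * (sim + 1.0)
    return rgb_feat + weight * x_feat

fused = cosine_similarity_fusion(torch.randn(2, 128, 30, 40),
                                 torch.randn(2, 128, 30, 40))
```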
- Optimizing RGB-D Semantic Segmentation through Multi-modal Interaction and Pooling Attention [5.518612382697244]
Multi-modal Interaction and Pooling Attention Network (MIPANet) is designed to harness the interactive synergy between RGB and depth modalities.
We introduce a Pooling Attention Module (PAM) at various stages of the encoder.
This module serves to amplify the features extracted by the network and integrates the module's output into the decoder.
arXiv Detail & Related papers (2023-11-19T12:25:59Z)
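A pooling attention module as summarized above plausibly amplifies a stage's features with a gate built from pooled descriptors. The sketch below is a generic guess in that spirit (close to CBAM-style channel attention), not MIPANet's published PAM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoolingAttention(nn.Module):
    """Illustrative pooling attention: average- and max-pooled descriptors
    share an MLP whose sum gates (amplifies) the stage's features."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )

    def forward(self, x):
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1))
        gate = torch.sigmoid(avg + mx)
        # Amplified features can be passed to the decoder as a skip input.
        return x * gate
```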
- LENet: Lightweight and Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention [0.0]
We propose a projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation.
The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features.
We show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods.
arXiv Detail & Related papers (2023-01-11T02:51:38Z)
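Multi-scale convolutional attention with varying receptive fields, as described for LENet above, can be approximated with parallel depthwise convolutions of different kernel sizes. The kernel sizes and gating form below are assumptions, not LENet's exact MSCA configuration.

```python
import torch
import torch.nn as nn

class MultiScaleConvAttention(nn.Module):
    """Sketch of multi-scale convolutional attention: parallel depthwise
    convolutions emulate varying receptive fields; their summed response
    gates the input features."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels)
            for k in (3, 5, 7)  # illustrative receptive-field sizes
        ])
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn = sum(branch(x) for branch in self.branches)
        return x * torch.sigmoid(self.proj(attn))
```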
- CENet: Toward Concise and Efficient LiDAR Semantic Segmentation for Autonomous Driving [4.6193503399184275]
We present a concise and efficient image-based semantic segmentation network, named CENet.
Our pipeline achieves much better mIoU and inference performance compared with state-of-the-art models.
arXiv Detail & Related papers (2022-07-26T07:22:19Z)
- MobileSal: Extremely Efficient RGB-D Salient Object Detection [62.04876251927581]
This paper introduces a novel network, MobileSal, which focuses on efficient RGB-D salient object detection (SOD).
We propose an implicit depth restoration (IDR) technique to strengthen the feature representation capability of mobile networks for RGB-D SOD.
With IDR and compact pyramid refinement (CPR) incorporated, MobileSal performs favorably against state-of-the-art methods on seven challenging RGB-D SOD datasets.
arXiv Detail & Related papers (2020-12-24T04:36:42Z)
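Implicit depth restoration, as summarized above, trains RGB features to regress the depth map so the backbone encodes geometry without needing extra capacity at test time. The head below is a hypothetical training-only sketch; the layer sizes and L1 loss are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthRestorationHead(nn.Module):
    """Hypothetical training-only head in the spirit of implicit depth
    restoration: regress depth from features so the encoder is pushed
    to capture geometry. Discarded at inference time."""
    def __init__(self, channels):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, 1),
        )

    def forward(self, feat, depth_gt):
        pred = self.head(feat)
        pred = F.interpolate(pred, size=depth_gt.shape[-2:],
                             mode='bilinear', align_corners=False)
        # Auxiliary loss added to the main SOD objective during training.
        return F.l1_loss(pred, depth_gt)
```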
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
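A cross-modality guided encoder that recalibrates each modality with cues from the other could look like the toy gate below. It is a simplified stand-in for the paper's Separation-and-Aggregation Gate, not its actual design.

```python
import torch
import torch.nn as nn

class CrossModalRecalibration(nn.Module):
    """Toy bi-directional recalibration: each modality produces a channel
    gate for the other, so noisy depth is filtered by RGB and vice versa."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        def gate():
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )
        self.rgb_to_depth = gate()
        self.depth_to_rgb = gate()

    def forward(self, rgb, depth):
        rgb_rec = rgb * self.depth_to_rgb(depth)      # recalibrated RGB
        depth_rec = depth * self.rgb_to_depth(rgb)    # distilled depth
        return rgb_rec + depth_rec
```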
- Real-time Semantic Segmentation via Spatial-detail Guided Context Propagation [49.70144583431999]
We propose the spatial-detail guided context propagation network (SGCPNet) for achieving real-time semantic segmentation.
It uses the spatial details of shallow layers to guide the propagation of the low-resolution global contexts, in which the lost spatial information can be effectively reconstructed.
It achieves 69.5% mIoU segmentation accuracy, while its speed reaches 178.5 FPS on 768x1536 images on a GeForce GTX 1080 Ti GPU card.
arXiv Detail & Related papers (2020-05-22T07:07:26Z)
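Using shallow spatial detail to guide the propagation of low-resolution global context, as the SGCPNet summary describes, can be sketched as a detail-conditioned refinement of the upsampled context. The wiring below illustrates only the general idea, not SGCPNet's actual modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailGuidedUpsample(nn.Module):
    """Illustrative detail-guided propagation: low-resolution context is
    upsampled, then corrected by a residual predicted from shallow,
    high-resolution detail features."""
    def __init__(self, ctx_channels, detail_channels):
        super().__init__()
        self.refine = nn.Conv2d(ctx_channels + detail_channels,
                                ctx_channels, 3, padding=1)

    def forward(self, context, detail):
        context = F.interpolate(context, size=detail.shape[-2:],
                                mode='bilinear', align_corners=False)
        # Shallow details help reconstruct spatial information lost at low res.
        return context + self.refine(torch.cat([context, detail], dim=1))
```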
- Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation [79.78416804260668]
We propose Spatial information guided Convolution (S-Conv), which allows efficient RGB feature and 3D spatial information integration.
S-Conv is competent to infer the sampling offset of the convolution kernel guided by the 3D spatial information.
We further embed S-Conv into a semantic segmentation network, called Spatial information Guided convolutional Network (SGNet).
arXiv Detail & Related papers (2020-04-09T13:38:05Z)
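S-Conv infers the convolution kernel's sampling offsets from 3D spatial information rather than from RGB features. A minimal sketch of that idea using torchvision's deformable convolution, with offsets predicted from a 1-channel depth map; the offset predictor and weight initialization are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class SpatialGuidedConv(nn.Module):
    """Sketch of the S-Conv idea: sampling offsets are predicted from
    depth, then applied as a deformable convolution over RGB features."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.01)
        # Two offsets (dy, dx) per kernel tap, inferred from the depth map.
        self.offset_pred = nn.Conv2d(1, 2 * k * k, k, padding=k // 2)
        self.k = k

    def forward(self, rgb_feat, depth):
        offset = self.offset_pred(depth)
        return deform_conv2d(rgb_feat, offset, self.weight,
                             padding=self.k // 2)

# Usage: 64-channel features guided by a 1-channel depth map.
sconv = SpatialGuidedConv(64, 64)
y = sconv(torch.randn(1, 64, 60, 80), torch.randn(1, 1, 60, 80))
```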
- Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves accuracies of 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)