Pixel Difference Convolutional Network for RGB-D Semantic Segmentation
- URL: http://arxiv.org/abs/2302.11951v1
- Date: Thu, 23 Feb 2023 12:01:22 GMT
- Title: Pixel Difference Convolutional Network for RGB-D Semantic Segmentation
- Authors: Jun Yang, Lizhi Bai, Yaoru Sun, Chunqi Tian, Maoyu Mao, Guorun Wang
- Abstract summary: RGB-D semantic segmentation can be advanced with convolutional neural networks due to the availability of Depth data.
Considering the fixed grid kernel structure, CNNs are limited in their ability to capture detailed, fine-grained information.
We propose a Pixel Difference Convolutional Network (PDCNet) to capture detailed intrinsic patterns by aggregating both intensity and gradient information.
- Score: 2.334574428469772
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: RGB-D semantic segmentation can be advanced with convolutional neural
networks due to the availability of Depth data. Although objects cannot be
easily discriminated by 2D appearance alone, they can be well separated in
some cases using the local pixel differences and geometric patterns in Depth.
Owing to their fixed grid kernel structure, CNNs lack the ability to capture
detailed, fine-grained information and thus cannot achieve accurate
pixel-level semantic segmentation. To solve this problem, we propose a
Pixel Difference Convolutional Network (PDCNet) to capture detailed intrinsic
patterns by aggregating both intensity and gradient information in the local
range for Depth data and global range for RGB data, respectively. Precisely,
PDCNet consists of a Depth branch and an RGB branch. For the Depth branch, we
propose a Pixel Difference Convolution (PDC) to consider local and detailed
geometric information in Depth data via aggregating both intensity and gradient
information. For the RGB branch, we contribute a lightweight Cascade Large
Kernel (CLK) that extends PDC into CPDC, which exploits global context for RGB
data and further boosts performance. Consequently, the local and global
pixel differences of both modalities are seamlessly incorporated into PDCNet during the
information propagation process. Experiments on two challenging benchmark
datasets, i.e., NYUDv2 and SUN RGB-D, reveal that our PDCNet achieves
state-of-the-art performance on the semantic segmentation task.
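The PDC operation described above can be read as a convolution that aggregates a vanilla (intensity) response with a center-difference (gradient) response. The following is a minimal single-channel NumPy sketch under that reading; the function name, the 2D setting, and the mixing weight `theta` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def pixel_difference_conv2d(x, w, theta=0.7):
    """Sketch of a pixel-difference convolution on a 2D map.

    Aggregates two terms (theta is an assumed mixing weight):
        y = (1 - theta) * sum_p w_p * x_p           # intensity term
          + theta * sum_p w_p * (x_p - x_c)         # center-difference (gradient) term
    which simplifies to  conv(x, w) - theta * sum(w) * x_c.
    """
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)), mode="edge")  # replicate-pad borders
    out = np.zeros_like(x, dtype=float)
    w_sum = w.sum()
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            patch = xp[i:i + kh, j:j + kw]
            # vanilla response minus the weighted center value
            out[i, j] = (patch * w).sum() - theta * w_sum * x[i, j]
    return out
```

With `theta = 0` this reduces to a plain convolution; with `theta = 1` the response depends only on differences against the center pixel, so constant regions map to zero, which is what makes the gradient term sensitive to fine local geometry in Depth.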
Related papers
- Spatial-information Guided Adaptive Context-aware Network for Efficient RGB-D Semantic Segmentation [9.198120596225968]
We propose an efficient lightweight encoder-decoder network that reduces the computational parameters and guarantees the robustness of the algorithm.
Experimental results on NYUv2, SUN RGB-D, and Cityscapes datasets show that our method achieves a better trade-off among segmentation accuracy, inference time, and parameters than the state-of-the-art methods.
arXiv Detail & Related papers (2023-08-11T09:02:03Z)
- Spherical Space Feature Decomposition for Guided Depth Map Super-Resolution [123.04455334124188]
Guided depth map super-resolution (GDSR) aims to upsample low-resolution (LR) depth maps with additional information involved in high-resolution (HR) RGB images from the same scene.
In this paper, we propose the Spherical Space feature Decomposition Network (SSDNet) to solve the above issues.
Our method can achieve state-of-the-art results on four test datasets, as well as successfully generalize to real-world scenes.
arXiv Detail & Related papers (2023-03-15T21:22:21Z)
- DCANet: Differential Convolution Attention Network for RGB-D Semantic Segmentation [2.2032272277334375]
We propose a pixel differential convolution attention (DCA) module to consider geometric information and local-range correlations for depth data.
We extend DCA to ensemble differential convolution attention (EDCA) which propagates long-range contextual dependencies.
A two-branch network built with DCA and EDCA, called Differential Convolutional Network (DCANet), is proposed to fuse local and global information of two-modal data.
arXiv Detail & Related papers (2022-10-13T05:17:34Z)
- Depth-Adapted CNNs for RGB-D Semantic Segmentation [2.341385717236931]
We propose a novel framework to incorporate the depth information in the RGB convolutional neural network (CNN)
Specifically, our Z-ACN generates a 2D depth-adapted offset which is fully constrained by low-level features to guide the feature extraction on RGB images.
With the generated offset, we introduce two intuitive and effective operations to replace basic CNN operators.
arXiv Detail & Related papers (2022-06-08T14:59:40Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- PDC: Piecewise Depth Completion utilizing Superpixels [0.0]
Current approaches often rely on CNN-based methods with several known drawbacks.
We propose our novel Piecewise Depth Completion (PDC), which works completely without deep learning.
In our evaluation, we can show both the influence of the individual proposed processing steps and the overall performance of our method on the challenging KITTI dataset.
arXiv Detail & Related papers (2021-07-14T13:58:39Z)
- Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection [73.31632581915201]
We propose a novel data-level recombination strategy to fuse RGB with D (depth) before deep feature extraction.
A newly lightweight designed triple-stream network is applied over these novel formulated data to achieve an optimal channel-wise complementary fusion status between the RGB and D.
arXiv Detail & Related papers (2020-08-07T10:13:05Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient cross-modality guided encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt to realize a unified depth-aware framework with only RGB information as input at inference.
Our method not only surpasses the state-of-the-art on five public RGB SOD benchmarks, but also outperforms RGB-D-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.