Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient
Object Detection
- URL: http://arxiv.org/abs/2009.05102v1
- Date: Fri, 7 Aug 2020 10:13:05 GMT
- Title: Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient
Object Detection
- Authors: Xuehao Wang, Shuai Li, Chenglizhao Chen, Yuming Fang, Aimin Hao, Hong
Qin
- Abstract summary: We propose a novel data-level recombination strategy to fuse RGB with D (depth) before deep feature extraction.
A newly lightweight designed triple-stream network is applied over these novel formulated data to achieve an optimal channel-wise complementary fusion status between the RGB and D.
- Score: 73.31632581915201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing RGB-D salient object detection methods treat depth information as an
independent component to complement its RGB part, and widely follow the
bi-stream parallel network architecture. To selectively fuse the CNNs features
extracted from both RGB and depth as a final result, the state-of-the-art
(SOTA) bi-stream networks usually consist of two independent subbranches; i.e.,
one subbranch is used for RGB saliency and the other aims for depth saliency.
However, its depth saliency is persistently inferior to the RGB saliency
because the RGB component is intrinsically more informative than the depth
component. The bi-stream architecture easily biases its subsequent fusion
procedure to the RGB subbranch, leading to a performance bottleneck. In this
paper, we propose a novel data-level recombination strategy to fuse RGB with D
(depth) before deep feature extraction, where we cyclically convert the
original 4-dimensional RGB-D into \textbf{D}GB, R\textbf{D}B and RG\textbf{D}.
Then, a newly lightweight designed triple-stream network is applied over these
novel formulated data to achieve an optimal channel-wise complementary fusion
status between the RGB and D, achieving a new SOTA performance.
Related papers
- Pyramidal Attention for Saliency Detection [30.554118525502115]
This paper exploits only RGB images, estimates depth from RGB, and leverages the intermediate depth features.
We employ a pyramidal attention structure to extract multi-level convolutional-transformer features to process initial stage representations.
We report significantly improved performance against 21 and 40 state-of-the-art SOD methods on eight RGB and RGB-D datasets.
arXiv Detail & Related papers (2022-04-14T06:57:46Z) - Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z) - Cross-modality Discrepant Interaction Network for RGB-D Salient Object
Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms $15$ state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z) - Bi-directional Cross-Modality Feature Propagation with
Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and models the problem as a cross-modal feature fusion.
In this paper, we propose a unified and efficient Crossmodality Guided to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternatively.
arXiv Detail & Related papers (2020-07-17T18:35:24Z) - Cross-Modal Weighting Network for RGB-D Salient Object Detection [76.0965123893641]
We propose a novel Cross-Modal Weighting (CMW) strategy to encourage comprehensive interactions between RGB and depth channels for RGB-D SOD.
Specifically, three RGB-depth interaction modules, named CMW-L, CMW-M and CMW-H, are developed to deal with respectively low-, middle- and high-level cross-modal information fusion.
CMWNet consistently outperforms 15 state-of-the-art RGB-D SOD methods on seven popular benchmarks.
arXiv Detail & Related papers (2020-07-09T16:01:44Z) - Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.