Multi-level Cross-modal Interaction Network for RGB-D Salient Object
Detection
- URL: http://arxiv.org/abs/2007.14352v2
- Date: Mon, 8 Mar 2021 08:38:34 GMT
- Title: Multi-level Cross-modal Interaction Network for RGB-D Salient Object
Detection
- Authors: Zhou Huang, Huai-Xin Chen, Tao Zhou, Yun-Zhi Yang and Bi-Yuan Liu
- Abstract summary: We propose a novel Multi-level Cross-modal Interaction Network (MCI-Net) for RGB-D based salient object detection (SOD).
Our MCI-Net includes two key components: 1) a cross-modal feature learning network, which learns high-level features for the RGB images and depth cues, effectively enabling the correlations between the two sources to be exploited; and 2) a multi-level interactive integration network, which integrates multi-level cross-modal features to boost SOD performance.
- Score: 3.581367375462018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth cues with rich spatial information have been proven beneficial in
boosting salient object detection (SOD), while the depth quality directly
affects the subsequent SOD performance. However, some low-quality depth cues
are inevitable due to limitations of the acquisition devices, and these can
inhibit the SOD performance. Besides, existing methods tend to combine RGB
images and depth cues through direct fusion or a simple fusion module, which
prevents them from effectively exploiting the complex correlations between the
two sources. Moreover, few methods design an appropriate module to fully fuse
multi-level features, resulting in insufficient cross-level feature
interaction. To address these issues, we propose a novel Multi-level
Cross-modal Interaction Network (MCI-Net) for RGB-D based SOD. Our MCI-Net
includes two key components: 1) a cross-modal feature learning network, which
learns high-level features for the RGB images and depth cues, effectively
enabling the correlations between the two sources to be exploited; and 2) a
multi-level interactive integration network, which integrates multi-level
cross-modal features to boost the SOD performance. Extensive experiments on
six benchmark datasets demonstrate the superiority of our MCI-Net over 14
state-of-the-art methods and validate the effectiveness of its different
components. More importantly, our MCI-Net significantly improves SOD
performance while also achieving a higher FPS.
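The abstract's first component learns RGB and depth features jointly and exploits their correlations before fusion. As a rough intuition for gated cross-modal fusion (this is a toy sketch with made-up numbers, not the authors' MCI-Net implementation; the gating scheme here is hypothetical), one can weight each depth channel by how strongly it agrees with the corresponding RGB channel, so that unreliable depth responses contribute less:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a plain Python list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_modal_fuse(rgb_feat, depth_feat):
    """Toy cross-modal fusion: gate each depth channel by its
    agreement (element-wise product) with the RGB channel,
    then add the gated depth response to RGB (residual-style)."""
    assert len(rgb_feat) == len(depth_feat)
    n = len(rgb_feat)
    # Channels where both modalities respond strongly get larger gates.
    gates = softmax([r * d for r, d in zip(rgb_feat, depth_feat)])
    # Scale by n so the gates average to 1 rather than 1/n.
    return [r + n * g * d for r, g, d in zip(rgb_feat, gates, depth_feat)]

rgb = [0.2, 0.9, 0.1, 0.5]
depth = [0.1, 0.8, 0.7, 0.4]
fused = cross_modal_fuse(rgb, depth)
print([round(f, 3) for f in fused])
```

In a real network the gates would be produced by learned layers (e.g. convolutions followed by a sigmoid or softmax) rather than a fixed element-wise product, but the shape of the computation, modality-dependent weights modulating one stream before it is merged into the other, is the same.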
Related papers
- Point-aware Interaction and CNN-induced Refinement Network for RGB-D
Salient Object Detection [95.84616822805664]
We introduce CNNs-assisted Transformer architecture and propose a novel RGB-D SOD network with Point-aware Interaction and CNN-induced Refinement.
In order to alleviate the block effect and detail destruction problems naturally introduced by the Transformer, we design a CNN-induced refinement (CNNR) unit for content refinement and supplementation.
arXiv Detail & Related papers (2023-08-17T11:57:49Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- Interactive Context-Aware Network for RGB-T Salient Object Detection [7.544240329265388]
We propose a novel network called Interactive Context-Aware Network (ICANet).
ICANet contains three modules that can effectively perform the cross-modal and cross-scale fusions.
Experiments prove that our network performs favorably against the state-of-the-art RGB-T SOD methods.
arXiv Detail & Related papers (2022-11-11T10:04:36Z)
- CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient
Object Detection [144.66411561224507]
We present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement.
Our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-10-06T11:59:19Z)
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient
Object Detection [67.33924278729903]
In this work, we propose Dual Swin-Transformer based Mutual Interactive Network.
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z)
- M2RNet: Multi-modal and Multi-scale Refined Network for RGB-D Salient
Object Detection [1.002712867721496]
Methods based on RGB-D often suffer from the incompatibility of multi-modal feature fusion and the insufficiency of multi-scale feature aggregation.
We propose a novel multi-modal and multi-scale refined network (M2RNet).
Three essential components are presented in this network.
arXiv Detail & Related papers (2021-09-16T12:15:40Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object
Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms $15$ state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- Siamese Network for RGB-D Salient Object Detection and Beyond [113.30063105890041]
A novel framework is proposed to learn from both RGB and depth inputs through a shared network backbone.
Comprehensive experiments using five popular metrics show that the designed framework yields a robust RGB-D saliency detector.
We also link JL-DCF to the RGB-D semantic segmentation field, showing its capability of outperforming several semantic segmentation models.
arXiv Detail & Related papers (2020-08-26T06:01:05Z)
- RGB-D Salient Object Detection with Cross-Modality Modulation and
Selection [126.4462739820643]
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly solves two challenging issues: 1) how to effectively integrate the complementary information from RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
arXiv Detail & Related papers (2020-07-14T14:22:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.