Towards Accurate RGB-D Saliency Detection with Complementary Attention
and Adaptive Integration
- URL: http://arxiv.org/abs/2102.04046v1
- Date: Mon, 8 Feb 2021 08:08:30 GMT
- Authors: Hong-Bo Bi, Zi-Qi Liu, Kang Wang, Bo Dong, Geng Chen, Ji-Quan Ma
- Abstract summary: Saliency detection based on the complementary information from RGB images and depth maps has recently gained great popularity.
We propose the Complementary Attention and Adaptive Integration Network (CAAI-Net), which integrates complementary attention-based feature concentration with adaptive cross-modal feature fusion.
CAAI-Net is an effective saliency detection model and outperforms nine state-of-the-art models in terms of four widely-used metrics.
- Score: 20.006932559837516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Saliency detection based on the complementary information from RGB images and
depth maps has recently gained great popularity. In this paper, we propose
Complementary Attention and Adaptive Integration Network (CAAI-Net), a novel
RGB-D saliency detection model that integrates complementary attention-based
feature concentration and adaptive cross-modal feature fusion into a unified
framework for accurate saliency detection. Specifically, we propose a
context-aware complementary attention (CCA) module, which consists of a feature
interaction component, a complementary attention component, and a
global-context component. The CCA module first utilizes the feature interaction
component to extract rich local context features. The resulting features are
then fed into the complementary attention component, which employs the
complementary attention generated at adjacent levels to guide the attention at
the current layer, so that mutual background disturbances are suppressed and
the network focuses on regions containing salient objects. Finally, we
utilize a specially-designed adaptive feature integration (AFI) module, which
sufficiently considers the low-quality issue of depth maps, to aggregate the
RGB and depth features in an adaptive manner. Extensive experiments on six
challenging benchmark datasets demonstrate that CAAI-Net is an effective
saliency detection model and outperforms nine state-of-the-art models in terms
of four widely-used metrics. In addition, extensive ablation studies confirm
the effectiveness of the proposed CCA and AFI modules.
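The adaptive feature integration (AFI) step described above can be illustrated in a few lines. This is a minimal NumPy sketch, not the authors' implementation: the quality gate used here (mean activation followed by a softmax over the two modalities) is a placeholder assumption standing in for CAAI-Net's learned gating, but it conveys the core idea of down-weighting a low-quality depth feature map during fusion.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_fusion(rgb_feat, depth_feat):
    """Fuse RGB and depth feature maps with adaptive per-modality weights.

    The gating here is a hypothetical stand-in: each modality's weight is
    derived from its global average activation, passed through a softmax,
    so a degenerate (e.g. near-constant, low-response) depth map
    contributes less to the fused feature.
    """
    # Global descriptor: one scalar per modality (mean activation).
    g = np.array([rgb_feat.mean(), depth_feat.mean()])
    w = softmax(g)  # adaptive modality weights; they sum to 1
    fused = w[0] * rgb_feat + w[1] * depth_feat
    return fused, w
```

With an informative RGB feature map and an all-zero (degenerate) depth map, the softmax assigns the larger weight to the RGB branch, which is the qualitative behavior the AFI module is designed to achieve with a learned gate.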
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to effectively integrate both multi-scale and region-specific information.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Single-Point Supervised High-Resolution Dynamic Network for Infrared Small Target Detection [7.0456782736205685]
We propose a single-point supervised high-resolution dynamic network (SSHD-Net)
It achieves state-of-the-art (SOTA) detection performance using only single-point supervision.
Experiments on the publicly available datasets NUDT-SIRST and IRSTD-1k demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-08-04T09:44:47Z)
- Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection [95.84616822805664]
We introduce a CNN-assisted Transformer architecture and propose a novel RGB-D SOD network with Point-aware Interaction and CNN-induced Refinement.
To alleviate the block-effect and detail-destruction problems naturally brought by the Transformer, we design a CNN-induced refinement (CNNR) unit for content refinement and supplementation.
arXiv Detail & Related papers (2023-08-17T11:57:49Z) - CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient
Object Detection [144.66411561224507]
We present a convolutional neural network (CNN) model, named CIR-Net, based on novel cross-modality interaction and refinement.
Our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-10-06T11:59:19Z) - Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient
Object Detection [67.33924278729903]
In this work, we propose the Dual Swin-Transformer based Mutual Interactive Network (DTMINet).
We adopt the Swin-Transformer as the feature extractor for both the RGB and depth modalities to model long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet.
arXiv Detail & Related papers (2022-06-07T08:35:41Z) - RGB-D Salient Object Detection with Cross-Modality Modulation and
Selection [126.4462739820643]
We present an effective method to progressively integrate and refine cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly addresses two challenging issues: 1) how to effectively integrate the complementary information from the RGB image and its corresponding depth map, and 2) how to adaptively select the more saliency-related features.
arXiv Detail & Related papers (2020-07-14T14:22:50Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z) - Hybrid Multiple Attention Network for Semantic Segmentation in Aerial
Images [24.35779077001839]
We propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations.
We introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundancy and improve the efficiency of the self-attention mechanism.
arXiv Detail & Related papers (2020-01-09T07:47:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.