Specificity-preserving RGB-D Saliency Detection
- URL: http://arxiv.org/abs/2108.08162v1
- Date: Wed, 18 Aug 2021 14:14:22 GMT
- Title: Specificity-preserving RGB-D Saliency Detection
- Authors: Tao Zhou, Huazhu Fu, Geng Chen, Yi Zhou, Deng-Ping Fan, Ling Shao
- Abstract summary: We propose a specificity-preserving network (SP-Net) for RGB-D saliency detection.
Two modality-specific networks and a shared learning network are adopted to generate individual and shared saliency maps.
Experiments on six benchmark datasets demonstrate that our SP-Net outperforms other state-of-the-art methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: RGB-D saliency detection has attracted increasing attention, due to its
effectiveness and the fact that depth cues can now be conveniently captured.
Existing works often focus on learning a shared representation through various
fusion strategies, with few methods explicitly considering how to preserve
modality-specific characteristics. In this paper, taking a new perspective, we
propose a specificity-preserving network (SP-Net) for RGB-D saliency detection,
which benefits saliency detection performance by exploring both the shared
information and modality-specific properties (e.g., specificity). Specifically,
two modality-specific networks and a shared learning network are adopted to
generate individual and shared saliency maps. A cross-enhanced integration
module (CIM) is proposed to fuse cross-modal features in the shared learning
network, which are then propagated to the next layer for integrating
cross-level information. In addition, we propose a multi-modal feature aggregation
(MFA) module to integrate the modality-specific features from each individual
decoder into the shared decoder, which can provide rich complementary
multi-modal information to boost the saliency detection performance. Further, a
skip connection is used to combine hierarchical features between the encoder
and decoder layers. Experiments on six benchmark datasets demonstrate that our
SP-Net outperforms other state-of-the-art methods. Code is available at:
https://github.com/taozh2017/SPNet.
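To make the described pipeline concrete, below is a minimal PyTorch sketch of the SP-Net idea: two modality-specific streams, a CIM-style cross-enhanced fusion along the shared stream, and an MFA-style aggregation that feeds both individual decoders into the shared decoder. The module internals (sigmoid gating in CIM, concatenation fusion in MFA, the tiny two-level single-resolution encoders, and all names such as TinySPNet and conv_block) are illustrative assumptions, not the authors' implementation; see the linked repository for the real code.

```python
# Minimal sketch of the SP-Net structure described in the abstract. The CIM
# and MFA internals here are illustrative guesses, not the official code.
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class CIM(nn.Module):
    """Cross-enhanced integration: each modality gates and enhances the other."""

    def __init__(self, c):
        super().__init__()
        self.gate_rgb = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())
        self.gate_d = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())
        self.fuse = conv_block(2 * c, c)

    def forward(self, f_rgb, f_d):
        e_rgb = f_rgb + f_rgb * self.gate_d(f_d)  # depth-enhanced RGB features
        e_d = f_d + f_d * self.gate_rgb(f_rgb)    # RGB-enhanced depth features
        return self.fuse(torch.cat([e_rgb, e_d], dim=1))


class MFA(nn.Module):
    """Aggregate modality-specific decoder features into the shared decoder."""

    def __init__(self, c):
        super().__init__()
        self.fuse = conv_block(3 * c, c)

    def forward(self, f_shared, f_rgb_dec, f_d_dec):
        return self.fuse(torch.cat([f_shared, f_rgb_dec, f_d_dec], dim=1))


class TinySPNet(nn.Module):
    """Toy two-level, single-resolution stand-in for the real backbone."""

    def __init__(self, c=32):
        super().__init__()
        self.enc_rgb = nn.ModuleList([conv_block(3, c), conv_block(c, c)])
        self.enc_d = nn.ModuleList([conv_block(1, c), conv_block(c, c)])
        self.cims = nn.ModuleList([CIM(c), CIM(c)])
        self.dec_rgb, self.dec_d = conv_block(c, c), conv_block(c, c)
        self.mfa = MFA(c)
        self.head_rgb = nn.Conv2d(c, 1, 1)     # individual saliency maps
        self.head_d = nn.Conv2d(c, 1, 1)
        self.head_shared = nn.Conv2d(c, 1, 1)  # shared saliency map

    def forward(self, rgb, depth):
        f_r, f_d, shared = rgb, depth, None
        for enc_r, enc_d, cim in zip(self.enc_rgb, self.enc_d, self.cims):
            f_r, f_d = enc_r(f_r), enc_d(f_d)
            fused = cim(f_r, f_d)
            # fused features are propagated to the next layer, integrating
            # cross-level information along the shared stream
            shared = fused if shared is None else shared + fused
        d_r, d_d = self.dec_rgb(f_r), self.dec_d(f_d)
        # MFA feeds both individual decoders into the shared decoder; the
        # residual "+ shared" plays the role of an encoder-decoder skip
        d_s = self.mfa(shared, d_r, d_d) + shared
        return self.head_rgb(d_r), self.head_d(d_d), self.head_shared(d_s)


rgb = torch.randn(2, 3, 64, 64)
depth = torch.randn(2, 1, 64, 64)
sal_rgb, sal_d, sal_shared = TinySPNet()(rgb, depth)
print(sal_shared.shape)  # torch.Size([2, 1, 64, 64])
```

The three heads mirror the abstract's "individual and shared saliency maps"; in training, supervising all three would encourage the modality-specific streams to preserve their specificity while the shared stream learns the common representation.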
Related papers
- Object Segmentation by Mining Cross-Modal Semantics (2023-05-17)
We propose a novel approach that mines cross-modal semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) a coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness (2023-01-18)
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with neural network hierarchies.
Our HiDAnet performs favorably against the state-of-the-art methods by large margins.
- Feature Aggregation and Propagation Network for Camouflaged Object Detection (2022-12-02)
Camouflaged object detection (COD) aims to detect and segment camouflaged objects embedded in their surroundings.
Several COD methods have been developed, but they still suffer from unsatisfactory performance due to the intrinsic similarity between foreground objects and the background.
We propose a novel Feature Aggregation and Propagation Network (FAP-Net) for camouflaged object detection.
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection (2022-06-07)
In this work, we propose a Dual Swin-Transformer based Mutual Interactive Network (DTMINet).
We adopt the Swin-Transformer as the feature extractor for both the RGB and depth modalities to model long-range dependencies in the visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed method.
- M2RNet: Multi-modal and Multi-scale Refined Network for RGB-D Salient Object Detection (2021-09-16)
RGB-D methods often suffer from incompatible multi-modal feature fusion and insufficient multi-scale feature aggregation.
We propose a novel multi-modal and multi-scale refined network (M2RNet) with three essential components.
- M2IOSR: Maximal Mutual Information Open Set Recognition (2021-08-05)
We propose a mutual information-based method with a streamlined architecture for open set recognition.
The proposed method significantly improves over the baselines and consistently achieves new state-of-the-art results on several benchmarks.
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection (2021-08-04)
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
- Deep feature selection-and-fusion for RGB-D semantic segmentation (2021-05-10)
This work proposes a unified and efficient feature selection-and-fusion network (FSFNet).
FSFNet contains a symmetric cross-modality residual fusion module for the explicit fusion of multi-modality information; one possible form of such a module is sketched after this list.
Compared with the state-of-the-art methods, experimental evaluations demonstrate that the proposed model achieves competitive performance on two public datasets.
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search (2021-02-22)
One-step methods have been developed to handle the pedestrian detection and identification sub-tasks with a single network.
Current one-step approaches face two major challenges.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
- RGB-D Salient Object Detection with Cross-Modality Modulation and Selection (2020-07-14)
We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD).
The proposed network mainly addresses two challenging issues: 1) how to effectively integrate the complementary information from the RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.