BTS-Net: Bi-directional Transfer-and-Selection Network For RGB-D Salient
Object Detection
- URL: http://arxiv.org/abs/2104.01784v1
- Date: Mon, 5 Apr 2021 05:58:43 GMT
- Title: BTS-Net: Bi-directional Transfer-and-Selection Network For RGB-D Salient
Object Detection
- Authors: Wenbo Zhang, Yao Jiang, Keren Fu, Qijun Zhao
- Abstract summary: Depth maps used in RGB-D salient object detection (SOD) often suffer from low quality and inaccuracy.
Most existing RGB-D SOD models have no cross-modal interactions or only have unidirectional interactions from depth to RGB in their encoder stages.
We propose a novel bi-directional transfer-and-selection network named BTS-Net, which adopts a set of bi-directional transfer-and-selection modules to purify features during encoding.
- Score: 16.87553302005972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth information has proven beneficial in RGB-D salient object
detection (SOD). However, the depth maps acquired in practice often suffer from low quality and
inaccuracy. Most existing RGB-D SOD models have no cross-modal interactions or
only have unidirectional interactions from depth to RGB in their encoder
stages, which may lead to inaccurate encoder features when facing low-quality
depth. To address this limitation, we propose to conduct progressive
bi-directional interactions as early as the encoder stage, yielding a novel
bi-directional transfer-and-selection network named BTS-Net, which adopts a set
of bi-directional transfer-and-selection (BTS) modules to purify features
during encoding. Based on the resulting robust encoder features, we also design
an effective light-weight group decoder to achieve accurate final saliency
prediction. Comprehensive experiments on six widely used datasets demonstrate
that BTS-Net surpasses 16 recent state-of-the-art approaches in terms of four
key metrics.
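As a concrete illustration of the idea, here is a minimal sketch of one bi-directional transfer-and-selection stage, assuming a simple sigmoid-gated exchange between the RGB and depth streams. The module name, gating form, and channel layout are illustrative assumptions, not BTS-Net's released implementation.

```python
# Hypothetical sketch of one bi-directional transfer-and-selection (BTS)
# stage: each modality receives a gated copy of the other modality's
# features (transfer), with the gate acting as the selection step.
import torch
import torch.nn as nn

class BTSModule(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs produce a single-channel selection map per direction
        self.rgb_to_depth_gate = nn.Conv2d(channels, 1, kernel_size=1)
        self.depth_to_rgb_gate = nn.Conv2d(channels, 1, kernel_size=1)
        self.fuse_rgb = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.fuse_depth = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, rgb_feat, depth_feat):
        # Selection maps: where each stream trusts the other modality
        gate_for_rgb = torch.sigmoid(self.depth_to_rgb_gate(depth_feat))
        gate_for_depth = torch.sigmoid(self.rgb_to_depth_gate(rgb_feat))
        # Transfer: concatenate each stream with the gated other stream
        rgb_out = self.fuse_rgb(
            torch.cat([rgb_feat, depth_feat * gate_for_rgb], dim=1))
        depth_out = self.fuse_depth(
            torch.cat([depth_feat, rgb_feat * gate_for_depth], dim=1))
        return rgb_out, depth_out  # purified features for the next stage

rgb, depth = torch.randn(2, 64, 44, 44), torch.randn(2, 64, 44, 44)
rgb_refined, depth_refined = BTSModule(64)(rgb, depth)  # both (2, 64, 44, 44)
```

Chaining one such module per encoder stage would give the progressive bi-directional interaction the abstract refers to, with both streams kept separate for the decoder.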
Related papers
- CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection [144.66411561224507]
We present a convolutional neural network (CNN) model, named CIR-Net, based on a novel cross-modality interaction and refinement scheme.
Our network outperforms the state-of-the-art saliency detectors both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-10-06T11:59:19Z)
- Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection [67.33924278729903]
In this work, we propose Dual Swin-Transformer based Mutual Interactive Network.
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
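As a rough sketch of the dual-backbone setup described above, the snippet below instantiates two Swin-Transformer encoders, one per modality, using torchvision's swin_t as an assumed stand-in; DTMINet's actual backbone variant and interaction modules are not specified in this summary.

```python
# Sketch only: independent Swin backbones for RGB and depth.
import torch
from torchvision.models import swin_t

rgb_backbone = swin_t(weights=None)    # one encoder per modality
depth_backbone = swin_t(weights=None)

rgb = torch.randn(1, 3, 224, 224)
depth = torch.randn(1, 1, 224, 224).repeat(1, 3, 1, 1)  # tile depth to 3 channels

# torchvision's SwinTransformer exposes its stage stack as `.features`;
# the outputs are channels-last (B, H/32, W/32, C) feature maps that a
# mutual-interaction module could then fuse.
rgb_feats = rgb_backbone.features(rgb)
depth_feats = depth_backbone.features(depth)
```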
arXiv Detail & Related papers (2022-06-07T08:35:41Z)
- RGB-D Salient Object Detection with Ubiquitous Target Awareness [37.6726410843724]
We make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework.
We propose a Ubiquitous Target Awareness (UTA) network to address three important challenges in the RGB-D SOD task.
Our proposed UTA network is depth-free at inference and runs in real time at 43 FPS.
arXiv Detail & Related papers (2021-09-08T04:27:29Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection [73.31632581915201]
We propose a novel data-level recombination strategy to fuse RGB with D (depth) before deep feature extraction.
A newly designed lightweight triple-stream network is applied to these newly formulated data to achieve optimal channel-wise complementary fusion between RGB and D.
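The summary leaves the recombination itself unspecified; below is one plausible, purely illustrative way to fuse depth into the input data before any feature extraction, yielding one 3-channel input per stream of a triple-stream network. The specific channel mixes are hypothetical, not the paper's scheme.

```python
import torch

def recombine(rgb: torch.Tensor, depth: torch.Tensor):
    """Hypothetical data-level recombination.
    rgb: (B, 3, H, W), depth: (B, 1, H, W) -> three 3-channel inputs."""
    ddd = depth.repeat(1, 3, 1, 1)               # depth-only stream
    rgd = torch.cat([rgb[:, :2], depth], dim=1)  # blue channel swapped for depth
    return rgb, ddd, rgd                         # one input per stream
```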
arXiv Detail & Related papers (2020-08-07T10:13:05Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images, providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages, aggregating the two recalibrated representations alternately.
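As a loose sketch of the recalibrate-then-aggregate idea in this summary (not the paper's actual SA-Gate design), each modality can reweight the other's features with a learned gate before a soft aggregation fuses the two streams:

```python
import torch
import torch.nn as nn

class RecalibrateAndAggregate(nn.Module):
    """Illustrative stand-in for a cross-modality gate; all details
    (gate shape, aggregation weights) are assumptions."""
    def __init__(self, channels: int):
        super().__init__()
        self.rgb_gate = nn.Conv2d(channels, 1, 1)    # driven by depth
        self.depth_gate = nn.Conv2d(channels, 1, 1)  # driven by RGB
        self.mix = nn.Conv2d(2 * channels, 2, 1)     # per-pixel aggregation weights

    def forward(self, rgb, depth):
        rgb_rc = rgb * torch.sigmoid(self.rgb_gate(depth))       # recalibrate RGB
        depth_rc = depth * torch.sigmoid(self.depth_gate(rgb))   # recalibrate depth
        w = torch.softmax(self.mix(torch.cat([rgb_rc, depth_rc], dim=1)), dim=1)
        return w[:, :1] * rgb_rc + w[:, 1:] * depth_rc           # fused representation
```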
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt at realizing a unified depth-aware framework that uses only RGB information as input at inference.
Our method not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also surpasses RGB-D-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z)