Is Depth Really Necessary for Salient Object Detection?
- URL: http://arxiv.org/abs/2006.00269v2
- Date: Tue, 2 Jun 2020 01:07:49 GMT
- Title: Is Depth Really Necessary for Salient Object Detection?
- Authors: Jiawei Zhao, Yifan Zhao, Jia Li, Xiaowu Chen
- Abstract summary: We make the first attempt at realizing a unified depth-aware framework with only RGB information as input for inference.
It not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also outperforms RGBD-based methods on five benchmarks by a large margin.
- Score: 50.10888549190576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Salient object detection (SOD) is a crucial and preliminary task for many
computer vision applications and has made progress with deep CNNs. Most
existing methods rely mainly on RGB information to distinguish salient
objects, which faces difficulties in some complex scenarios. To address this,
many recent RGBD-based networks adopt the depth map as an independent input
and fuse its features with the RGB information. Taking advantage of both RGB
and RGBD methods, we propose a novel depth-aware salient object detection
framework with the following designs: 1) It takes depth information only as
training data and relies solely on RGB information in the testing phase. 2) It
comprehensively optimizes SOD features with multi-level depth-aware
regularizations. 3) The depth information also serves as an error-weighted map
to correct the segmentation process. With these designs combined, we make the
first attempt at realizing a unified depth-aware framework with only RGB
information as input for inference, which not only surpasses state-of-the-art
performance on five public RGB SOD benchmarks, but also outperforms RGBD-based
methods on five benchmarks by a large margin, while requiring less input
information and a lighter-weight implementation. The code and model will be
publicly available.
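The abstract's third design point, depth serving as an error-weighted map during training, is not spelled out here. Below is a minimal NumPy sketch of one plausible reading; the weighting scheme `1 + |depth - mean depth|` is purely an illustrative assumption, not the paper's actual formulation.

```python
import numpy as np

def depth_weighted_bce(pred, target, depth, eps=1e-7):
    """Pixel-wise binary cross-entropy re-weighted by a depth-derived map.

    `pred`, `target`, `depth` are HxW arrays in [0, 1]. The weight
    `1 + |depth - mean(depth)|` is a hypothetical stand-in for the
    paper's error-weighted map: pixels whose depth deviates from the
    scene average (often near object boundaries) contribute more.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    weight = 1.0 + np.abs(depth - depth.mean())
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float((weight * bce).mean())
```

Note that the depth map appears only in the training loss; at inference the model consumes RGB alone, matching design point 1.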
Related papers
- Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB [48.31210455404533]
A heatmap-based 3D pose estimator learns to hallucinate depth information from the RGB frames given at inference time.
Depth information is used exclusively during training by enforcing the RGB-based hallucination network to learn features similar to a backbone pre-trained only on depth data.
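At training time, this privileged-information scheme amounts to a feature-mimicking objective. A minimal NumPy sketch, under the assumption that the match is a plain L2 distance between the RGB branch's features and those of a frozen depth-pretrained backbone (the paper's actual distance may differ):

```python
import numpy as np

def feature_mimic_loss(rgb_feats, depth_feats):
    """Mean squared error pushing RGB-branch features toward those of a
    depth-pretrained backbone. Applied only during training; at inference
    the depth backbone is dropped and only the RGB branch runs."""
    assert rgb_feats.shape == depth_feats.shape
    return float(np.mean((rgb_feats - depth_feats) ** 2))
```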
arXiv Detail & Related papers (2024-09-17T11:59:34Z)
- DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation [76.81628995237058]
DFormer is a novel framework to learn transferable representations for RGB-D segmentation tasks.
It pretrains the backbone using image-depth pairs from ImageNet-1K.
DFormer achieves new state-of-the-art performance on two popular RGB-D tasks.
arXiv Detail & Related papers (2023-09-18T11:09:11Z)
- Pyramidal Attention for Saliency Detection [30.554118525502115]
This paper exploits only RGB images, estimates depth from RGB, and leverages the intermediate depth features.
We employ a pyramidal attention structure to extract multi-level convolutional-transformer features to process initial stage representations.
We report significantly improved performance against 21 and 40 state-of-the-art SOD methods on eight RGB and RGB-D datasets.
arXiv Detail & Related papers (2022-04-14T06:57:46Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Modality-Guided Subnetwork for Salient Object Detection [5.491692465987937]
Most RGBD networks require multi-modal inputs and feed them separately through a two-stream design.
In this paper we present a novel fusion design named the modality-guided subnetwork (MGSnet).
It has the following designs: 1) Our model works for both RGB and RGBD data, dynamically estimating depth when it is not available.
arXiv Detail & Related papers (2021-10-10T20:59:11Z)
- RGB-D Salient Object Detection with Ubiquitous Target Awareness [37.6726410843724]
We make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework.
We propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in RGB-D SOD task.
Our proposed UTA network is depth-free for inference and runs in real-time with 43 FPS.
arXiv Detail & Related papers (2021-09-08T04:27:29Z)
- Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection [78.47767202232298]
We propose a novel Cross-modality Discrepant Interaction Network (CDINet) for RGB-D SOD.
Two components are designed to implement the effective cross-modality interaction.
Our network outperforms 15 state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2021-08-04T11:24:42Z)
- MobileSal: Extremely Efficient RGB-D Salient Object Detection [62.04876251927581]
This paper introduces MobileSal, a novel network that focuses on efficient RGB-D salient object detection (SOD).
We propose an implicit depth restoration (IDR) technique to strengthen the feature representation capability of mobile networks for RGB-D SOD.
With IDR and CPR incorporated, MobileSal performs favorably against state-of-the-art methods on seven challenging RGB-D SOD datasets.
arXiv Detail & Related papers (2020-12-24T04:36:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.