Synergistic saliency and depth prediction for RGB-D saliency detection
- URL: http://arxiv.org/abs/2007.01711v2
- Date: Mon, 26 Oct 2020 06:23:18 GMT
- Title: Synergistic saliency and depth prediction for RGB-D saliency detection
- Authors: Yue Wang, Yuke Li, James H. Elder, Huchuan Lu, Runmin Wu, Lu Zhang
- Abstract summary: Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
- Score: 76.27406945671379
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth information available from an RGB-D camera can be useful in segmenting
salient objects when figure/ground cues from RGB channels are weak. This has
motivated the development of several RGB-D saliency datasets and algorithms
that use all four channels of the RGB-D data for both training and inference.
Unfortunately, existing RGB-D saliency datasets are small, which may lead to
overfitting and limited generalization for diverse scenarios. Here we propose a
semi-supervised system for RGB-D saliency detection that can be trained on
smaller RGB-D saliency datasets without saliency ground truth, while also making
effective joint use of a large RGB saliency dataset that does have saliency
ground truth. To generalize our method to RGB-D saliency datasets, we employ a
novel prediction-guided cross-refinement module, which jointly estimates
saliency and depth through mutual refinement between the two tasks, together
with an adversarial learning approach. Critically, our system does not require
saliency ground truth for the RGB-D datasets, sparing the massive human labor
of hand labeling, and does not require depth data for inference,
allowing the method to be used for the much broader range of applications where
only RGB data are available. Evaluation on seven RGB-D datasets demonstrates
that, even without saliency ground truth for the RGB-D datasets and using only
their RGB data at inference, our semi-supervised system performs favorably
against state-of-the-art fully supervised RGB-D saliency detection methods that
use saliency ground truth for RGB-D datasets at training and depth data at
inference, on the two largest testing datasets. Our approach also achieves
comparable results on other popular RGB-D saliency benchmarks.
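As a rough illustration of the prediction-guided cross-refinement idea in the abstract, the sketch below shows one plausible way a saliency branch and a depth branch could mutually refine each other from shared RGB features. This is a minimal PyTorch sketch under assumed names (CrossRefine, sal_head, depth_head, sal_refine, depth_refine), not the authors' released implementation; the adversarial component for aligning RGB and RGB-D feature distributions is omitted.

    import torch
    import torch.nn as nn

    class CrossRefine(nn.Module):
        # Hypothetical prediction-guided cross-refinement block: each task's
        # features are re-estimated using the other task's current prediction.
        def __init__(self, channels: int):
            super().__init__()
            self.sal_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
            self.depth_head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)
            # Refinement convs take the shared features plus the other task's map.
            self.sal_refine = nn.Conv2d(channels + 1, channels, kernel_size=3, padding=1)
            self.depth_refine = nn.Conv2d(channels + 1, channels, kernel_size=3, padding=1)

        def forward(self, feats: torch.Tensor):
            # Initial predictions from shared RGB backbone features.
            sal = torch.sigmoid(self.sal_head(feats))
            depth = torch.sigmoid(self.depth_head(feats))
            # Mutual refinement: saliency features are guided by the depth
            # estimate and vice versa, then each head predicts again.
            sal_feats = torch.relu(self.sal_refine(torch.cat([feats, depth], dim=1)))
            depth_feats = torch.relu(self.depth_refine(torch.cat([feats, sal], dim=1)))
            return torch.sigmoid(self.sal_head(sal_feats)), torch.sigmoid(self.depth_head(depth_feats))

    # Example: refine predictions from a batch of backbone feature maps.
    feats = torch.randn(2, 64, 56, 56)        # (batch, channels, H, W)
    saliency, depth = CrossRefine(64)(feats)  # each output is (2, 1, 56, 56)

Conditioning each branch on the other branch's current prediction is one way to realize the "mutual refinement between the two tasks" the abstract refers to; since depth is predicted rather than measured, only RGB input is needed at inference.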
Related papers
- RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker [4.235252053339947]
This paper introduces a new challenging RGB-Sonar (RGB-S) tracking task.
It investigates how to achieve efficient tracking of an underwater target through the interaction of RGB and sonar modalities.
arXiv Detail & Related papers (2024-06-11T12:01:11Z)
- All in One: RGB, RGB-D, and RGB-T Salient Object Detection [6.417439550842723]
Salient object detection (SOD) aims to identify the most attractive objects within an image.
Previous research has focused on saliency detection for individual data types.
We propose an innovative model framework that provides a unified solution for the salient object detection task of three types of data.
arXiv Detail & Related papers (2023-11-23T03:34:41Z)
- DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation [76.81628995237058]
DFormer is a novel framework to learn transferable representations for RGB-D segmentation tasks.
It pretrains the backbone using image-depth pairs from ImageNet-1K.
DFormer achieves new state-of-the-art performance on two popular RGB-D tasks.
arXiv Detail & Related papers (2023-09-18T11:09:11Z)
- RGBD1K: A Large-scale Dataset and Benchmark for RGB-D Object Tracking [30.448658049744775]
Given a limited amount of annotated RGB-D tracking data, most state-of-the-art RGB-D trackers are simple extensions of high-performance RGB-only trackers.
To address the dataset deficiency issue, a new RGB-D dataset named RGBD1K is released in this paper.
arXiv Detail & Related papers (2022-08-21T03:07:36Z)
- Pyramidal Attention for Saliency Detection [30.554118525502115]
This paper exploits only RGB images, estimates depth from RGB, and leverages the intermediate depth features.
We employ a pyramidal attention structure to extract multi-level convolutional-transformer features to process initial stage representations.
We report significantly improved performance against 21 and 40 state-of-the-art SOD methods on eight RGB and RGB-D datasets.
arXiv Detail & Related papers (2022-04-14T06:57:46Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection [73.31632581915201]
We propose a novel data-level recombination strategy to fuse RGB with D (depth) before deep feature extraction.
A newly designed lightweight triple-stream network is applied to the reformulated data to achieve optimal channel-wise complementary fusion between the RGB and D channels.
arXiv Detail & Related papers (2020-08-07T10:13:05Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Crossmodality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages, aggregating the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt at realizing a unified depth-aware framework with only RGB information as input for inference.
It not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also surpasses RGBD-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z)