All in One: RGB, RGB-D, and RGB-T Salient Object Detection
- URL: http://arxiv.org/abs/2311.14746v1
- Date: Thu, 23 Nov 2023 03:34:41 GMT
- Title: All in One: RGB, RGB-D, and RGB-T Salient Object Detection
- Authors: Xingzhao Jia, Zhongqiu Zhao, Changlei Dongye, and Zhao Zhang
- Abstract summary: Salient object detection (SOD) aims to identify the most attractive objects within an image.
Previous research has focused on saliency detection with individual data types.
We propose an innovative model framework that provides a unified solution for the salient object detection task of three types of data.
- Score: 6.417439550842723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Salient object detection (SOD) aims to identify the most attractive objects
within an image. Depending on the type of data being detected, SOD can be
categorized into various forms, including RGB, RGB-D (Depth), RGB-T (Thermal)
and light field SOD. Previous research has focused on saliency detection with
an individual data type; if an RGB-D SOD model is forced to detect RGB-T data,
it performs poorly. We propose an innovative model framework that
provides a unified solution for the salient object detection task of three
types of data (RGB, RGB-D, and RGB-T). The three types of data can be handled
in one model (all in one) with the same weight parameters. In this framework,
the three types of data are concatenated in an ordered manner within a single
input batch, and features are extracted using a transformer network. Based on
this framework, we propose an efficient lightweight SOD model, namely AiOSOD,
which can detect any RGB, RGB-D, and RGB-T data at high speed (780 FPS for RGB
data, 485 FPS for RGB-D or RGB-T data). Notably, with only 6.25M parameters,
AiOSOD achieves excellent performance on RGB, RGB-D, and RGB-T datasets.
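The abstract only outlines the batching scheme (the three data types concatenated in a fixed order within one batch, features extracted by a shared transformer), so below is a minimal illustrative sketch of that idea, not the authors' released code. The helper names (`build_mixed_batch`, `TinySharedBackbone`), the use of a grayscale copy of the RGB image as the placeholder second modality for RGB-only samples, and all layer sizes are assumptions made for illustration.

```python
# Minimal sketch of the "all in one" batching idea described in the abstract.
# Assumptions (not taken from the paper): RGB-only samples reuse a grayscale
# copy of the RGB image as their second modality, and one shared transformer
# backbone with a single set of weights processes the ordered batch.
import torch
import torch.nn as nn


def build_mixed_batch(rgb, rgbd, rgbt):
    """Concatenate RGB, RGB-D, and RGB-T samples in a fixed order.

    rgb:  (n1, 3, H, W) color images only
    rgbd: pair of (n2, 3, H, W) color images and (n2, 1, H, W) depth maps
    rgbt: pair of (n3, 3, H, W) color images and (n3, 1, H, W) thermal maps
    Returns a color batch and a second-modality batch, both ordered as
    [RGB | RGB-D | RGB-T] along the batch dimension.
    """
    rgbd_color, depth = rgbd
    rgbt_color, thermal = rgbt

    color = torch.cat([rgb, rgbd_color, rgbt_color], dim=0)
    # Placeholder second modality for RGB-only samples: a grayscale copy
    # of the RGB image (an assumption; the abstract does not specify this).
    rgb_placeholder = rgb.mean(dim=1, keepdim=True)
    aux = torch.cat([rgb_placeholder, depth, thermal], dim=0)
    return color, aux


class TinySharedBackbone(nn.Module):
    """Stand-in for the shared (weight-tied) transformer feature extractor."""

    def __init__(self, dim=64, patch=16):
        super().__init__()
        self.color_embed = nn.Conv2d(3, dim, patch, stride=patch)
        self.aux_embed = nn.Conv2d(1, dim, patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 1)  # per-patch saliency logit

    def forward(self, color, aux):
        tokens = self.color_embed(color) + self.aux_embed(aux)  # fuse modalities
        b, d, h, w = tokens.shape
        tokens = tokens.flatten(2).transpose(1, 2)  # (B, h*w, d)
        feats = self.encoder(tokens)
        logits = self.head(feats).transpose(1, 2).reshape(b, 1, h, w)
        return logits  # coarse saliency map, one logit per patch


if __name__ == "__main__":
    rgb = torch.rand(2, 3, 224, 224)
    rgbd = (torch.rand(2, 3, 224, 224), torch.rand(2, 1, 224, 224))
    rgbt = (torch.rand(2, 3, 224, 224), torch.rand(2, 1, 224, 224))
    color, aux = build_mixed_batch(rgb, rgbd, rgbt)
    print(TinySharedBackbone()(color, aux).shape)  # torch.Size([6, 1, 14, 14])
```

Because every sample in the mixed batch passes through the same weights, the sketch reflects the paper's central claim that one model can serve RGB, RGB-D, and RGB-T inputs without per-modality parameters.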
Related papers
- RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker [4.235252053339947]
This paper introduces a new challenging RGB-Sonar (RGB-S) tracking task.
It investigates how to achieve efficient tracking of an underwater target through the interaction of RGB and sonar modalities.
arXiv Detail & Related papers (2024-06-11T12:01:11Z)
- DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation [76.81628995237058]
DFormer is a novel framework to learn transferable representations for RGB-D segmentation tasks.
It pretrains the backbone using image-depth pairs from ImageNet-1K.
DFormer achieves new state-of-the-art performance on two popular RGB-D tasks.
arXiv Detail & Related papers (2023-09-18T11:09:11Z)
- RGBD1K: A Large-scale Dataset and Benchmark for RGB-D Object Tracking [30.448658049744775]
Given a limited amount of annotated RGB-D tracking data, most state-of-the-art RGB-D trackers are simple extensions of high-performance RGB-only trackers.
To address the dataset deficiency issue, a new RGB-D dataset named RGBD1K is released in this paper.
arXiv Detail & Related papers (2022-08-21T03:07:36Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection [104.50425501764806]
We introduce a large-scale dataset to enable versatile applications for light field saliency detection.
We present an asymmetrical two-stream model consisting of the Focal stream and RGB stream.
Experiments demonstrate that our Focal stream achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-30T11:53:27Z)
- Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection [73.31632581915201]
We propose a novel data-level recombination strategy to fuse RGB with D (depth) before deep feature extraction.
A newly designed lightweight triple-stream network is applied to these newly formulated data to achieve optimal channel-wise complementary fusion between RGB and D.
arXiv Detail & Related papers (2020-08-07T10:13:05Z)
- Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt at realizing a unified depth-aware framework with only RGB information as input for inference.
It not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also surpasses RGB-D-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z)