DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency
Detection
- URL: http://arxiv.org/abs/2012.15124v1
- Date: Wed, 30 Dec 2020 11:53:27 GMT
- Title: DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency
Detection
- Authors: Yongri Piao and Zhengkun Rong and Shuang Xu and Miao Zhang and Huchuan
Lu
- Abstract summary: We introduce a large-scale dataset to enable versatile applications for light field saliency detection.
We present an asymmetrical two-stream model consisting of the Focal stream and RGB stream.
Experiments demonstrate that our Focal stream achieves state-of-the-art performance.
- Score: 104.50425501764806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Light field data exhibit favorable characteristics conducive to saliency
detection. The success of learning-based light field saliency detection is
heavily dependent on how a comprehensive dataset can be constructed for higher
generalizability of models, how high dimensional light field data can be
effectively exploited, and how a flexible model can be designed to achieve
versatility for desktop computers and mobile devices. To answer these
questions, first we introduce a large-scale dataset to enable versatile
applications for RGB, RGB-D and light field saliency detection, containing 102
classes and 4204 samples. Second, we present an asymmetrical two-stream model
consisting of the Focal stream and RGB stream. The Focal stream is designed to
achieve higher performance on desktop computers and transfer focusness
knowledge to the RGB stream, relying on two tailor-made modules. The RGB stream
guarantees the flexibility and memory/computation efficiency on mobile devices
through three distillation schemes. Experiments demonstrate that our Focal
stream achieves state-of-the-art performance. The RGB stream achieves the Top-2
F-measure on DUTLF-V2 while reducing model size by 83% and boosting FPS by 5
times compared with the best-performing method. Furthermore,
our proposed distillation schemes are applicable to RGB saliency models,
achieving impressive performance gains while ensuring flexibility.
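The paper's code is not reproduced on this page, but the abstract's core idea can be illustrated with a minimal sketch: a heavier Focal stream acts as a teacher over the focal stack, a lightweight RGB stream acts as the student, and a distillation term transfers focusness knowledge from teacher to student. All class names, layer sizes, and the single L2 distillation loss below are illustrative assumptions; the paper's two tailor-made focal modules and three distillation schemes are not reproduced here.

```python
# Minimal sketch (not the authors' code) of an asymmetric two-stream setup
# with feature distillation from a focal-stack teacher to an RGB student.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocalTeacher(nn.Module):
    """Hypothetical heavy branch: encodes a focal stack of N slices."""
    def __init__(self, n_slices: int = 12, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3 * n_slices, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(feat_dim, 1, 1)  # saliency logits

    def forward(self, focal_stack):
        feat = self.encoder(focal_stack)
        return feat, self.head(feat)


class RGBStudent(nn.Module):
    """Hypothetical lightweight branch: encodes a single RGB image."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(feat_dim, 1, 1)

    def forward(self, rgb):
        feat = self.encoder(rgb)
        return feat, self.head(feat)


def distillation_step(teacher, student, focal_stack, rgb, gt, alpha=0.5):
    """One training step: supervised BCE on the student plus an L2 feature
    distillation term pulling student features toward the frozen focal-stream
    features (a stand-in for the paper's three distillation schemes)."""
    with torch.no_grad():
        t_feat, _ = teacher(focal_stack)
    s_feat, s_logits = student(rgb)
    sup_loss = F.binary_cross_entropy_with_logits(s_logits, gt)
    kd_loss = F.mse_loss(s_feat, t_feat)
    return sup_loss + alpha * kd_loss


if __name__ == "__main__":
    teacher, student = FocalTeacher(), RGBStudent()
    focal = torch.randn(2, 3 * 12, 256, 256)   # batch of focal stacks
    rgb = torch.randn(2, 3, 256, 256)          # paired RGB images
    gt = torch.rand(2, 1, 128, 128)            # downsampled saliency masks
    print(distillation_step(teacher, student, focal, rgb, gt).item())
```

At inference time only the student would be deployed, which is the reason such a scheme can cut model size and raise FPS on mobile devices while still benefiting from focal-stack supervision during training.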
Related papers
- TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking [30.89375068036783]
Existing approaches perform event feature extraction for RGB-E tracking using traditional appearance models.
We propose an Event backbone (Pooler) to obtain a high-quality feature representation that is cognisant of the intrinsic characteristics of the event data.
Our method significantly outperforms state-of-the-art trackers on two widely used RGB-E tracking datasets.
arXiv Detail & Related papers (2024-05-08T12:19:08Z)
- Salient Object Detection in RGB-D Videos [11.805682025734551]
This paper makes two primary contributions: the dataset and the model.
We construct the RDVS dataset, a new RGB-D VSOD dataset with realistic depth.
We introduce DCTNet+, a three-stream network tailored for RGB-D VSOD.
arXiv Detail & Related papers (2023-10-24T03:18:07Z)
- RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation [43.358140897849616]
In this paper, we incorporate RGB images, Point clouds and Events for joint optical flow and scene flow estimation with our proposed multi-stage multimodal fusion model, RPEFlow.
Experiments on both synthetic and real datasets show that our model outperforms the existing state-of-the-art by a wide margin.
arXiv Detail & Related papers (2023-09-26T17:23:55Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism encourages the model to learn task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Middle-level Fusion for Lightweight RGB-D Salient Object Detection [81.43951906434175]
A novel lightweight RGB-D SOD model is presented in this paper.
With IMFF and L modules incorporated in the middle-level fusion structure, our proposed model has only 3.9M parameters and runs at 33 FPS.
The experimental results on several benchmark datasets verify the effectiveness and superiority of the proposed method over some state-of-the-art methods.
arXiv Detail & Related papers (2021-04-23T11:37:15Z)
- Siamese Network for RGB-D Salient Object Detection and Beyond [113.30063105890041]
A novel framework is proposed to learn from both RGB and depth inputs through a shared network backbone.
Comprehensive experiments using five popular metrics show that the designed framework yields a robust RGB-D saliency detector.
We also link JL-DCF to the RGB-D semantic segmentation field, showing its capability of outperforming several semantic segmentation models.
arXiv Detail & Related papers (2020-08-26T06:01:05Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
- Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z)