DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection
- URL: http://arxiv.org/abs/2012.15124v1
- Date: Wed, 30 Dec 2020 11:53:27 GMT
- Title: DUT-LFSaliency: Versatile Dataset and Light Field-to-RGB Saliency Detection
- Authors: Yongri Piao and Zhengkun Rong and Shuang Xu and Miao Zhang and Huchuan Lu
- Abstract summary: We introduce a large-scale dataset to enable versatile applications for light field saliency detection.
We present an asymmetrical two-stream model consisting of the Focal stream and RGB stream.
Experiments demonstrate that our Focal stream achieves state-of-the-art performance.
- Score: 104.50425501764806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Light field data exhibit favorable characteristics conducive to saliency
detection. The success of learning-based light field saliency detection is
heavily dependent on how a comprehensive dataset can be constructed for higher
generalizability of models, how high dimensional light field data can be
effectively exploited, and how a flexible model can be designed to achieve
versatility for desktop computers and mobile devices. To answer these
questions, first we introduce a large-scale dataset to enable versatile
applications for RGB, RGB-D and light field saliency detection, containing 102
classes and 4204 samples. Second, we present an asymmetrical two-stream model
consisting of the Focal stream and RGB stream. The Focal stream is designed to
achieve higher performance on desktop computers and transfer focusness
knowledge to the RGB stream, relying on two tailor-made modules. The RGB stream
guarantees the flexibility and memory/computation efficiency on mobile devices
through three distillation schemes. Experiments demonstrate that our Focal
stream achieves state-of-the-art performance. The RGB stream achieves Top-2
F-measure on DUTLF-V2 while reducing model size by 83% and increasing FPS
fivefold compared with the best-performing method. Furthermore,
our proposed distillation schemes are applicable to RGB saliency models,
achieving impressive performance gains while ensuring flexibility.
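To make the distillation idea concrete, below is a minimal PyTorch sketch of an asymmetrical two-stream setup: a heavy Focal stream acting as teacher and a light RGB stream acting as student. All module names, layer sizes, and the combined loss are illustrative assumptions; the paper's tailor-made modules and its three distillation schemes are not reproduced here.

```python
# Illustrative sketch of an asymmetrical two-stream saliency model with
# feature distillation. Module names and losses are assumptions, not the
# paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalStream(nn.Module):
    """Heavy teacher: consumes a stack of focal slices (hypothetical layout)."""
    def __init__(self, num_slices=12, width=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(num_slices * 3, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(width, 1, 1)  # per-pixel saliency logits

    def forward(self, focal_stack):
        feat = self.encoder(focal_stack)
        return feat, self.head(feat)

class RGBStream(nn.Module):
    """Light student: consumes a single RGB image."""
    def __init__(self, width=32, teacher_width=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(width, 1, 1)
        # 1x1 projection so student features match the teacher's channel count
        self.proj = nn.Conv2d(width, teacher_width, 1)

    def forward(self, rgb):
        feat = self.encoder(rgb)
        return self.proj(feat), self.head(feat)

def distillation_loss(student_feat, student_logits,
                      teacher_feat, teacher_logits, gt, alpha=0.5):
    """Supervised BCE plus feature/logit imitation of the frozen teacher."""
    sup = F.binary_cross_entropy_with_logits(student_logits, gt)
    feat_kd = F.mse_loss(student_feat, teacher_feat.detach())
    logit_kd = F.mse_loss(student_logits, teacher_logits.detach())
    return sup + alpha * (feat_kd + logit_kd)
```

Under this sketch, the Focal stream is trained on focal stacks first, and the RGB stream is then trained with distillation_loss so that only the lightweight RGB branch needs to ship to mobile devices at deployment time.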
Related papers
- Bringing RGB and IR Together: Hierarchical Multi-Modal Enhancement for Robust Transmission Line Detection [67.02804741856512]
We propose a novel Hierarchical Multi-Modal Enhancement Network (HMMEN) that integrates RGB and IR data for robust and accurate TL detection.
Our method introduces two key components: (1) a Mutual Multi-Modal Enhanced Block (MMEB), which fuses and enhances hierarchical RGB and IR feature maps in a coarse-to-fine manner, and (2) a Feature Alignment Block (FAB) that corrects misalignments between decoder outputs and IR feature maps by leveraging deformable convolutions.
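(A minimal sketch of deformable-convolution feature alignment in the spirit of the FAB is given after this list.)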
arXiv Detail & Related papers (2025-01-25T06:21:06Z)
- Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection [8.607385112274882]
Deep learning has significantly improved salient object detection (SOD) that combines RGB and thermal (RGB-T) images.
Existing deep learning-based RGB-T SOD models suffer from two major limitations.
We propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet) for accurate RGB-T SOD.
arXiv Detail & Related papers (2024-11-27T14:55:16Z)
- TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking [30.89375068036783]
Existing approaches perform event feature extraction for RGB-E tracking using traditional appearance models.
We propose an Event backbone (Pooler) to obtain a high-quality feature representation that is cognisant of the intrinsic characteristics of the event data.
Our method significantly outperforms state-of-the-art trackers on two widely used RGB-E tracking datasets.
arXiv Detail & Related papers (2024-05-08T12:19:08Z)
- Salient Object Detection in RGB-D Videos [11.805682025734551]
This paper makes two primary contributions: the dataset and the model.
We construct the RDVS dataset, a new RGB-D VSOD dataset with realistic depth.
We introduce DCTNet+, a three-stream network tailored for RGB-D VSOD.
arXiv Detail & Related papers (2023-10-24T03:18:07Z)
- Middle-level Fusion for Lightweight RGB-D Salient Object Detection [81.43951906434175]
A novel lightweight RGB-D SOD model is presented in this paper.
With IMFF and L modules incorporated in the middle-level fusion structure, our proposed model has only 3.9M parameters and runs at 33 FPS.
The experimental results on several benchmark datasets verify the effectiveness and superiority of the proposed method over some state-of-the-art methods.
arXiv Detail & Related papers (2021-04-23T11:37:15Z)
- Siamese Network for RGB-D Salient Object Detection and Beyond [113.30063105890041]
A novel framework is proposed to learn from both RGB and depth inputs through a shared network backbone.
Comprehensive experiments using five popular metrics show that the designed framework yields a robust RGB-D saliency detector.
We also link JL-DCF to the RGB-D semantic segmentation field, showing its capability of outperforming several semantic segmentation models.
arXiv Detail & Related papers (2020-08-26T06:01:05Z)
- A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
We design a single stream network to use the depth map to guide early fusion and middle fusion between RGB and depth.
This model is 55.5% lighter than the current lightest model and runs at a real-time speed of 32 FPS when processing a $384 \times 384$ image.
arXiv Detail & Related papers (2020-07-14T04:40:14Z)
- Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z)
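As noted in the HMMEN entry above, deformable convolutions can be used to align features from one modality to another. The sketch below is an illustrative assumption of how such a Feature Alignment Block might look, built on torchvision.ops.DeformConv2d; the AlignBlock name and the offset-prediction design are not taken from the paper.

```python
# Illustrative sketch of deformable-convolution feature alignment in the
# spirit of the FAB described above; names and structure are assumptions.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AlignBlock(nn.Module):
    """Warps 'source' features toward 'reference' features by predicting
    per-location sampling offsets from their concatenation."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # 2 offsets (x, y) per kernel tap
        self.offset_pred = nn.Conv2d(2 * channels,
                                     2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        self.deform = DeformConv2d(channels, channels,
                                   kernel_size, padding=pad)

    def forward(self, source, reference):
        offsets = self.offset_pred(torch.cat([source, reference], dim=1))
        return self.deform(source, offsets)

# Usage: align a decoder output to IR features of the same shape.
dec = torch.randn(1, 64, 32, 32)
ir = torch.randn(1, 64, 32, 32)
aligned = AlignBlock(64)(dec, ir)
print(aligned.shape)  # torch.Size([1, 64, 32, 32])
```

Predicting offsets from both feature maps lets the deformable convolution sample the source at spatially shifted locations, which is one common way to compensate for the cross-modal misalignment that the FAB targets.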