RGBT Salient Object Detection: A Large-scale Dataset and Benchmark
- URL: http://arxiv.org/abs/2007.03262v6
- Date: Mon, 23 May 2022 03:38:28 GMT
- Title: RGBT Salient Object Detection: A Large-scale Dataset and Benchmark
- Authors: Zhengzheng Tu, Yan Ma, Zhun Li, Chenglong Li, Jieming Xu, Yongtao Liu
- Abstract summary: Taking advantage of RGB and thermal infrared images becomes a new research direction for detecting salient object in complex scenes.
This work contributes such a RGBT image dataset named VT5000, including 5000 spatially aligned RGBT image pairs with ground truth annotations.
We propose a powerful baseline approach, which extracts multi-level features within each modality and aggregates these features of all modalities with the attention mechanism.
- Score: 12.14043884641457
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Salient object detection in complex scenes and environments is a challenging
research topic. Most works focus on RGB-based salient object detection, which
limits its performance of real-life applications when confronted with adverse
conditions such as dark environments and complex backgrounds. Taking advantage
of RGB and thermal infrared images becomes a new research direction for
detecting salient object in complex scenes recently, as thermal infrared
spectrum imaging provides the complementary information and has been applied to
many computer vision tasks. However, current research for RGBT salient object
detection is limited by the lack of a large-scale dataset and comprehensive
benchmark. This work contributes such a RGBT image dataset named VT5000,
including 5000 spatially aligned RGBT image pairs with ground truth
annotations. VT5000 has 11 challenges collected in different scenes and
environments for exploring the robustness of algorithms. With this dataset, we
propose a powerful baseline approach, which extracts multi-level features
within each modality and aggregates these features of all modalities with the
attention mechanism, for accurate RGBT salient object detection. Extensive
experiments show that the proposed baseline approach outperforms the
state-of-the-art methods on VT5000 dataset and other two public datasets. In
addition, we carry out a comprehensive analysis of different algorithms of RGBT
salient object detection on VT5000 dataset, and then make several valuable
conclusions and provide some potential research directions for RGBT salient
object detection.
Related papers
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z) - Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection [20.12812979315803]
Object detection utilizing both visible (RGB) and thermal infrared (IR) imagery has garnered extensive attention.
Most existing multi-modal object detection methods directly input the RGB and IR images into deep neural networks.
We propose a novel coarse-to-fine perspective to purify and fuse features from both modalities.
arXiv Detail & Related papers (2024-01-19T14:49:42Z) - EANet: Enhanced Attribute-based RGBT Tracker Network [0.0]
We propose a deep learning-based image tracking approach that fuses RGB and thermal images (RGBT)
The proposed model consists of two main components: a feature extractor and a tracker.
The proposed methods are evaluated on the RGBT234 citeLiCLiang 2018 and LasHeR citeLiLasher 2021 datasets.
arXiv Detail & Related papers (2023-07-04T19:34:53Z) - Object Detection in Hyperspectral Image via Unified Spectral-Spatial
Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images.
S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware
Ambidextrous Bin Picking via Physics-based Metaverse Synthesis [72.85526892440251]
We introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis.
The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order and ambidextrous grasp labels for a parallel-jaw and vacuum gripper.
We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into 5 levels of difficulties and an unseen object set to evaluate different object and layout properties.
arXiv Detail & Related papers (2022-08-08T08:15:34Z) - Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV)
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z) - Multi-Scale Iterative Refinement Network for RGB-D Salient Object
Detection [7.062058947498447]
salient visual cues appear in various scales and resolutions of RGB images due to semantic gaps at different feature levels.
Similar salient patterns are available in cross-modal depth images as well as multi-scale versions.
We devise attention based fusion module (ABF) to address on cross-modal correlation.
arXiv Detail & Related papers (2022-01-24T10:33:00Z) - RGB-D Salient Object Detection with Ubiquitous Target Awareness [37.6726410843724]
We make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework.
We propose a Ubiquitous Target Awareness (UTA) network to solve three important challenges in RGB-D SOD task.
Our proposed UTA network is depth-free for inference and runs in real-time with 43 FPS.
arXiv Detail & Related papers (2021-09-08T04:27:29Z) - FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in
High-Resolution Remote Sensing Imagery [21.9319970004788]
We propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery.
All objects in the FAIR1M dataset are annotated with respect to 5 categories and 37 sub-categories by oriented bounding boxes.
arXiv Detail & Related papers (2021-03-09T17:20:15Z) - Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt in realizing an unified depth-aware framework with only RGB information as input for inference.
Not only surpasses the state-of-the-art performances on five public RGB SOD benchmarks, but also surpasses the RGBD-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z) - Drone-based RGB-Infrared Cross-Modality Vehicle Detection via
Uncertainty-Aware Learning [59.19469551774703]
Drone-based vehicle detection aims at finding the vehicle locations and categories in an aerial image.
We construct a large-scale drone-based RGB-Infrared vehicle detection dataset, termed DroneVehicle.
Our DroneVehicle collects 28, 439 RGB-Infrared image pairs, covering urban roads, residential areas, parking lots, and other scenarios from day to night.
arXiv Detail & Related papers (2020-03-05T05:29:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.