LasHeR: A Large-scale High-diversity Benchmark for RGBT Tracking
- URL: http://arxiv.org/abs/2104.13202v1
- Date: Tue, 27 Apr 2021 14:04:23 GMT
- Title: LasHeR: A Large-scale High-diversity Benchmark for RGBT Tracking
- Authors: Chenglong Li, Wanlin Xue, Yaqing Jia, Zhichen Qu, Bin Luo, and Jin
Tang
- Abstract summary: LasHeR consists of 1224 visible and thermal infrared video pairs with more than 730K frame pairs in total.
LasHeR is highly diverse, capturing a broad range of object categories, camera viewpoints, scene complexities, and environmental factors.
We conduct a comprehensive performance evaluation of 12 RGBT tracking algorithms on the LasHeR dataset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: RGBT tracking has received a surge of interest in the computer vision
community, but this research field lacks a large-scale and high-diversity
benchmark dataset, which is essential both for training deep RGBT trackers and
for comprehensively evaluating RGBT tracking methods. To this end, we present a
Large-scale High-diversity benchmark for RGBT tracking (LasHeR) in this work.
LasHeR consists of 1224 visible and thermal infrared video pairs with more than
730K frame pairs in total. Each frame pair is spatially aligned and manually
annotated with a bounding box, making the dataset well and densely annotated.
LasHeR is highly diverse, capturing a broad range of object categories, camera
viewpoints, scene complexities, and environmental factors across seasons,
weather conditions, day and night. We conduct a comprehensive performance
evaluation of 12 RGBT tracking algorithms on the LasHeR dataset and present a
detailed analysis to clarify the remaining room for improvement in RGBT
tracking. In addition, we release the unaligned version of LasHeR to attract
research interest in alignment-free RGBT tracking, which is a more practical
task in real-world applications. The datasets and evaluation protocols are
available at: https://github.com/BUGPLEASEOUT/LasHeR.
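Tracking benchmarks of this kind typically report two standard metrics per sequence: precision (fraction of frames whose center-location error is within a pixel threshold) and success (area under the curve of the per-threshold IoU pass rate). The following is a minimal sketch of these generic metrics, assuming axis-aligned boxes in (x, y, w, h) format; the exact protocol and toolkit API used by LasHeR may differ.

```python
import numpy as np

def iou(box_a, box_b):
    # Boxes are (x, y, w, h); returns intersection-over-union in [0, 1].
    xa2, ya2 = box_a[0] + box_a[2], box_a[1] + box_a[3]
    xb2, yb2 = box_b[0] + box_b[2], box_b[1] + box_b[3]
    iw = max(0.0, min(xa2, xb2) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(ya2, yb2) - max(box_a[1], box_b[1]))
    inter = iw * ih
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def success_auc(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    # Average, over overlap thresholds, of the fraction of frames
    # whose IoU strictly exceeds the threshold (success plot AUC).
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    return float(np.mean([np.mean(overlaps > t) for t in thresholds]))

def precision_at(pred_boxes, gt_boxes, dist_thresh=20.0):
    # Fraction of frames whose center-location error is within
    # dist_thresh pixels (20 px is a common default).
    def center(b):
        return np.array([b[0] + b[2] / 2.0, b[1] + b[3] / 2.0])
    errors = np.array([np.linalg.norm(center(p) - center(g))
                       for p, g in zip(pred_boxes, gt_boxes)])
    return float(np.mean(errors <= dist_thresh))
```

Per-sequence scores are then averaged across the benchmark to rank trackers.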
Related papers
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV).
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z)
- RGBD Object Tracking: An In-depth Review [89.96221353160831]
We firstly review RGBD object trackers from different perspectives, including RGBD fusion, depth usage, and tracking framework.
We benchmark a representative set of RGBD trackers, and give detailed analyses based on their performances.
arXiv Detail & Related papers (2022-03-26T18:53:51Z)
- A Survey for Deep RGBT Tracking [0.0]
Visual object tracking with visible (RGB) and thermal infrared (TIR) electromagnetic waves, abbreviated as RGBT tracking, has recently drawn increasing attention in the tracking community.
Considering the rapid development of deep learning, a survey for the recent deep neural network based RGBT trackers is presented.
arXiv Detail & Related papers (2022-01-23T15:52:26Z)
- Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting [109.32927895352685]
We introduce a large-scale RGBT Crowd Counting (RGBT-CC) benchmark, which contains 2,030 pairs of RGB-thermal images with 138,389 annotated people.
To facilitate the multimodal crowd counting, we propose a cross-modal collaborative representation learning framework.
Experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting.
arXiv Detail & Related papers (2020-12-08T16:18:29Z)
- Multi-modal Visual Tracking: Review and Experimental Comparison [85.20414397784937]
We summarize the multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking.
We conduct experiments to analyze the effectiveness of trackers on five datasets.
arXiv Detail & Related papers (2020-12-08T02:39:38Z)
- LSOTB-TIR: A Large-Scale High-Diversity Thermal Infrared Object Tracking Benchmark [51.1506855334948]
This paper presents a Large-Scale and high-diversity general Thermal InfraRed (TIR) Object Tracking Benchmark, called LSOTB-TIR.
We annotate the bounding box of objects in every frame of all sequences and generate over 730K bounding boxes in total.
We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.
arXiv Detail & Related papers (2020-08-03T12:36:06Z)
- RGBT Salient Object Detection: A Large-scale Dataset and Benchmark [12.14043884641457]
Taking advantage of RGB and thermal infrared images has become a new research direction for detecting salient objects in complex scenes.
This work contributes such a RGBT image dataset named VT5000, including 5000 spatially aligned RGBT image pairs with ground truth annotations.
We propose a powerful baseline approach, which extracts multi-level features within each modality and aggregates these features of all modalities with the attention mechanism.
arXiv Detail & Related papers (2020-07-07T07:58:14Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.