Multi-modal Visual Tracking: Review and Experimental Comparison
- URL: http://arxiv.org/abs/2012.04176v1
- Date: Tue, 8 Dec 2020 02:39:38 GMT
- Title: Multi-modal Visual Tracking: Review and Experimental Comparison
- Authors: Pengyu Zhang and Dong Wang and Huchuan Lu
- Abstract summary: We summarize the multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking.
We conduct experiments to analyze the effectiveness of trackers on five datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual object tracking, as a fundamental task in computer vision, has drawn
much attention in recent years. To extend trackers to a wider range of
applications, researchers have introduced information from multiple modalities
to handle specific scenes, which is a promising research prospect with emerging
methods and benchmarks. To provide a thorough review of multi-modal tracking,
we first summarize the multi-modal tracking algorithms, especially visible-depth
(RGB-D) tracking and visible-thermal (RGB-T) tracking, in a unified taxonomy
from different aspects. Second, we provide a detailed description of the
related benchmarks and challenges. Furthermore, we conduct extensive
experiments to analyze the effectiveness of trackers on five datasets: PTB,
VOT19-RGBD, GTOT, RGBT234, and VOT19-RGBT. Finally, we discuss various future
directions from different perspectives, including model design and dataset
construction for further research.
Related papers
- DIVOTrack: A Novel Dataset and Baseline Method for Cross-View
Multi-Object Tracking in DIVerse Open Scenes [74.64897845999677]
We introduce a new cross-view multi-object tracking dataset for DIVerse Open scenes with densely tracked pedestrians.
Our DIVOTrack has fifteen distinct scenarios and 953 cross-view tracks, surpassing all cross-view multi-object tracking datasets currently available.
Furthermore, we provide a novel baseline cross-view tracking method with a unified joint detection and cross-view tracking framework named CrossMOT.
arXiv Detail & Related papers (2023-02-15T14:10:42Z) - Prompting for Multi-Modal Tracking [70.0522146292258]
We propose a novel multi-modal prompt tracker (ProTrack) for multi-modal tracking.
ProTrack can transfer the multi-modal inputs to a single modality by the prompt paradigm.
Our ProTrack can achieve high-performance multi-modal tracking by only altering the inputs, even without any extra training on multi-modal data.
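The idea of transferring multi-modal inputs to a single modality can be illustrated with a minimal sketch. The function below is hypothetical (not from the ProTrack paper): it blends an auxiliary thermal map into an RGB frame as a visual prompt, so that an unmodified RGB tracker can consume the result without retraining.

```python
import numpy as np

def prompt_fuse(rgb: np.ndarray, thermal: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a single-channel thermal map into an RGB frame so a
    plain RGB tracker can consume the result unchanged.
    `alpha` weights the thermal prompt; names are illustrative only."""
    # Normalize the thermal map to [0, 1] and broadcast it to three channels.
    t = thermal.astype(np.float64)
    t = (t - t.min()) / (t.max() - t.min() + 1e-8)
    t3 = np.repeat(t[..., None], 3, axis=-1)
    # A convex combination keeps the output in the valid RGB range.
    fused = (1.0 - alpha) * (rgb.astype(np.float64) / 255.0) + alpha * t3
    return (fused * 255.0).astype(np.uint8)

# Example: fuse a 4x4 RGB frame with a matching thermal map.
frame = prompt_fuse(np.zeros((4, 4, 3), dtype=np.uint8),
                    np.arange(16, dtype=np.float32).reshape(4, 4))
```

The appeal of this prompt-style design, as the summary notes, is that only the inputs change; the downstream tracker and its weights stay fixed.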
arXiv Detail & Related papers (2022-07-29T09:35:02Z) - Single Object Tracking Research: A Survey [44.24280758718638]
This paper presents the rationale and works of the two most popular tracking frameworks of the past ten years.
We present deep learning based tracking methods categorized by their network structures.
We also introduce classical strategies for handling the challenges in the tracking problem.
arXiv Detail & Related papers (2022-04-25T02:59:15Z) - Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV).
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data at various levels.
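Fusing RGB-T data "at various levels" means combining the two modalities at different stages of the pipeline. The sketch below shows three commonly distinguished fusion levels (image, feature, and response); it is a schematic illustration under assumed array shapes, not the actual HMFT implementation.

```python
import numpy as np

def pixel_fusion(rgb: np.ndarray, thermal: np.ndarray) -> np.ndarray:
    """Image-level fusion: stack the modalities into a 4-channel input
    (H, W, 3) + (H, W) -> (H, W, 4)."""
    return np.concatenate([rgb, thermal[..., None]], axis=-1)

def feature_fusion(f_rgb: np.ndarray, f_thermal: np.ndarray) -> np.ndarray:
    """Feature-level fusion: channel-wise concatenation of backbone
    feature maps, (C, H, W) + (C, H, W) -> (2C, H, W)."""
    return np.concatenate([f_rgb, f_thermal], axis=0)

def decision_fusion(score_rgb: np.ndarray, score_thermal: np.ndarray,
                    w: float = 0.5) -> np.ndarray:
    """Response-level fusion: weighted sum of per-modality score maps."""
    return w * score_rgb + (1.0 - w) * score_thermal
```

A hierarchical tracker in this spirit would apply more than one of these stages and let the model weight their contributions, rather than committing to a single fusion point.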
arXiv Detail & Related papers (2022-04-08T15:22:33Z) - Visual Object Tracking with Discriminative Filters and Siamese Networks:
A Survey and Outlook [97.27199633649991]
Discriminative Correlation Filters (DCFs) and deep Siamese Networks (SNs) have emerged as dominating tracking paradigms.
This survey presents a systematic and thorough review of more than 90 DCFs and Siamese trackers, based on results in nine tracking benchmarks.
arXiv Detail & Related papers (2021-12-06T07:57:10Z) - Revisiting the details when evaluating a visual tracker [0.0]
This report focuses on single object tracking and revisits the details of tracker evaluation based on the widely used OTB benchmark.
Experimental results suggest that there may not be an absolute winner among tracking algorithms.
arXiv Detail & Related papers (2021-01-25T13:43:27Z) - Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous
Driving [22.693895321632507]
We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules.
We show that our method outperforms current state-of-the-art on the NuScenes Tracking dataset.
arXiv Detail & Related papers (2020-12-26T15:00:54Z) - TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and to name them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.