MSITrack: A Challenging Benchmark for Multispectral Single Object Tracking
- URL: http://arxiv.org/abs/2510.06619v1
- Date: Wed, 08 Oct 2025 03:56:36 GMT
- Title: MSITrack: A Challenging Benchmark for Multispectral Single Object Tracking
- Authors: Tao Feng, Tingfa Xu, Haolin Qin, Tianhao Li, Shuaihao Han, Xuyang Zou, Zhan Lv, Jianan Li,
- Abstract summary: MSITrack is the largest and most diverse multispectral single object tracking dataset to date.<n>It includes 55 object categories and 300 distinct natural scenes.<n> MSITrack significantly improves performance over RGB-only baselines.
- Score: 28.794206918661818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual object tracking in real-world scenarios presents numerous challenges including occlusion, interference from similar objects and complex backgrounds-all of which limit the effectiveness of RGB-based trackers. Multispectral imagery, which captures pixel-level spectral reflectance, enhances target discriminability. However, the availability of multispectral tracking datasets remains limited. To bridge this gap, we introduce MSITrack, the largest and most diverse multispectral single object tracking dataset to date. MSITrack offers the following key features: (i) More Challenging Attributes-including interference from similar objects and similarity in color and texture between targets and backgrounds in natural scenarios, along with a wide range of real-world tracking challenges; (ii) Richer and More Natural Scenes-spanning 55 object categories and 300 distinct natural scenes, MSITrack far exceeds the scope of existing benchmarks. Many of these scenes and categories are introduced to the multispectral tracking domain for the first time; (iii) Larger Scale-300 videos comprising over 129k frames of multispectral imagery. To ensure annotation precision, each frame has undergone meticulous processing, manual labeling and multi-stage verification. Extensive evaluations using representative trackers demonstrate that the multispectral data in MSITrack significantly improves performance over RGB-only baselines, highlighting its potential to drive future advancements in the field. The MSITrack dataset is publicly available at: https://github.com/Fengtao191/MSITrack.
Related papers
- MMOT: The First Challenging Benchmark for Drone-based Multispectral Multi-Object Tracking [30.3437683353074]
MMOT is the first benchmark for drone-based multispectral multi-object tracking.<n>It features 125 video sequences with over 488.8K annotations across eight categories.<n>To better extract spectral features and leverage oriented annotations, we present a multispectral and orientation-aware MOT scheme.
arXiv Detail & Related papers (2025-10-14T14:25:17Z) - MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking [17.96400810834486]
We introduce the first large-scale Multispectral UAV Single Object Tracking dataset (MUST)<n>MUST includes 250 video sequences spanning diverse environments and challenges.<n>We also propose a novel tracking framework, UNTrack, which encodes unified spectral, spatial, and temporal features from spectrum prompts.
arXiv Detail & Related papers (2025-03-22T08:47:28Z) - Heterogeneous Graph Transformer for Multiple Tiny Object Tracking in RGB-T Videos [31.910202172609313]
Existing multi-object tracking algorithms generally focus on single-modality scenes.<n>We propose a novel framework called HGT-Track (Heterogeneous Graph Transformer based Multi-Tiny-Object Tracking)<n>This paper introduces the first benchmark VT-Tiny-MOT (Visible-Thermal Tiny Multi-Object Tracking) for RGB-T fused multiple tiny object tracking.
arXiv Detail & Related papers (2024-12-14T15:17:49Z) - MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving [10.399817864597347]
This paper introduces MCTrack, a new 3D multi-object tracking method that achieves state-of-the-art (SOTA) performance across KITTI, nuScenes, and datasets.
arXiv Detail & Related papers (2024-09-23T11:26:01Z) - MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark [63.878793340338035]
Multi-target multi-camera tracking is a crucial task that involves identifying and tracking individuals over time using video streams from multiple cameras.
Existing datasets for this task are either synthetically generated or artificially constructed within a controlled camera network setting.
We present MTMMC, a real-world, large-scale dataset that includes long video sequences captured by 16 multi-modal cameras in two different environments.
arXiv Detail & Related papers (2024-03-29T15:08:37Z) - Long-Term Visual Object Tracking with Event Cameras: An Associative Memory Augmented Tracker and A Benchmark Dataset [9.366068518600583]
We propose a new long-term, large-scale frame-event visual object tracking dataset, termed FELT.<n>We also propose a novel Associative Memory Transformer based RGB-Event long-term visual tracker, termed AMTTrack.
arXiv Detail & Related papers (2024-03-09T08:49:50Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparsetemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - DIVOTrack: A Novel Dataset and Baseline Method for Cross-View
Multi-Object Tracking in DIVerse Open Scenes [74.64897845999677]
We introduce a new cross-view multi-object tracking dataset for DIVerse Open scenes with dense tracking pedestrians.
Our DIVOTrack has fifteen distinct scenarios and 953 cross-view tracks, surpassing all cross-view multi-object tracking datasets currently available.
Furthermore, we provide a novel baseline cross-view tracking method with a unified joint detection and cross-view tracking framework named CrossMOT.
arXiv Detail & Related papers (2023-02-15T14:10:42Z) - Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV)
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z) - Multi-modal Visual Tracking: Review and Experimental Comparison [85.20414397784937]
We summarize the multi-modal tracking algorithms, especially visible-depth (RGB-D) tracking and visible-thermal (RGB-T) tracking.
We conduct experiments to analyze the effectiveness of trackers on five datasets.
arXiv Detail & Related papers (2020-12-08T02:39:38Z) - TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
Tracking Any Object dataset consists of 2,907 high resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.