UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking
- URL: http://arxiv.org/abs/2512.02668v1
- Date: Tue, 02 Dec 2025 11:47:13 GMT
- Title: UAUTrack: Towards Unified Multimodal Anti-UAV Visual Tracking
- Authors: Qionglin Ren, Dawei Zhang, Chunxu Tian, Dan Zhang,
- Abstract summary: UAUTrack is a unified single-target tracking framework built upon a single-stream, single-stage, end-to-end architecture. Results show that UAUTrack achieves state-of-the-art performance on the Anti-UAV and DUT Anti-UAV datasets.
- Score: 4.161228439649909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Research in Anti-UAV (Unmanned Aerial Vehicle) tracking has explored various modalities, including RGB, TIR, and RGB-T fusion. However, a unified framework for cross-modal collaboration is still lacking. Existing approaches have primarily focused on independent models for individual tasks, often overlooking the potential for cross-modal information sharing. Furthermore, Anti-UAV tracking techniques are still in their infancy, with current solutions struggling to achieve effective multimodal data fusion. To address these challenges, we propose UAUTrack, a unified single-target tracking framework built upon a single-stream, single-stage, end-to-end architecture that effectively integrates multiple modalities. UAUTrack introduces a key component: a text prior prompt strategy that directs the model to focus on UAVs across various scenarios. Experimental results show that UAUTrack achieves state-of-the-art performance on the Anti-UAV and DUT Anti-UAV datasets, and maintains a favourable trade-off between accuracy and speed on the Anti-UAV410 dataset, demonstrating both high accuracy and practical efficiency across diverse Anti-UAV scenarios.
Related papers
- How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline [74.4054700050366]
Unmanned Aerial Vehicles (UAVs) offer wide-ranging applications but also pose significant safety and privacy violation risks. Current Anti-UAV research primarily focuses on RGB, infrared (IR), or RGB-IR videos captured by fixed ground cameras. We propose a new multi-modal visual tracking task termed UAV-Anti-UAV, which involves a pursuer UAV tracking a target adversarial UAV in the video stream.
arXiv Detail & Related papers (2025-12-08T10:19:54Z)
- AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios [64.51320327698231]
We introduce AerialMind, the first large-scale RMOT benchmark in UAV scenarios. We develop an innovative semi-automated collaborative agent-based labeling assistant framework. We also propose HawkEyeTrack, a novel method that collaboratively enhances vision-language representation learning.
arXiv Detail & Related papers (2025-11-26T04:44:27Z)
- A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles [74.8162337823142]
MM-UAV is the first large-scale benchmark for Multi-Modal UAV Tracking. The dataset spans over 30 challenging scenarios, with 1,321 synchronised multi-modal sequences, and more than 2.8 million annotated frames. Accompanying the dataset, we provide a novel multi-modal multi-UAV tracking framework.
arXiv Detail & Related papers (2025-11-23T08:42:17Z)
- Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (MOT) aims to track multiple objects while maintaining consistent identities across frames of a given video. Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance. We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z)
- Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID [0.03464344220266879]
Multi-UAV tracking in thermal infrared video is challenging due to low contrast, environmental noise, and small target sizes. We present a tracking framework built on YOLOv12 and BoT-SORT, enhanced with tailored training and inference strategies. We provide implementation details, in-depth experimental analysis, and a discussion of potential improvements.
arXiv Detail & Related papers (2025-03-21T15:40:18Z)
- SFTrack: A Robust Scale and Motion Adaptive Algorithm for Tracking Small and Fast Moving Objects [2.9803250365852443]
This paper addresses the problem of multi-object tracking in Unmanned Aerial Vehicle (UAV) footage.
It plays a critical role in various UAV applications, including traffic monitoring systems and real-time suspect tracking by the police.
We propose a new tracking strategy, which initiates the tracking of target objects from low-confidence detections.
arXiv Detail & Related papers (2024-10-26T05:09:20Z)
- UAVDB: Point-Guided Masks for UAV Detection and Segmentation [0.03464344220266879]
We present UAVDB, a new benchmark dataset for UAV detection and segmentation. It is built upon a point-guided weak supervision pipeline. UAVDB captures UAVs at diverse scales, from visible objects to near-single-pixel instances.
arXiv Detail & Related papers (2024-09-09T13:27:53Z)
- Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System [56.51247807483176]
Unmanned Aerial Vehicles (UAVs) have been widely used in many areas, including transportation, surveillance, and military.
Previous works have simplified such an anti-UAV task as a tracking problem, where prior information of UAVs is always provided.
In this paper, we first formulate a new and practical anti-UAV problem featuring UAV perception in complex scenes without prior UAV information.
arXiv Detail & Related papers (2023-06-27T19:30:23Z)
- Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking [59.06167734555191]
Unmanned Aerial Vehicles (UAVs) offer many applications in both commerce and recreation.
We consider the task of tracking UAVs, providing rich information such as location and trajectory.
We propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k manually annotated bounding boxes.
arXiv Detail & Related papers (2021-01-21T07:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.