How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
- URL: http://arxiv.org/abs/2512.07385v1
- Date: Mon, 08 Dec 2025 10:19:54 GMT
- Title: How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
- Authors: Chunhui Zhang, Li Liu, Zhipeng Zhang, Yong Wang, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang
- Abstract summary: Unmanned Aerial Vehicles (UAVs) offer wide-ranging applications but also pose significant safety and privacy violation risks. Current Anti-UAV research primarily focuses on RGB, infrared (IR), or RGB-IR videos captured by fixed ground cameras. We propose a new multi-modal visual tracking task termed UAV-Anti-UAV, which involves a pursuer UAV tracking a target adversarial UAV in the video stream.
- Score: 74.4054700050366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unmanned Aerial Vehicles (UAVs) offer wide-ranging applications but also pose significant safety and privacy violation risks in areas like airport and infrastructure inspection, spurring the rapid development of Anti-UAV technologies in recent years. However, current Anti-UAV research primarily focuses on RGB, infrared (IR), or RGB-IR videos captured by fixed ground cameras, with little attention to tracking target UAVs from another moving UAV platform. To fill this gap, we propose a new multi-modal visual tracking task termed UAV-Anti-UAV, which involves a pursuer UAV tracking a target adversarial UAV in the video stream. Compared to existing Anti-UAV tasks, UAV-Anti-UAV is more challenging due to severe dual-dynamic disturbances caused by the rapid motion of both the capturing platform and the target. To advance research in this domain, we construct a million-scale dataset consisting of 1,810 videos, each manually annotated with bounding boxes, a language prompt, and 15 tracking attributes. Furthermore, we propose MambaSTS, a Mamba-based baseline method for UAV-Anti-UAV tracking, which enables integrated spatial-temporal-semantic learning. Specifically, we employ Mamba and Transformer models to learn global semantic and spatial features, respectively, and leverage the state space model's strength in long-sequence modeling to establish video-level long-term context via a temporal token propagation mechanism. We conduct experiments on the UAV-Anti-UAV dataset to validate the effectiveness of our method. A thorough experimental evaluation of 50 modern deep tracking algorithms demonstrates that there is still significant room for improvement in the UAV-Anti-UAV domain. The dataset and codes will be available at https://github.com/983632847/Awesome-Multimodal-Object-Tracking.
Related papers
- CST Anti-UAV: A Thermal Infrared Benchmark for Tiny UAV Tracking in Complex Scenes [35.983551600618476]
We present a new thermal infrared dataset specifically designed for Single Object Tracking (SOT) in Complex Scenes with Tiny UAVs (CST). It contains 220 video sequences with over 240k high-quality bounding box annotations, highlighting two key properties: a significant number of tiny-sized UAV targets and the diverse and complex scenes. CST Anti-UAV is the first dataset to incorporate complete manual frame-level attribute annotations, enabling precise evaluations under varied challenges.
arXiv Detail & Related papers (2025-07-31T11:53:21Z)
- UAVDB: Point-Guided Masks for UAV Detection and Segmentation [0.03464344220266879]
We present UAVDB, a new benchmark dataset for UAV detection and segmentation. It is built upon a point-guided weak supervision pipeline. UAVDB captures UAVs at diverse scales, from visible objects to near-single-pixel instances.
arXiv Detail & Related papers (2024-09-09T13:27:53Z)
- Tiny Multi-Agent DRL for Twins Migration in UAV Metaverses: A Multi-Leader Multi-Follower Stackelberg Game Approach [57.15309977293297]
The synergy between Unmanned Aerial Vehicles (UAVs) and metaverses is giving rise to an emerging paradigm named UAV metaverses.
We propose a tiny machine learning-based Stackelberg game framework based on pruning techniques for efficient UT migration in UAV metaverses.
arXiv Detail & Related papers (2024-01-18T02:14:13Z)
- Evidential Detection and Tracking Collaboration: New Problem, Benchmark and Algorithm for Robust Anti-UAV System [56.51247807483176]
Unmanned Aerial Vehicles (UAVs) have been widely used in many areas, including transportation, surveillance, and military.
Previous works have simplified such an anti-UAV task as a tracking problem, where prior information of UAVs is always provided.
In this paper, we first formulate a new and practical anti-UAV problem featuring UAV perception in complex scenes without prior UAV information.
arXiv Detail & Related papers (2023-06-27T19:30:23Z)
- Learning to Compress Unmanned Aerial Vehicle (UAV) Captured Video: Benchmark and Analysis [54.07535860237662]
We propose a novel task for learned UAV video coding and construct a comprehensive and systematic benchmark for such a task.
It is expected that the benchmark will accelerate the research and development in video coding on drone platforms.
arXiv Detail & Related papers (2023-01-15T15:18:02Z)
- Vision-based Anti-UAV Detection and Tracking [18.307952561941942]
Unmanned aerial vehicles (UAVs) have been widely used in various fields, and their invasion of security and privacy has aroused social concern.
We propose a visible light mode dataset called Dalian University of Technology Anti-UAV dataset, DUT Anti-UAV.
It contains a detection dataset with a total of 10,000 images and a tracking dataset with 20 videos that include short-term and long-term sequences.
arXiv Detail & Related papers (2022-05-22T15:21:45Z)
- Attention-based Reinforcement Learning for Real-Time UAV Semantic Communication [53.46235596543596]
We study the problem of air-to-ground ultra-reliable and low-latency communication (URLLC) for a moving ground user.
We propose a novel multi-agent deep reinforcement learning framework, coined a graph attention exchange network (GAXNet).
GAXNet achieves 6.5x lower latency with the target 0.0000001 error rate, compared to a state-of-the-art baseline framework.
arXiv Detail & Related papers (2021-05-22T12:43:25Z)
- UAV-ReID: A Benchmark on Unmanned Aerial Vehicle Re-identification [21.48667873335246]
Recent development in deep learning allows vision-based counter-UAV systems to detect and track UAVs with a single camera.
The coverage of a single camera is limited, necessitating multi-camera configurations to match UAVs across cameras.
We propose the first new UAV re-identification data set, UAV-reID, that facilitates the development of machine learning solutions in this emerging area.
arXiv Detail & Related papers (2021-04-13T14:13:09Z)
- Anti-UAV: A Large Multi-Modal Benchmark for UAV Tracking [59.06167734555191]
Unmanned Aerial Vehicles (UAVs) offer many applications in both commerce and recreation.
We consider the task of tracking UAVs, providing rich information such as location and trajectory.
We propose a dataset, Anti-UAV, with more than 300 video pairs containing over 580k manually annotated bounding boxes.
arXiv Detail & Related papers (2021-01-21T07:00:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.