SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
- URL: http://arxiv.org/abs/2512.04883v1
- Date: Thu, 04 Dec 2025 15:11:43 GMT
- Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
- Authors: Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu
- Abstract summary: Real-time tracking of small unmanned aerial vehicles (UAVs) on edge devices faces a fundamental resolution-speed conflict. We propose a Sparse Detection-Guided Tracker that adopts an Observer-Follower architecture to reconcile this conflict. Experiments on a ground-to-air tracking station demonstrate that SDG-Track achieves 35.1 FPS system throughput while retaining 97.2% of the frame-by-frame detection precision.
- Score: 11.029096488950414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time tracking of small unmanned aerial vehicles (UAVs) on edge devices faces a fundamental resolution-speed conflict. Downsampling high-resolution imagery to standard detector input sizes causes small target features to collapse below detectable thresholds. Yet processing native 1080p frames on resource-constrained platforms yields insufficient throughput for smooth gimbal control. We propose SDG-Track, a Sparse Detection-Guided Tracker that adopts an Observer-Follower architecture to reconcile this conflict. The Observer stream runs a high-capacity detector at low frequency on the GPU to provide accurate position anchors from 1920x1080 frames. The Follower stream performs high-frequency trajectory interpolation via ROI-constrained sparse optical flow on the CPU. To handle tracking failures from occlusion or model drift caused by spectrally similar distractors, we introduce Dual-Space Recovery, a training-free re-acquisition mechanism combining color histogram matching with geometric consistency constraints. Experiments on a ground-to-air tracking station demonstrate that SDG-Track achieves 35.1 FPS system throughput while retaining 97.2% of the frame-by-frame detection precision. The system successfully tracks agile FPV drones under real-world operational conditions on an NVIDIA Jetson Orin Nano. Our code is publicly available at https://github.com/Jeffry-wen/SDG-Track
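The Observer-Follower split described in the abstract can be illustrated as a simple scheduling loop. The sketch below is a toy illustration, not the paper's implementation: the hypothetical `track` function uses a stub Observer that reads ground truth in place of the GPU detector, and a constant-velocity extrapolator standing in for the ROI-constrained sparse-optical-flow Follower.

```python
OBSERVER_PERIOD = 6  # the Observer (slow, accurate detector) fires every 6th frame


def track(frames):
    """frames: list of (x, y) target centers, one per frame.
    Returns a per-frame track estimate mixing sparse Observer anchors
    with cheap high-frequency Follower extrapolation."""
    estimates = []
    anchor = None            # last Observer fix (x, y)
    velocity = (0.0, 0.0)    # estimated from consecutive Observer fixes
    for i, truth in enumerate(frames):
        if i % OBSERVER_PERIOD == 0:
            # Observer stream: accurate but low-frequency position anchor.
            fix = truth  # stand-in for running the detector on the full frame
            if anchor is not None:
                n = OBSERVER_PERIOD
                velocity = ((fix[0] - anchor[0]) / n, (fix[1] - anchor[1]) / n)
            anchor, est = fix, fix
        else:
            # Follower stream: cheap high-frequency interpolation
            # (constant velocity here; sparse optical flow in the paper).
            k = i % OBSERVER_PERIOD
            est = (anchor[0] + velocity[0] * k, anchor[1] + velocity[1] * k)
        estimates.append(est)
    return estimates
```

With a linearly moving target, the Follower's extrapolated positions match the detector anchors exactly after the second Observer fix, which is the regime the real system exploits between detections.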
Related papers
- Architecture and evaluation protocol for transformer-based visual object tracking in UAV applications [0.0]
Existing visual trackers either lack robustness in complex scenarios or are too computationally demanding for real-time embedded use. We propose a Modular Asynchronous Tracking Architecture (MATA) that combines a transformer-based tracker with an Extended Kalman Filter. We introduce a hardware-independent, embedded-oriented evaluation protocol and a new metric called Normalized Time to Failure (NT2F) to quantify how long a tracker can sustain a tracking sequence without external help.
arXiv Detail & Related papers (2026-03-04T10:12:12Z) - K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices [8.929138500431433]
Point tracking in video sequences is a key capability for real-world computer vision applications, including robotics, autonomous systems, augmented reality, and video analysis. While recent deep learning-based trackers achieve state-of-the-art accuracy on challenging benchmarks, their reliance on per-frame inference poses a major barrier to deployment on resource-constrained edge devices. We introduce K-Track, a general-purpose, tracker-agnostic acceleration framework designed to bridge this deployment gap.
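The Kalman-enhanced idea behind K-Track, predicting cheaply between expensive deep inferences, can be illustrated with a minimal constant-velocity Kalman filter. This is a hedged sketch under assumed linear dynamics; the actual coupling of K-Track to a deep point tracker is not reproduced, and the names (`make_cv_kalman`, `step`) are illustrative only.

```python
import numpy as np

def make_cv_kalman(dt=1.0, q=1e-2, r=1.0):
    """Build matrices for a 1-D constant-velocity Kalman filter."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition over [pos, vel]
    H = np.array([[1.0, 0.0]])             # only position is measured
    Q = q * np.eye(2)                      # process noise covariance
    R = np.array([[r]])                    # measurement noise covariance
    return F, H, Q, R

def step(x, P, z, F, H, Q, R):
    """One predict step, plus an update when a measurement z is available.
    Passing z=None skips the update, mimicking frames where the
    expensive tracker is not run."""
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    if z is not None:
        # Update: fold the measurement in via the Kalman gain.
        y = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    return x, P
```

Fed noise-free position measurements of a point moving at unit velocity, the filter's state converges to the true position and velocity, which is what makes prediction-only frames cheap and accurate between deep inferences.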
arXiv Detail & Related papers (2025-12-11T13:26:58Z) - StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections [0.18054741274903915]
Multi-object tracking (MOT) is one of the most challenging tasks in computer vision. Current approaches mainly focus on tracking objects in each frame of a video stream. We propose StableTrack, a novel approach that stabilizes the quality of tracking on low-frequency detections.
arXiv Detail & Related papers (2025-11-25T15:42:33Z) - CADTrack: Learning Contextual Aggregation with Deformable Alignment for Robust RGBT Tracking [68.71826342377004]
RGB-Thermal (RGBT) tracking aims to exploit visible and thermal infrared modalities for robust all-weather object tracking. Existing RGBT trackers struggle to resolve modality discrepancies, which poses great challenges for robust feature representation. We propose a novel Contextual Aggregation with Deformable Alignment framework called CADTrack for RGBT tracking.
arXiv Detail & Related papers (2025-11-22T08:10:02Z) - SwiTrack: Tri-State Switch for Cross-Modal Object Tracking [74.15663758681849]
Cross-modal object tracking (CMOT) is an emerging task that maintains target consistency while the video stream switches between different modalities. We propose SwiTrack, a novel state-switching framework that redefines CMOT through the deployment of three specialized streams.
arXiv Detail & Related papers (2025-11-20T10:52:54Z) - NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation. Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z) - Multiple Object Tracking in Video SAR: A Benchmark and Tracking Baseline [6.467005601813546]
Video synthetic aperture radar (Video SAR) is used for multi-object tracking. Doppler shifts induced by target motion result in artifacts that are easily mistaken for shadows. A major limitation in this field is the lack of public benchmark datasets for standardized algorithm evaluation.
arXiv Detail & Related papers (2025-06-13T06:12:25Z) - ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer [12.58804521609764]
ODTFormer is a Transformer-based model to address both obstacle detection and tracking problems.
We report comparable accuracy to state-of-the-art obstacle tracking models while requiring only a fraction of their cost.
arXiv Detail & Related papers (2024-03-21T17:59:55Z) - CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras [43.699819213559515]
Existing datasets for RGB-DVS tracking are collected with the DVS346 camera, and their resolution ($346 \times 260$) is too low for practical applications.
We build the first unaligned frame-event dataset CRSOT collected with a specially built data acquisition system.
We propose a novel unaligned object tracking framework that can realize robust tracking even using the loosely aligned RGB-Event data.
arXiv Detail & Related papers (2024-01-05T14:20:22Z) - Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos [57.92385818430939]
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones.
Existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices.
We propose a simple yet effective framework, TransVisDrone, that provides an end-to-end solution with higher computational efficiency.
arXiv Detail & Related papers (2022-10-16T03:05:13Z) - FOVEA: Foveated Image Magnification for Autonomous Navigation [53.69803081925454]
We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
arXiv Detail & Related papers (2021-08-27T03:07:55Z) - DroTrack: High-speed Drone-based Object Tracking Under Uncertainty [0.23204178451683263]
DroTrack is a high-speed visual single-object tracking framework for drone-captured video sequences.
We implement an effective object segmentation based on Fuzzy C-Means.
We also leverage the geometrical angular motion to estimate a reliable object scale.
arXiv Detail & Related papers (2020-05-02T13:16:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.