SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
- URL: http://arxiv.org/abs/2512.04883v1
- Date: Thu, 04 Dec 2025 15:11:43 GMT
- Title: SDG-Track: A Heterogeneous Observer-Follower Framework for High-Resolution UAV Tracking on Embedded Platforms
- Authors: Jiawen Wen, Yu Hu, Suixuan Qiu, Jinshan Huang, Xiaowen Chu
- Abstract summary: Real-time tracking of small unmanned aerial vehicles (UAVs) on edge devices faces a fundamental resolution-speed conflict. We propose a Sparse Detection-Guided Tracker that adopts an Observer-Follower architecture to reconcile this conflict. Experiments on a ground-to-air tracking station demonstrate that SDG-Track achieves 35.1 FPS system throughput while retaining 97.2% of the frame-by-frame detection precision.
- Score: 11.029096488950414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time tracking of small unmanned aerial vehicles (UAVs) on edge devices faces a fundamental resolution-speed conflict. Downsampling high-resolution imagery to standard detector input sizes causes small target features to collapse below detectable thresholds. Yet processing native 1080p frames on resource-constrained platforms yields insufficient throughput for smooth gimbal control. We propose SDG-Track, a Sparse Detection-Guided Tracker that adopts an Observer-Follower architecture to reconcile this conflict. The Observer stream runs a high-capacity detector at low frequency on the GPU to provide accurate position anchors from 1920x1080 frames. The Follower stream performs high-frequency trajectory interpolation via ROI-constrained sparse optical flow on the CPU. To handle tracking failures from occlusion or model drift caused by spectrally similar distractors, we introduce Dual-Space Recovery, a training-free re-acquisition mechanism combining color histogram matching with geometric consistency constraints. Experiments on a ground-to-air tracking station demonstrate that SDG-Track achieves 35.1 FPS system throughput while retaining 97.2% of the frame-by-frame detection precision. The system successfully tracks agile FPV drones under real-world operational conditions on an NVIDIA Jetson Orin Nano. Our code is publicly available at https://github.com/Jeffry-wen/SDG-Track
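The Observer-Follower split described in the abstract can be illustrated as a simple scheduling loop. The sketch below is a toy illustration, not the paper's implementation: the hypothetical `track` function uses a stub Observer that reads ground truth in place of the GPU detector, and a constant-velocity extrapolator standing in for the ROI-constrained sparse-optical-flow Follower.

```python
OBSERVER_PERIOD = 6  # the Observer (slow, accurate detector) fires every 6th frame


def track(frames):
    """frames: list of (x, y) target centers, one per frame.
    Returns a per-frame track estimate mixing sparse Observer anchors
    with cheap high-frequency Follower extrapolation."""
    estimates = []
    anchor = None            # last Observer fix (x, y)
    velocity = (0.0, 0.0)    # estimated from consecutive Observer fixes
    for i, truth in enumerate(frames):
        if i % OBSERVER_PERIOD == 0:
            # Observer stream: accurate but low-frequency position anchor.
            fix = truth  # stand-in for running the detector on the full frame
            if anchor is not None:
                n = OBSERVER_PERIOD
                velocity = ((fix[0] - anchor[0]) / n, (fix[1] - anchor[1]) / n)
            anchor, est = fix, fix
        else:
            # Follower stream: cheap high-frequency interpolation
            # (constant velocity here; sparse optical flow in the paper).
            k = i % OBSERVER_PERIOD
            est = (anchor[0] + velocity[0] * k, anchor[1] + velocity[1] * k)
        estimates.append(est)
    return estimates
```

With a linearly moving target, the Follower's extrapolated positions match the detector anchors exactly after the second Observer fix, which is the regime the real system exploits between detections.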
Related papers
- Architecture and evaluation protocol for transformer-based visual object tracking in UAV applications [0.0]
Existing visual trackers either lack robustness in complex scenarios or are too computationally demanding for real-time embedded use. We propose a Modular Asynchronous Tracking Architecture (MATA) that combines a transformer-based tracker with an Extended Kalman Filter. We introduce a hardware-independent, embedded-oriented evaluation protocol and a new metric called Normalized Time to Failure (NT2F) to quantify how long a tracker can sustain a tracking sequence without external help.
arXiv Detail & Related papers (2026-03-04T10:12:12Z) - K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices [8.929138500431433]
Point tracking in video sequences is a key capability for real-world computer vision applications, including robotics, autonomous systems, augmented reality, and video analysis. While recent deep learning-based trackers achieve state-of-the-art accuracy on challenging benchmarks, their reliance on per-frame inference poses a major barrier to deployment on resource-constrained edge devices. We introduce K-Track, a general-purpose, tracker-agnostic acceleration framework designed to bridge this deployment gap.
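The Kalman-enhanced idea behind K-Track, predicting cheaply between expensive deep inferences, can be illustrated with a minimal constant-velocity Kalman filter. This is a hedged sketch under assumed linear dynamics; the actual coupling of K-Track to a deep point tracker is not reproduced, and the names (`make_cv_kalman`, `step`) are illustrative only.

```python
import numpy as np

def make_cv_kalman(dt=1.0, q=1e-2, r=1.0):
    """Build matrices for a 1-D constant-velocity Kalman filter."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # state transition over [pos, vel]
    H = np.array([[1.0, 0.0]])             # only position is measured
    Q = q * np.eye(2)                      # process noise covariance
    R = np.array([[r]])                    # measurement noise covariance
    return F, H, Q, R

def step(x, P, z, F, H, Q, R):
    """One predict step, plus an update when a measurement z is available.
    Passing z=None skips the update, mimicking frames where the
    expensive tracker is not run."""
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    if z is not None:
        # Update: fold the measurement in via the Kalman gain.
        y = z - H @ x
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
    return x, P
```

Fed noise-free position measurements of a point moving at unit velocity, the filter's state converges to the true position and velocity, which is what makes prediction-only frames cheap and accurate between deep inferences.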
arXiv Detail & Related papers (2025-12-11T13:26:58Z) - StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections [0.18054741274903915]
Multi-object tracking (MOT) is one of the most challenging tasks in computer vision. Current approaches mainly focus on tracking objects in each frame of a video stream. We propose StableTrack, a novel approach that stabilizes the quality of tracking on low-frequency detections.
arXiv Detail & Related papers (2025-11-25T15:42:33Z) - CADTrack: Learning Contextual Aggregation with Deformable Alignment for Robust RGBT Tracking [68.71826342377004]
RGB-Thermal (RGBT) tracking aims to exploit visible and thermal infrared modalities for robust all-weather object tracking. Existing RGBT trackers struggle to resolve modality discrepancies, which poses great challenges for robust feature representation. We propose a novel Contextual Aggregation with Deformable Alignment framework called CADTrack for RGBT tracking.
arXiv Detail & Related papers (2025-11-22T08:10:02Z) - SwiTrack: Tri-State Switch for Cross-Modal Object Tracking [74.15663758681849]
Cross-modal object tracking (CMOT) is an emerging task that maintains target consistency while the video stream switches between different modalities. We propose SwiTrack, a novel state-switching framework that redefines CMOT through the deployment of three specialized streams.
arXiv Detail & Related papers (2025-11-20T10:52:54Z) - NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation. Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame. We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z) - Multiple Object Tracking in Video SAR: A Benchmark and Tracking Baseline [6.467005601813546]
Video synthetic aperture radar (Video SAR) is used for multi-object tracking. Doppler shifts induced by target motion result in artifacts that are easily mistaken for shadows. A major limitation in this field is the lack of public benchmark datasets for standardized algorithm evaluation.
arXiv Detail & Related papers (2025-06-13T06:12:25Z) - ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer [12.58804521609764]
ODTFormer is a Transformer-based model to address both obstacle detection and tracking problems.
We report comparable accuracy to state-of-the-art obstacle tracking models while requiring only a fraction of their cost.
arXiv Detail & Related papers (2024-03-21T17:59:55Z) - CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras [43.699819213559515]
Existing datasets for RGB-DVS tracking are collected with the DVS346 camera, and their resolution ($346 \times 260$) is too low for practical applications.
We build the first unaligned frame-event dataset CRSOT collected with a specially built data acquisition system.
We propose a novel unaligned object tracking framework that can realize robust tracking even using the loosely aligned RGB-Event data.
arXiv Detail & Related papers (2024-01-05T14:20:22Z) - Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects on the encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z) - TransVisDrone: Spatio-Temporal Transformer for Vision-based Drone-to-Drone Detection in Aerial Videos [57.92385818430939]
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone collisions, detecting drone attacks, or coordinating flight with other drones.
Existing methods are computationally costly, follow non-end-to-end optimization, and have complex multi-stage pipelines, making them less suitable for real-time deployment on edge devices.
We propose a simple yet effective framework, TransVisDrone, that provides an end-to-end solution with higher computational efficiency.
arXiv Detail & Related papers (2022-10-16T03:05:13Z) - FOVEA: Foveated Image Magnification for Autonomous Navigation [53.69803081925454]
We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
arXiv Detail & Related papers (2021-08-27T03:07:55Z) - DroTrack: High-speed Drone-based Object Tracking Under Uncertainty [0.23204178451683263]
DroTrack is a high-speed visual single-object tracking framework for drone-captured video sequences.
We implement an effective object segmentation based on Fuzzy C-Means.
We also leverage the geometrical angular motion to estimate a reliable object scale.
arXiv Detail & Related papers (2020-05-02T13:16:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.