K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices
- URL: http://arxiv.org/abs/2512.10628v1
- Date: Thu, 11 Dec 2025 13:26:58 GMT
- Title: K-Track: Kalman-Enhanced Tracking for Accelerating Deep Point Trackers on Edge Devices
- Authors: Bishoy Galoaa, Pau Closas, Sarah Ostadabbas
- Abstract summary: Point tracking in video sequences is a foundational capability for real-world computer vision applications, including robotics, autonomous systems, augmented reality, and video analysis. While recent deep learning-based trackers achieve state-of-the-art accuracy on challenging benchmarks, their reliance on per-frame inference poses a major barrier to deployment on resource-constrained edge devices. We introduce K-Track, a general-purpose, tracker-agnostic acceleration framework designed to bridge this deployment gap.
- Score: 8.929138500431433
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Point tracking in video sequences is a foundational capability for real-world computer vision applications, including robotics, autonomous systems, augmented reality, and video analysis. While recent deep learning-based trackers achieve state-of-the-art accuracy on challenging benchmarks, their reliance on per-frame GPU inference poses a major barrier to deployment on resource-constrained edge devices, where compute, power, and connectivity are limited. We introduce K-Track (Kalman-enhanced Tracking), a general-purpose, tracker-agnostic acceleration framework designed to bridge this deployment gap. K-Track reduces inference cost by combining sparse deep learning keyframe updates with lightweight Kalman filtering for intermediate frame prediction, using principled Bayesian uncertainty propagation to maintain temporal coherence. This hybrid strategy enables 5-10X speedup while retaining over 85% of the original trackers' accuracy. We evaluate K-Track across multiple state-of-the-art point trackers and demonstrate real-time performance on edge platforms such as the NVIDIA Jetson Nano and RTX Titan. By preserving accuracy while dramatically lowering computational requirements, K-Track provides a practical path toward deploying high-quality point tracking in real-world, resource-limited settings, closing the gap between modern tracking algorithms and deployable vision systems.
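The keyframe-plus-Kalman scheme described in the abstract maps onto a standard predict/update loop. The sketch below is a minimal illustration only, assuming a constant-velocity motion model per tracked point; the `deep_tracker` callable, the keyframe interval, and the noise covariances are illustrative placeholders, not details taken from the paper.

```python
# Minimal sketch of the hybrid loop: cheap Kalman prediction on every frame,
# expensive deep inference only on sparse keyframes.
# NOTE: `deep_tracker`, `keyframe_every`, q, and r are hypothetical choices,
# not values from the K-Track paper.
import numpy as np

class ConstantVelocityKF:
    """2D constant-velocity Kalman filter over the state [x, y, vx, vy]."""
    def __init__(self, xy, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([xy[0], xy[1], 0.0, 0.0])
        self.P = np.eye(4)                               # state covariance
        self.F = np.array([[1, 0, dt, 0],                # transition model
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],                 # observe position only
                           [0, 1, 0, 0]], dtype=float)
        self.Q = q * np.eye(4)                           # process noise
        self.R = r * np.eye(2)                           # measurement noise

    def predict(self):
        # Propagate the state and its uncertainty (the Bayesian prior).
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        # Fuse a deep-tracker measurement z = [x, y] into the state.
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R          # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

def track_point(frames, init_xy, deep_tracker, keyframe_every=5):
    kf = ConstantVelocityKF(init_xy)
    trajectory = []
    for t, frame in enumerate(frames):
        kf.predict()                         # runs on every frame, nearly free
        if t % keyframe_every == 0:          # sparse deep keyframe update
            kf.update(np.asarray(deep_tracker(frame), dtype=float))
        trajectory.append(kf.x[:2].copy())
    return trajectory
```

With `keyframe_every=5` the deep network runs on one frame in five, which is where a 5X-style speedup would come from; the covariance `P` grows between keyframes, so the filter also exposes how uncertain each interpolated position is.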
Related papers
- CoWTracker: Tracking by Warping instead of Correlation [53.834673070954494]
We propose a dense point tracker that eschews cost volumes in favor of warping. Inspired by recent advances in optical flow, our approach iteratively refines track estimates by warping features from the target frame to the query frame based on the current estimate. Our model is simple and achieves state-of-the-art performance on standard dense point tracking benchmarks, including TAP-Vid-DAVIS, TAP-Vid-Kinetics, and Robo-TAP.
arXiv Detail & Related papers (2026-02-04T18:58:59Z) - StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections [0.18054741274903915]
Multi-object tracking (MOT) is one of the most challenging tasks in computer vision. Current approaches mainly focus on tracking objects in each frame of a video stream. We propose StableTrack, a novel approach that stabilizes the quality of tracking on low-frequency detections.
arXiv Detail & Related papers (2025-11-25T15:42:33Z) - Track-On2: Enhancing Online Point Tracking with Memory [57.820749134569574]
We extend our prior model Track-On into Track-On2, a simple and efficient transformer-based model for online long-term tracking. Track-On2 improves both performance and efficiency through architectural refinements, more effective use of memory, and improved synthetic training strategies.
arXiv Detail & Related papers (2025-09-23T15:00:18Z) - DELTAv2: Accelerating Dense 3D Tracking [79.63990337419514]
We propose a novel algorithm for accelerating dense long-term 3D point tracking in videos. We introduce a coarse-to-fine strategy that begins tracking with a small subset of points and progressively expands the set of tracked trajectories. The newly added trajectories are initialized using a learnable module, which is trained end-to-end alongside the tracking network.
arXiv Detail & Related papers (2025-08-02T03:15:47Z) - LiteTracker: Leveraging Temporal Causality for Accurate Low-latency Tissue Tracking [86.67583223579851]
LiteTracker is a low-latency method for tissue tracking in endoscopic video streams. LiteTracker builds on a state-of-the-art long-term point tracking method, and introduces a set of training-free runtime optimizations.
arXiv Detail & Related papers (2025-04-14T05:53:57Z) - Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking. DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget. Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z) - Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z) - CoTracker: It is Better to Track Together [70.63040730154984]
CoTracker is a transformer-based model that tracks a large number of 2D points in long video sequences.
We show that joint tracking significantly improves tracking accuracy and robustness, and allows CoTracker to track occluded points and points outside of the camera view.
arXiv Detail & Related papers (2023-07-14T21:13:04Z) - PUCK: Parallel Surface and Convolution-kernel Tracking for Event-Based Cameras [4.110120522045467]
Event cameras can guarantee fast visual sensing in dynamic environments, but require a tracking algorithm that can keep up with the high data rate induced by the robot's ego-motion.
We introduce a novel tracking method that leverages the Exponential Reduced Ordinal Surface (EROS) data representation to decouple event-by-event processing and tracking.
We propose the task of tracking the air hockey puck sliding on a surface, with the future aim of controlling the iCub robot to reach the target precisely and on time.
arXiv Detail & Related papers (2022-05-16T13:23:52Z) - DeepScale: An Online Frame Size Adaptation Framework to Accelerate Visual Multi-object Tracking [8.878656943106934]
DeepScale is a model-agnostic frame size selection approach that increases tracking throughput.
It can find a suitable trade-off between tracking accuracy and speed by adapting frame sizes at run time.
Compared to a state-of-the-art tracker, DeepScale++, a variant of DeepScale, achieves a 1.57X speedup with only moderate degradation in tracking accuracy.
arXiv Detail & Related papers (2021-07-22T00:12:58Z) - Confidence Trigger Detection: Accelerating Real-time Tracking-by-detection Systems [1.6037469030022993]
Confidence-Triggered Detection (CTD) is an innovative approach that strategically bypasses object detection for frames closely resembling intermediate states.
CTD not only enhances tracking speed but also preserves accuracy, surpassing existing tracking algorithms.
Our experiments underscore the robustness and versatility of the CTD framework, demonstrating its potential to enable real-time tracking in resource-constrained environments.
arXiv Detail & Related papers (2019-02-02T01:52:53Z)
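Several entries above, notably CTD and DeepScale, share the run-time pattern that K-Track also exploits: gate the expensive deep model and fall back to a cheap path when little has changed between frames. Below is a minimal sketch of such a confidence gate, where `detect` and `frame_similarity` are hypothetical callables and the threshold is an illustrative guess, not a value from any of these papers.

```python
# Minimal sketch of confidence-triggered gating, the pattern CTD describes.
# `detect` and `frame_similarity` are hypothetical stand-ins; 0.9 is a guess.
def gated_tracking(frames, detect, frame_similarity, threshold=0.9):
    last_key, detections, results = None, None, []
    for frame in frames:
        if last_key is None or frame_similarity(frame, last_key) < threshold:
            detections = detect(frame)   # expensive deep inference, keyframes only
            last_key = frame
        results.append(detections)       # other frames reuse the last keyframe result
    return results
```

In CTD the cheap path is reusing the previous detections; in K-Track it is the Kalman prediction sketched earlier.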
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.