Related papers: Stable at Any Speed: Speed-Driven Multi-Object Tracking with Learnable Kalman Filtering

Stable at Any Speed: Speed-Driven Multi-Object Tracking with Learnable Kalman Filtering

URL: http://arxiv.org/abs/2508.00358v1
Date: Fri, 01 Aug 2025 06:42:33 GMT
Title: Stable at Any Speed: Speed-Driven Multi-Object Tracking with Learnable Kalman Filtering
Authors: Yan Gong, Mengjun Chen, Hao Liu, Gao Yongsheng, Lei Yang, Naibang Wang, Ziying Song, Haoqun Ma,
Abstract summary: Multi-object tracking (MOT) enables autonomous vehicles to continuously perceive dynamic objects.<n>Speed-Guided Learnable Kalman Filter (SG-LKF) adapts uncertainty to ego-vehicle speed, significantly improving stability and accuracy in highly dynamic scenarios.<n>SG-LKF ranks first among all vision-based methods on KITTI 2D MOT with 79.59% HOTA, delivers strong results on KITTI 3D MOT with 82.03% HOTA, and outperforms SimpleTrack by 2.2% AMOTA on nuScenes 3D MOT.
Score: 5.852380432257675
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-object tracking (MOT) enables autonomous vehicles to continuously perceive dynamic objects, supplying essential temporal cues for prediction, behavior understanding, and safe planning. However, conventional tracking-by-detection methods typically rely on static coordinate transformations based on ego-vehicle poses, disregarding ego-vehicle speed-induced variations in observation noise and reference frame changes, which degrades tracking stability and accuracy in dynamic, high-speed scenarios. In this paper, we investigate the critical role of ego-vehicle speed in MOT and propose a Speed-Guided Learnable Kalman Filter (SG-LKF) that dynamically adapts uncertainty modeling to ego-vehicle speed, significantly improving stability and accuracy in highly dynamic scenarios. Central to SG-LKF is MotionScaleNet (MSNet), a decoupled token-mixing and channel-mixing MLP that adaptively predicts key parameters of SG-LKF. To enhance inter-frame association and trajectory continuity, we introduce a self-supervised trajectory consistency loss jointly optimized with semantic and positional constraints. Extensive experiments show that SG-LKF ranks first among all vision-based methods on KITTI 2D MOT with 79.59% HOTA, delivers strong results on KITTI 3D MOT with 82.03% HOTA, and outperforms SimpleTrack by 2.2% AMOTA on nuScenes 3D MOT.

Related papers

Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (UAVT) aims to track multiple objects while maintaining consistent identities across frames of a given video.<n>Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance.<n>We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z)
Deep LG-Track: An Enhanced Localization-Confidence-Guided Multi-Object Tracker [13.846239755569552]
Deep LG-Track is a novel multi-object tracker that incorporates three key enhancements to improve the tracking accuracy and robustness.<n> Comprehensive evaluations on the MOT17 and MOT20 datasets demonstrate that the proposed Deep LG-Track consistently outperforms state-of-the-art trackers.
arXiv Detail & Related papers (2025-04-02T08:10:18Z)
HybridTrack: A Hybrid Approach for Robust Multi-Object Tracking [7.916733469603948]
HybridTrack is a novel 3D multi-object tracking approach for vehicles.<n>It integrates a data-driven Kalman Filter (KF) within a tracking-by-detection paradigm.<n>It achieves 82.72% HOTA accuracy, significantly outperforming state-of-the-art methods.
arXiv Detail & Related papers (2025-01-02T14:17:19Z)
MATE: Motion-Augmented Temporal Consistency for Event-based Point Tracking [58.719310295870024]
This paper presents an event-based framework for tracking any point.<n>To resolve ambiguities caused by event sparsity, a motion-guidance module incorporates kinematic vectors into the local matching process.<n>The method improves the $Survival_50$ metric by 17.9% over event-only tracking of any point baseline.
arXiv Detail & Related papers (2024-12-02T09:13:29Z)
MOT FCG++: Enhanced Representation of Spatio-temporal Motion and Appearance Features [0.0]
We propose a novel approach for appearance and spatial-temporal motion feature representation, improving upon the hierarchical clustering method MOT FCG. For spatialtemporal motion features, we first propose Diagonal Modulated GIoU, which more accurately represents the relationship between the position and shape of the objects. For appearance features, we utilize a dynamic appearance representation that incorporates confidence information, enabling the trajectory appearance features to be more robust and global.
arXiv Detail & Related papers (2024-11-15T08:17:05Z)
3D Multi-Object Tracking with Semi-Supervised GRU-Kalman Filter [6.13623925528906]
3D Multi-Object Tracking (MOT) is essential for intelligent systems like autonomous driving and robotic sensing. We propose a GRU-based MOT method, which introduces a learnable Kalman filter into the motion module. This approach is able to learn object motion characteristics through data-driven learning, thereby avoiding the need for manual model design and model error.
arXiv Detail & Related papers (2024-11-13T08:34:07Z)
RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud [11.111388829965103]
This paper addresses limitations in 3D tracking-by-detection methods, particularly in identifying legitimate trajectories.<n>Existing methods often use threshold-based filtering for detection scores, which can fail for distant and occluded objects.<n>We propose a novel track validity mechanism and multi-stage observational gating process, significantly reducing ghost tracks.
arXiv Detail & Related papers (2024-05-19T12:49:21Z)
MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor. Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as Dancetrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving. Current approaches all follow the Siamese paradigm based on appearance matching. We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z)
Motion Planning and Control for Multi Vehicle Autonomous Racing at High Speeds [100.61456258283245]
This paper presents a multi-layer motion planning and control architecture for autonomous racing. The proposed solution has been applied on a Dallara AV-21 racecar and tested at oval race tracks achieving lateral accelerations up to 25 $m/s2$.
arXiv Detail & Related papers (2022-07-22T15:16:54Z)
StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity of predicting the future, significantly improving the results for streaming perception. We consider multiple velocities driving scene and propose Velocity-awared streaming AP (VsAP) to jointly evaluate the accuracy. Our simple method achieves the state-of-the-art performance on Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2% respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z)
Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy [63.91005999481061]
A practical long-term tracker typically contains three key properties, i.e. an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism. We propose a two-task tracking frame work (named DMTrack) to achieve distractor-aware fast tracking via Dynamic convolutions (d-convs) and Multiple object tracking (MOT) philosophy. Our tracker achieves state-of-the-art performance on the LaSOT, OxUvA, TLP, VOT2018LT and VOT 2019LT benchmarks and runs in real-time (3x faster
arXiv Detail & Related papers (2021-04-25T00:59:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.