DELTAv2: Accelerating Dense 3D Tracking
- URL: http://arxiv.org/abs/2508.01170v1
- Date: Sat, 02 Aug 2025 03:15:47 GMT
- Title: DELTAv2: Accelerating Dense 3D Tracking
- Authors: Tuan Duc Ngo, Ashkan Mirzaei, Guocheng Qian, Hanwen Liang, Chuang Gan, Evangelos Kalogerakis, Peter Wonka, Chaoyang Wang,
- Abstract summary: We propose a novel algorithm for accelerating dense long-term 3D point tracking in videos.<n>We introduce a coarse-to-fine strategy that begins tracking with a small subset of points and progressively expands the set of tracked trajectories.<n>The newly added trajectories are using a learnable module, which is trained end-to-end alongside the tracking network.
- Score: 79.63990337419514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel algorithm for accelerating dense long-term 3D point tracking in videos. Through analysis of existing state-of-the-art methods, we identify two major computational bottlenecks. First, transformer-based iterative tracking becomes expensive when handling a large number of trajectories. To address this, we introduce a coarse-to-fine strategy that begins tracking with a small subset of points and progressively expands the set of tracked trajectories. The newly added trajectories are initialized using a learnable interpolation module, which is trained end-to-end alongside the tracking network. Second, we propose an optimization that significantly reduces the cost of correlation feature computation, another key bottleneck in prior methods. Together, these improvements lead to a 5-100x speedup over existing approaches while maintaining state-of-the-art tracking accuracy.
Related papers
- DELTA: Dense Efficient Long-range 3D Tracking for any video [82.26753323263009]
We introduce DELTA, a novel method that efficiently tracks every pixel in 3D space, enabling accurate motion estimation across entire videos.<n>Our approach leverages a joint global-local attention mechanism for reduced-resolution tracking, followed by a transformer-based upsampler to achieve high-resolution predictions.<n>Our method provides a robust solution for applications requiring fine-grained, long-term motion tracking in 3D space.
arXiv Detail & Related papers (2024-10-31T17:59:01Z) - SSP-GNN: Learning to Track via Bilevel Optimization [3.1889516673296807]
We propose a graph-based tracking formulation for multi-object tracking (MOT)
Our method applies a successive shortest paths (SSP) algorithm to a tracking graph defined over a batch of frames.
The edge costs in this tracking graph are computed via a message-passing network, a graph neural network (GNN) variant.
arXiv Detail & Related papers (2024-07-05T07:23:51Z) - Exploring Dynamic Transformer for Efficient Object Tracking [58.120191254379854]
We propose DyTrack, a dynamic transformer framework for efficient tracking.<n>DyTrack automatically learns to configure proper reasoning routes for various inputs, gaining better utilization of the available computational budget.<n>Experiments on multiple benchmarks demonstrate that DyTrack achieves promising speed-precision trade-offs with only a single model.
arXiv Detail & Related papers (2024-03-26T12:31:58Z) - Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple and efficient method for solving the problem of point tracking in a video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z) - Real-time Multi-Object Tracking Based on Bi-directional Matching [0.0]
This study offers a bi-directional matching algorithm for multi-object tracking.
A stranded area is used in the matching algorithm to temporarily store the objects that fail to be tracked.
In the MOT17 challenge, the proposed algorithm achieves 63.4% MOTA, 55.3% IDF1, and 20.1 FPS tracking speed.
arXiv Detail & Related papers (2023-03-15T08:38:08Z) - Correlating sparse sensing for large-scale traffic speed estimation: A
Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both lowrankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - FlowMOT: 3D Multi-Object Tracking by Scene Flow Association [9.480272707157747]
We propose a LiDAR-based 3D MOT framework named FlowMOT, which integrates point-wise motion information with the traditional matching algorithm.
Our approach outperforms recent end-to-end methods and achieves competitive performance with the state-of-the-art filter-based method.
arXiv Detail & Related papers (2020-12-14T14:03:48Z) - Learning to Optimize Non-Rigid Tracking [54.94145312763044]
We employ learnable optimizations to improve robustness and speed up solver convergence.
First, we upgrade the tracking objective by integrating an alignment data term on deep features which are learned end-to-end through CNN.
Second, we bridge the gap between the preconditioning technique and learning method by introducing a ConditionNet which is trained to generate a preconditioner.
arXiv Detail & Related papers (2020-03-27T04:40:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.