QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple
Object Tracking
- URL: http://arxiv.org/abs/2210.06984v2
- Date: Wed, 27 Sep 2023 12:39:30 GMT
- Title: QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple
Object Tracking
- Authors: Tobias Fischer, Thomas E. Huang, Jiangmiao Pang, Linlu Qiu, Haofeng
Chen, Trevor Darrell, Fisher Yu
- Abstract summary: We present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning.
We find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time for object association.
We show that our similarity learning scheme is not limited to video data, but can learn effective instance similarity even from static input.
- Score: 73.52284039530261
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Similarity learning has been recognized as a crucial step for object
tracking. However, existing multiple object tracking methods only use sparse
ground truth matching as the training objective, while ignoring the majority of
the informative regions in images. In this paper, we present Quasi-Dense
Similarity Learning, which densely samples hundreds of object regions on a pair
of images for contrastive learning. We combine this similarity learning with
multiple existing object detectors to build Quasi-Dense Tracking (QDTrack),
which does not require displacement regression or motion priors. We find that
the resulting distinctive feature space admits a simple nearest neighbor search
at inference time for object association. In addition, we show that our
similarity learning scheme is not limited to video data, but can learn
effective instance similarity even from static input, enabling a competitive
tracking performance without training on videos or using tracking supervision.
We conduct extensive experiments on a wide variety of popular MOT benchmarks.
We find that, despite its simplicity, QDTrack rivals the performance of
state-of-the-art tracking methods on all benchmarks and sets a new
state-of-the-art on the large-scale BDD100K MOT benchmark, while introducing
negligible computational overhead to the detector.
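The abstract boils the tracker down to two operations: contrastive learning over densely sampled region pairs from two images, and a nearest-neighbor lookup in the learned embedding space for association at inference time. The sketch below illustrates both operations; it is a minimal PyTorch-style rendering under stated assumptions, not the released QDTrack code. The loss shown is a multi-positive InfoNCE-style variant rather than the paper's exact objective, the association is a greedy cosine nearest-neighbor search, and the function names, tensor shapes, and the tau and sim_thresh values are illustrative.

```python
# Minimal sketch of (1) quasi-dense contrastive similarity learning over
# region embeddings from an image pair and (2) nearest-neighbor association
# at inference time. Shapes, names, and hyperparameters are assumptions.
import torch
import torch.nn.functional as F


def quasi_dense_contrastive_loss(key_emb, ref_emb, key_ids, ref_ids, tau=0.07):
    """Multi-positive contrastive loss between region embeddings of two frames.

    key_emb: (N, D) embeddings of regions sampled from the key frame.
    ref_emb: (M, D) embeddings of regions sampled from the reference frame.
    key_ids, ref_ids: integer object identities per region; regions sharing an
    identity across the two frames are positives, all other pairs are negatives.
    """
    key_emb = F.normalize(key_emb, dim=1)
    ref_emb = F.normalize(ref_emb, dim=1)
    logits = key_emb @ ref_emb.t() / tau              # (N, M) similarity matrix
    pos_mask = key_ids[:, None].eq(ref_ids[None, :])  # (N, M) positive pairs
    valid = pos_mask.any(dim=1)                       # keep regions with >= 1 positive
    if not valid.any():
        return logits.sum() * 0.0                     # no matchable pairs in this batch
    logits, pos_mask = logits[valid], pos_mask[valid]
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # Average the log-likelihood over all positives of each region.
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1)
    return loss.mean()


def associate_by_nearest_neighbor(det_emb, track_emb, sim_thresh=0.5):
    """Greedily match detections to existing tracks by cosine similarity.

    Returns a list of (detection index, track index) pairs, with -1 for
    detections that start a new track.
    """
    det_emb = F.normalize(det_emb, dim=1)
    track_emb = F.normalize(track_emb, dim=1)
    sim = det_emb @ track_emb.t()                     # (num_dets, num_tracks)
    matches, used = [], set()
    for d in range(sim.size(0)):
        target = -1
        for t in torch.argsort(sim[d], descending=True).tolist():
            if t not in used and sim[d, t] >= sim_thresh:
                used.add(t)
                target = t
                break
        matches.append((d, target))
    return matches
```

Training in this form only needs identity labels on pairs of images, which fits the abstract's observation that the scheme also works on static input (for example, two augmented crops of a single image could serve as the pair); at inference, the only work added on top of the detector is embedding extraction and the detections-by-tracks similarity matrix.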
Related papers
- Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z)
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z)
- Real-time Multi-Object Tracking Based on Bi-directional Matching [0.0]
This study offers a bi-directional matching algorithm for multi-object tracking.
A stranded area is used in the matching algorithm to temporarily store the objects that fail to be tracked (see the sketch after this list).
In the MOT17 challenge, the proposed algorithm achieves 63.4% MOTA, 55.3% IDF1, and 20.1 FPS tracking speed.
arXiv Detail & Related papers (2023-03-15T08:38:08Z)
- Unifying Tracking and Image-Video Object Detection [54.91658924277527]
TrIVD (Tracking and Image-Video Detection) is the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
To handle the discrepancies and semantic overlaps of category labels, TrIVD formulates detection/tracking as grounding and reasons about object categories.
arXiv Detail & Related papers (2022-11-20T20:30:28Z)
- Semi-TCL: Semi-Supervised Track Contrastive Representation Learning [40.31083437957288]
We design a new instance-to-track matching objective to learn appearance embedding.
It compares a candidate detection to the embedding of the tracks persisted in the tracker.
We implement this learning objective in a unified form following the spirit of contrastive loss.
arXiv Detail & Related papers (2021-07-06T05:23:30Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- DEFT: Detection Embeddings for Tracking [3.326320568999945]
We propose an efficient joint detection and tracking model named DEFT.
Our approach relies on an appearance-based object matching network jointly learned with an underlying object detection network.
DEFT has comparable accuracy and speed to the top methods on 2D online tracking leaderboards.
arXiv Detail & Related papers (2021-02-03T20:00:44Z)
- Learning to associate detections for real-time multiple object tracking [0.0]
This study investigates the use of artificial neural networks to learn a similarity function that can be used among detections.
The proposed tracker matches the results obtained by state-of-the-art methods and runs 58% faster than a recent, similar method used as a baseline.
arXiv Detail & Related papers (2020-07-12T17:08:41Z)
- Quasi-Dense Similarity Learning for Multiple Object Tracking [82.93471035675299]
We present Quasi-Dense Similarity Learning, which densely samples hundreds of region proposals on a pair of images for contrastive learning.
We can directly combine this similarity learning with existing detection methods to build Quasi-Dense Tracking (QDTrack).
arXiv Detail & Related papers (2020-06-11T17:57:12Z)
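The bi-directional matching entry above mentions a stranded area that temporarily stores objects that fail to be tracked. Below is one plausible, generic realization of such a buffer, assuming it means keeping unmatched tracks alive for a bounded number of frames so they can be re-associated later; the class names, fields, and the max_stranded_age value are illustrative assumptions, not that paper's actual data structures.

```python
# Generic sketch of a "stranded area": a temporary store for tracks that
# failed to match in the current frame, so they can be recovered later
# instead of being deleted immediately. Names and defaults are assumptions.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Track:
    track_id: int
    box: tuple               # (x1, y1, x2, y2) of the last confirmed location
    stranded_age: int = 0    # frames spent in the stranded area


@dataclass
class StrandedArea:
    max_stranded_age: int = 30                        # frames to keep an unmatched track
    tracks: Dict[int, Track] = field(default_factory=dict)

    def park(self, track: Track) -> None:
        """Move a track that failed to match in the current frame into the buffer."""
        track.stranded_age = 0
        self.tracks[track.track_id] = track

    def recover(self, track_id: int) -> Optional[Track]:
        """Return (and remove) a stranded track once a new detection matches it."""
        return self.tracks.pop(track_id, None)

    def step(self) -> None:
        """Age all stranded tracks and drop those that have waited too long."""
        for tid in list(self.tracks):
            self.tracks[tid].stranded_age += 1
            if self.tracks[tid].stranded_age > self.max_stranded_age:
                del self.tracks[tid]
```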