Self-Supervised Multi-Object Tracking with Path Consistency
- URL: http://arxiv.org/abs/2404.05136v1
- Date: Mon, 8 Apr 2024 01:29:10 GMT
- Title: Self-Supervised Multi-Object Tracking with Path Consistency
- Authors: Zijia Lu, Bing Shuai, Yanbei Chen, Zhenlin Xu, Davide Modolo,
- Abstract summary: We propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision.
We generate multiple observation paths, each specifying a different set of frames to be skipped, and formulate the Path Consistency Loss that enforces the association results are consistent across different observation paths.
- Score: 28.923565712817645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a novel concept of path consistency to learn robust object matching without using manual object identity supervision. Our key idea is that, to track a object through frames, we can obtain multiple different association results from a model by varying the frames it can observe, i.e., skipping frames in observation. As the differences in observations do not alter the identities of objects, the obtained association results should be consistent. Based on this rationale, we generate multiple observation paths, each specifying a different set of frames to be skipped, and formulate the Path Consistency Loss that enforces the association results are consistent across different observation paths. We use the proposed loss to train our object matching model with only self-supervision. By extensive experiments on three tracking datasets (MOT17, PersonPath22, KITTI), we demonstrate that our method outperforms existing unsupervised methods with consistent margins on various evaluation metrics, and even achieves performance close to supervised methods.
Related papers
- Self-Supervised Any-Point Tracking by Contrastive Random Walks [17.50529887238381]
We train a global matching transformer to find cycle consistent tracks through video via contrastive random walks.
Our method achieves strong performance on the TapVid benchmarks, outperforming previous self-supervised tracking methods.
arXiv Detail & Related papers (2024-09-24T17:59:56Z) - Counterfactual Reasoning for Multi-Label Image Classification via Patching-Based Training [84.95281245784348]
Overemphasizing co-occurrence relationships can cause the overfitting issue of the model.
We provide a causal inference framework to show that the correlative features caused by the target object and its co-occurring objects can be regarded as a mediator.
arXiv Detail & Related papers (2024-04-09T13:13:24Z) - Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z) - S$^3$Track: Self-supervised Tracking with Soft Assignment Flow [45.77333923477176]
We study self-supervised multiple object tracking without using any video-level association labels.
We propose differentiable soft object assignment for object association.
We evaluate our proposed model on the KITTI, nuScenes, and Argoverse datasets.
arXiv Detail & Related papers (2023-05-17T06:25:40Z) - Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z) - 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z) - Improving tracking with a tracklet associator [17.839783649372116]
Multiple object tracking (MOT) is a task in computer vision that aims to detect the position of objects in videos and to associate them to a unique identity.
We propose an approach based on Constraint Programming (CP) whose goal is to be grafted to any existing tracker in order to improve its object association results.
arXiv Detail & Related papers (2022-04-22T12:47:46Z) - Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on KITTI, and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z) - Multi-object Tracking with a Hierarchical Single-branch Network [31.680667324595557]
We propose an online multi-object tracking framework based on a hierarchical single-branch network.
Our novel iHOIM loss function unifies the objectives of the two sub-tasks and encourages better detection performance.
Experimental results on MOT16 and MOT20 datasets show that we can achieve state-of-the-art tracking performance.
arXiv Detail & Related papers (2021-01-06T12:14:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.