Simple Unsupervised Multi-Object Tracking
- URL: http://arxiv.org/abs/2006.02609v1
- Date: Thu, 4 Jun 2020 01:53:18 GMT
- Title: Simple Unsupervised Multi-Object Tracking
- Authors: Shyamgopal Karthik, Ameya Prabhu, Vineet Gandhi
- Abstract summary: In this work, we propose an unsupervised re-identification network, thus sidestepping the labeling costs entirely.
Given unlabeled videos, our proposed method (SimpleReID) first generates tracking labels using SORT and trains a ReID network to predict the generated labels using cross-entropy loss.
We establish a new state-of-the-art performance on popular datasets like MOT16/17 without using tracking supervision, beating the current best (CenterTrack) by 0.2-0.3 MOTA and 4.4-4.8 IDF1 scores.
- Score: 11.640210313011876
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-object tracking has seen a lot of progress recently, albeit with
substantial annotation costs for developing better and larger labeled datasets.
In this work, we remove the need for annotated datasets by proposing an
unsupervised re-identification network, thus entirely sidestepping the
labeling costs required for training. Given unlabeled videos, our proposed method
(SimpleReID) first generates tracking labels using SORT and trains a ReID
network to predict the generated labels using cross-entropy loss. We demonstrate
that SimpleReID performs substantially better than simpler alternatives, and we
recover the full performance of its supervised counterpart consistently across
diverse tracking frameworks. The observations are unusual because unsupervised
ReID is not expected to excel in crowded scenarios with occlusions and drastic
viewpoint changes. By incorporating our unsupervised SimpleReID with
CenterTrack trained on augmented still images, we establish a new
state-of-the-art performance on popular datasets like MOT16/17 without using
tracking supervision, beating the current best (CenterTrack) by 0.2-0.3 MOTA and
4.4-4.8 IDF1 scores. We further provide evidence for limited scope for
improvement in IDF1 scores beyond our unsupervised ReID in the studied
settings. Our investigation suggests reconsidering the trend towards more
sophisticated, supervised, end-to-end trackers, given the promise shown by
simpler unsupervised alternatives.
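The two-step recipe in the abstract (generate tracking labels from unlabeled video, then train a ReID network to classify those labels) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the greedy IoU matcher is a simplified stand-in for SORT (which additionally uses Kalman-filter motion prediction and Hungarian assignment), and the names `iou`, `generate_pseudo_labels`, `cross_entropy`, and the `iou_threshold` value are assumptions made for this example.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def generate_pseudo_labels(frames, iou_threshold=0.3):
    """Assign a track ID to every detection in every frame.

    Simplified stand-in for SORT: detections are greedily matched to the
    previous frame's boxes by IoU; unmatched detections start new tracks.
    `frames` is a list of frames, each a list of (x1, y1, x2, y2) boxes.
    """
    next_id = 0
    prev = []  # (box, track_id) pairs from the previous frame
    labels = []
    for boxes in frames:
        # All candidate (IoU, detection index, previous index) pairs, best first.
        pairs = sorted(
            ((iou(box, pbox), i, j)
             for i, box in enumerate(boxes)
             for j, (pbox, _) in enumerate(prev)),
            reverse=True,
        )
        ids = [None] * len(boxes)
        used_prev = set()
        for score, i, j in pairs:
            if score < iou_threshold:
                break  # remaining pairs overlap too little to be the same object
            if ids[i] is None and j not in used_prev:
                ids[i] = prev[j][1]
                used_prev.add(j)
        for i in range(len(boxes)):
            if ids[i] is None:  # no match: open a new track
                ids[i] = next_id
                next_id += 1
        prev = list(zip(boxes, ids))
        labels.append(ids)
    return labels


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Toy example: two objects; the second leaves and a new one appears in frame 3.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 0, 11, 10), (51, 50, 61, 60)],
    [(2, 0, 12, 10), (100, 100, 110, 110)],
]
pseudo_labels = generate_pseudo_labels(frames)
```

In a full pipeline, each detection crop would be passed through a ReID network whose classification head has one output per generated track ID, and `cross_entropy` would be minimized against that crop's pseudo-label. The learned embedding, rather than the classifier itself, would then be used for association at tracking time.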
Related papers
- Weakly Supervised Video Individual Counting [126.75545291243142]
Video Individual Counting aims to predict the number of unique individuals in a single video.
We introduce a weakly supervised VIC task, wherein trajectory labels are not provided.
In doing so, we devise an end-to-end trainable soft contrastive loss to drive the network to distinguish inflow, outflow, and the remaining.
arXiv Detail & Related papers (2023-12-10T16:12:13Z)
- SOMPT22: A Surveillance Oriented Multi-Pedestrian Tracking Dataset [5.962184741057505]
We introduce the SOMPT22 dataset, a new set for multi-person tracking with annotated short videos captured from static cameras mounted on poles 6-8 meters high and positioned for city surveillance.
We analyze MOT trackers, classified as one-shot or two-stage according to how they use detection and reID networks, on this new dataset.
The experimental results of our new dataset indicate that SOTA is still far from high efficiency, and single-shot trackers are good candidates to unify fast execution and accuracy with competitive performance.
arXiv Detail & Related papers (2022-08-04T11:09:19Z)
- Unsupervised Learning of Accurate Siamese Tracking [68.58171095173056]
We present a novel unsupervised tracking framework, in which we can learn temporal correspondence both on the classification branch and regression branch.
Our tracker outperforms preceding unsupervised methods by a substantial margin, performing on par with supervised methods on large-scale datasets such as TrackingNet and LaSOT.
arXiv Detail & Related papers (2022-04-04T13:39:43Z)
- Learning to Track Objects from Unlabeled Videos [63.149201681380305]
In this paper, we propose to learn an Unsupervised Single Object Tracker (USOT) from scratch.
To narrow the gap between unsupervised trackers and supervised counterparts, we propose an effective unsupervised learning approach composed of three stages.
Experiments show that the proposed USOT learned from unlabeled videos outperforms state-of-the-art unsupervised trackers by large margins.
arXiv Detail & Related papers (2021-08-28T22:10:06Z)
- Video-based Person Re-identification without Bells and Whistles [49.51670583977911]
Video-based person re-identification (Re-ID) aims at matching the video tracklets with cropped video frames for identifying the pedestrians under different cameras.
There exists severe spatial and temporal misalignment for those cropped tracklets due to the imperfect detection and tracking results generated with obsolete methods.
We present a simple re-Detect and Link (DL) module which can effectively reduce this unexpected noise by applying deep learning-based detection and tracking to the cropped tracklets.
arXiv Detail & Related papers (2021-05-22T10:17:38Z)
- Unsupervised Deep Representation Learning for Real-Time Tracking [137.69689503237893]
We propose an unsupervised learning method for visual tracking.
The motivation of our unsupervised learning is that a robust tracker should be effective in bidirectional tracking.
We build our framework on a Siamese correlation filter network, and propose a multi-frame validation scheme and a cost-sensitive loss to facilitate unsupervised learning.
arXiv Detail & Related papers (2020-07-22T08:23:12Z)
- Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination [202.2562153608092]
We propose a cascaded regression tracker with two sequential stages.
In the first stage, we filter out abundant easily-identified negative candidates.
In the second stage, a discrete sampling based ridge regression is designed to double-check the remaining ambiguous hard samples.
arXiv Detail & Related papers (2020-06-18T07:48:01Z)
- Unsupervised Multiple Person Tracking using AutoEncoder-Based Lifted Multicuts [11.72025865314187]
We present an unsupervised multiple object tracking approach based on minimum visual features and lifted multicuts.
We show that, despite being trained without using the provided annotations, our model provides competitive results on the challenging MOT Benchmark for pedestrian tracking.
arXiv Detail & Related papers (2020-02-04T09:42:34Z)