Self-Supervised Multi-Object Tracking with Cross-Input Consistency
- URL: http://arxiv.org/abs/2111.05943v1
- Date: Wed, 10 Nov 2021 21:00:34 GMT
- Title: Self-Supervised Multi-Object Tracking with Cross-Input Consistency
- Authors: Favyen Bastani, Songtao He, Sam Madden
- Abstract summary: We propose a self-supervised learning procedure for training a robust multi-object tracking (MOT) model given only unlabeled video.
We then compute tracks in that sequence by applying an RNN model independently on each input, and train the model to produce consistent tracks across the two inputs.
- Score: 5.8762433393846045
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a self-supervised learning procedure for training a
robust multi-object tracking (MOT) model given only unlabeled video. While
several self-supervisory learning signals have been proposed in prior work on
single-object tracking, such as color propagation and cycle-consistency, these
signals cannot be directly applied for training RNN models, which are needed to
achieve accurate MOT: they yield degenerate models that, for instance, always
match new detections to tracks with the closest initial detections. We propose
a novel self-supervisory signal that we call cross-input consistency: we
construct two distinct inputs for the same sequence of video, by hiding
different information about the sequence in each input. We then compute tracks
in that sequence by applying an RNN model independently on each input, and
train the model to produce consistent tracks across the two inputs. We evaluate
our unsupervised method on MOT17 and KITTI. Remarkably, we find that, despite
training only on unlabeled video, our unsupervised approach outperforms four
supervised methods published in the last 1-2 years, including Tracktor++,
FAMNet, GSM, and mmMOT.
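The cross-input consistency idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the paper applies an RNN model to each input, whereas the sketch below swaps in a simple distance-based scorer, and the choice of hidden information (appearance in one input, position in the other), the symmetric KL loss, and all names (`match_distribution`, `consistency_loss`, the toy data) are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_distribution(track_vec, detection_vecs):
    """Probability that the track continues at each candidate detection.

    Stand-in for the tracker: similarity is the negative squared distance.
    """
    scores = [-sum((t - d) ** 2 for t, d in zip(track_vec, det))
              for det in detection_vecs]
    return softmax(scores)


def consistency_loss(p, q, eps=1e-9):
    """Symmetric KL divergence between two match distributions.

    Uses 0.5 * sum((p - q) * log(p / q)), with eps added to avoid log(0).
    """
    return 0.5 * sum((pi - qi) * math.log((pi + eps) / (qi + eps))
                     for pi, qi in zip(p, q))


# Hypothetical toy data: each item is (position, appearance).
track = ((0.0, 0.0), (1.0, 0.0))
detections = [
    ((0.1, 0.0), (0.9, 0.1)),  # close in position, similar appearance
    ((2.0, 2.0), (0.0, 1.0)),  # far away, different appearance
]

# Input A hides appearance (positions only); input B hides position.
p_a = match_distribution(track[0], [d[0] for d in detections])
p_b = match_distribution(track[1], [d[1] for d in detections])

# Training would minimize this disagreement between the two inputs.
loss = consistency_loss(p_a, p_b)
```

In a trainable setting, the scorer would be a learned model and the loss would be minimized by gradient descent, so that neither input alone can be matched by a degenerate shortcut.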
Related papers
- Refining Pre-Trained Motion Models [56.18044168821188]
We take on the challenge of improving state-of-the-art supervised models with self-supervised training.
We focus on obtaining a "clean" training signal from real-world unlabelled video.
We show that our method yields reliable gains over fully-supervised methods in real videos.
(2024-01-01T18:59:33Z)
- Multi-view self-supervised learning for multivariate variable-channel time series [1.094320514634939]
We propose learning one encoder to operate on all input channels individually.
We then use a message passing neural network to extract a single representation across channels.
We show that our method, combined with the TS2Vec loss, outperforms all other methods in most settings.
(2023-07-13T19:03:06Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
(2022-11-01T20:59:38Z)
- IDM-Follower: A Model-Informed Deep Learning Method for Long-Sequence Car-Following Trajectory Prediction [24.94160059351764]
Most car-following models are generative and only consider the inputs of the speed, position, and acceleration of the last time step.
We implement a novel structure with two independent encoders and a self-attention decoder that could sequentially predict the following trajectories.
Numerical experiments with multiple settings on simulation and NGSIM datasets show that the IDM-Follower can improve the prediction performance.
(2022-10-20T02:24:27Z)
- MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation [104.48766162008815]
We propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
To take full advantage of multi-modality, our framework has each modality provide regularized self-supervisory signals to the other modalities.
Our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios.
(2022-04-27T02:28:12Z)
- Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in our UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
(2022-03-29T01:38:49Z)
- Exploring Simple 3D Multi-Object Tracking for Autonomous Driving [10.921208239968827]
3D multi-object tracking in LiDAR point clouds is a key ingredient for self-driving vehicles.
Existing methods are predominantly based on the tracking-by-detection pipeline and inevitably require a matching step for the detection association.
We present SimTrack to simplify the hand-crafted tracking paradigm by proposing an end-to-end trainable model for joint detection and tracking from raw point clouds.
(2021-08-23T17:59:22Z)
- Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera [83.31666463259849]
We propose a method to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors.
We show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained using manual annotations.
Our method is an effective way to improve person detectors during deployment without any additional labeling effort.
(2020-12-16T12:10:04Z)
- A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata [73.38551379469533]
DAD: DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems.
It combines deep learning and timed automata to create a behavioral model from observations.
The algorithm has been applied to a few data sets, including two from real systems, and has shown promising results.
(2020-10-29T08:27:43Z)
- Multi-object tracking with self-supervised associating network [5.947279761429668]
We propose a novel self-supervised learning method that uses many short videos with no human labeling.
Although the re-identification network is trained in a self-supervised manner, it achieves state-of-the-art performance of MOTA 62.0% and IDF1 62.6% on the MOT17 test benchmark.
(2020-10-26T08:48:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.