S$^3$Track: Self-supervised Tracking with Soft Assignment Flow
- URL: http://arxiv.org/abs/2305.09981v1
- Date: Wed, 17 May 2023 06:25:40 GMT
- Title: S$^3$Track: Self-supervised Tracking with Soft Assignment Flow
- Authors: Fatemeh Azimi and Fahim Mannan and Felix Heide
- Abstract summary: We study self-supervised multiple object tracking without using any video-level association labels.
We propose differentiable soft object assignment for object association.
We evaluate our proposed model on the KITTI, Waymo, nuScenes, and Argoverse datasets.
- Score: 45.77333923477176
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we study self-supervised multiple object tracking without using
any video-level association labels. We propose to cast the problem of multiple
object tracking as learning the frame-wise associations between detections in
consecutive frames. To this end, we propose differentiable soft object
assignment for object association, making it possible to learn features
tailored to object association with differentiable end-to-end training. With
this training approach in hand, we develop an appearance-based model for
learning instance-aware object features used to construct a cost matrix based
on the pairwise distances between the object features. We train our model using
temporal and multi-view data, where we obtain association pseudo-labels using
optical flow and disparity information. Unlike most self-supervised tracking
methods that rely on pretext tasks for learning the feature correspondences,
our method is directly optimized for cross-object association in complex
scenarios. As such, the proposed method offers a reidentification-based MOT
approach that is robust to training hyperparameters and does not suffer from
local minima, which are a challenge in self-supervised methods. We evaluate our
proposed model on the KITTI, Waymo, nuScenes, and Argoverse datasets,
consistently improving over other unsupervised methods ($7.8\%$ improvement in
association accuracy on nuScenes).
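The abstract describes building a cost matrix from pairwise distances between object features and training through a differentiable soft assignment. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the Euclidean distance, the row-wise softmax with a `temperature` parameter, the negative log-likelihood loss, and all function names are assumptions chosen for illustration. The pseudo-labels, which the paper derives from optical flow and disparity, are simply given here as a list of indices.

```python
import math

def pairwise_cost(feats_a, feats_b):
    """Cost matrix of Euclidean distances between features of two frames (assumed metric)."""
    return [[math.dist(a, b) for b in feats_b] for a in feats_a]

def soft_assignment(cost, temperature=0.1):
    """Row-wise softmax over negative costs: a differentiable relaxation of hard matching."""
    rows = []
    for row in cost:
        logits = [-c / temperature for c in row]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows

def association_loss(assign, pseudo_labels):
    """Mean negative log-likelihood of the pseudo-label match for each detection."""
    eps = 1e-12  # avoids log(0)
    total = sum(math.log(assign[i][j] + eps) for i, j in enumerate(pseudo_labels))
    return -total / len(pseudo_labels)

# Toy example: two detections per frame, with hypothetical 2-D features.
feats_t = [[0.0, 0.0], [1.0, 1.0]]    # frame t
feats_t1 = [[0.9, 1.1], [0.1, -0.1]]  # frame t+1 (objects appear in swapped order)
pseudo = [1, 0]                       # pseudo-labels: detection i in t matches pseudo[i] in t+1

cost = pairwise_cost(feats_t, feats_t1)
assign = soft_assignment(cost)
loss = association_loss(assign, pseudo)
```

In an actual training setup the features would come from a learned network and the loss would be backpropagated through the soft assignment; a lower temperature pushes the soft assignment closer to a hard one.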
Related papers
- VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking [61.56592503861093]
This issue amalgamates the complexities of open-vocabulary object detection (OVD) and multi-object tracking (MOT).
Existing approaches to OVMOT often merge OVD and MOT methodologies as separate modules, predominantly focusing on the problem through an image-centric lens.
We propose VOVTrack, a novel method that integrates object states relevant to MOT and video-centric training to address this challenge from a video object tracking standpoint.
arXiv Detail & Related papers (2024-10-11T05:01:49Z)
- Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors [16.84474849409625]
We propose a framework for consistently producing high-quality object tracks.
The key idea is to tailor a module for each dataset to intelligently decide when an object tracker is failing.
Our approach leverages self-supervised learning on unlabeled videos to learn a tailored representation for a target object.
arXiv Detail & Related papers (2024-05-06T17:06:32Z)
- Object-Centric Multiple Object Tracking [124.30650395969126]
This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefiting from object-centric learning, we only require sparse detection labels for object localization and feature binding.
arXiv Detail & Related papers (2023-09-01T03:34:12Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Transformer-based assignment decision network for multiple object tracking [0.0]
We introduce the Transformer-based Assignment Decision Network (TADN), which tackles data association without the need for explicit optimization during inference.
Our proposed approach outperforms the state-of-the-art in most evaluation metrics despite its simple nature as a tracker.
arXiv Detail & Related papers (2022-08-06T19:47:32Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking [20.66906781151]
In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene.
Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory.
We propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online.
arXiv Detail & Related papers (2021-01-28T18:12:39Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.