Object-Centric Multiple Object Tracking
- URL: http://arxiv.org/abs/2309.00233v2
- Date: Tue, 5 Sep 2023 07:22:43 GMT
- Title: Object-Centric Multiple Object Tracking
- Authors: Zixu Zhao, Jiaze Wang, Max Horn, Yizhuo Ding, Tong He, Zechen Bai,
Dominik Zietlow, Carl-Johann Simon-Gabriel, Bing Shuai, Zhuowen Tu, Thomas
Brox, Bernt Schiele, Yanwei Fu, Francesco Locatello, Zheng Zhang, Tianjun
Xiao
- Abstract summary: This paper proposes a video object-centric model for multiple-object tracking pipelines.
It consists of an index-merge module that adapts the object-centric slots into detection outputs and an object memory module.
Benefited from object-centric learning, we only require sparse detection labels for object localization and feature binding.
- Score: 124.30650395969126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised object-centric learning methods allow the partitioning of scenes
into entities without additional localization information and are excellent
candidates for reducing the annotation burden of multiple-object tracking (MOT)
pipelines. Unfortunately, they lack two key properties: objects are often split
into parts and are not consistently tracked over time. In fact,
state-of-the-art models achieve pixel-level accuracy and temporal consistency
by relying on supervised object detection with additional ID labels for the
association through time. This paper proposes a video object-centric model for
MOT. It consists of an index-merge module that adapts the object-centric slots
into detection outputs and an object memory module that builds complete object
prototypes to handle occlusions. Benefited from object-centric learning, we
only require sparse detection labels (0%-6.25%) for object localization and
feature binding. Relying on our self-supervised
Expectation-Maximization-inspired loss for object association, our approach
requires no ID labels. Our experiments significantly narrow the gap between the
existing object-centric model and the fully supervised state-of-the-art and
outperform several unsupervised trackers.
Related papers
- Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors [16.84474849409625]
We propose a framework for consistently producing high-quality object tracks.
The key idea is to tailor a module for each dataset to intelligently decide when an object tracker is failing.
Our approach leverages self-supervised learning on unlabeled videos to learn a tailored representation for a target object.
arXiv Detail & Related papers (2024-05-06T17:06:32Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - Image Segmentation-based Unsupervised Multiple Objects Discovery [1.7674345486888503]
Unsupervised object discovery aims to localize objects in images.
We propose a fully unsupervised, bottom-up approach, for multiple objects discovery.
We provide state-of-the-art results for both unsupervised class-agnostic object detection and unsupervised image segmentation.
arXiv Detail & Related papers (2022-12-20T09:48:24Z) - 4D Unsupervised Object Discovery [53.561750858325915]
We propose 4D unsupervised object discovery, jointly discovering objects from 4D data -- 3D point clouds and 2D RGB images with temporal information.
We present the first practical approach for this task by proposing a ClusterNet on 3D point clouds, which is jointly optimized with a 2D localization network.
arXiv Detail & Related papers (2022-10-10T16:05:53Z) - Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z) - Scribble-based Boundary-aware Network for Weakly Supervised Salient
Object Detection in Remote Sensing Images [10.628932392896374]
We propose a novel weakly-supervised salient object detection framework to predict the saliency of remote sensing images from sparse scribble annotations.
Specifically, we design a boundary-aware module (BAM) to explore the object boundary semantics, which is explicitly supervised by the high-confidence object boundary (pseudo) labels.
Then, the boundary semantics are integrated with high-level features to guide the salient object detection under the supervision of scribble labels.
arXiv Detail & Related papers (2022-02-07T20:32:21Z) - Online Multi-Object Tracking with Unsupervised Re-Identification
Learning and Occlusion Estimation [80.38553821508162]
Occlusion between different objects is a typical challenge in Multi-Object Tracking (MOT)
In this paper, we focus on online multi-object tracking and design two novel modules to handle these problems.
The proposed unsupervised re-identification learning module does not require any (pseudo) identity information nor suffer from the scalability issue.
Our study shows that, when applied to state-of-the-art MOT methods, the proposed unsupervised re-identification learning is comparable to supervised re-identification learning.
arXiv Detail & Related papers (2022-01-04T18:59:58Z) - Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on KITTI, and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z) - IA-MOT: Instance-Aware Multi-Object Tracking with Motion Consistency [40.354708148590696]
"instance-aware MOT" (IA-MOT) can track multiple objects in either static or moving cameras.
Our proposed method won the first place in Track 3 of the BMTT Challenge in CVPR 2020 workshops.
arXiv Detail & Related papers (2020-06-24T03:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.