Discriminative Appearance Modeling with Multi-track Pooling for
Real-time Multi-object Tracking
- URL: http://arxiv.org/abs/2101.12159v1
- Date: Thu, 28 Jan 2021 18:12:39 GMT
- Title: Discriminative Appearance Modeling with Multi-track Pooling for
Real-time Multi-object Tracking
- Authors: Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg
- Abstract summary: In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene.
Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory.
We propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online.
- Score: 20.66906781151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-object tracking, the tracker maintains in its memory the appearance
and motion information for each object in the scene. This memory is utilized
for finding matches between tracks and detections and is updated based on the
matching result. Many approaches model each target in isolation and lack the
ability to use all the targets in the scene to jointly update the memory. This
can be problematic when there are similar looking objects in the scene. In this
paper, we solve the problem of simultaneously considering all tracks during
memory updating, with only a small spatial overhead, via a novel multi-track
pooling module. We additionally propose a training strategy adapted to
multi-track pooling which generates hard tracking episodes online. We show that
the combination of these innovations results in a strong discriminative
appearance model, enabling the use of greedy data association to achieve online
tracking performance. Our experiments demonstrate real-time, state-of-the-art
performance on public multi-object tracking (MOT) datasets.
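The abstract describes two mechanisms: pooling appearance information across all tracks so each track's memory is updated jointly, and greedy data association for online matching. The sketch below is a minimal, hypothetical illustration of both ideas under simple assumptions (fixed-size per-track embeddings, a precomputed track-detection score matrix); the function names and max-pooling choice are illustrative, not the paper's actual architecture.

```python
import numpy as np

def pool_other_tracks(track_feats: np.ndarray) -> np.ndarray:
    """For each track, max-pool the appearance embeddings of all *other*
    tracks, giving a scene-level context vector per track (a stand-in for
    the paper's multi-track pooling idea)."""
    n = track_feats.shape[0]
    pooled = np.empty_like(track_feats)
    for i in range(n):
        others = np.delete(track_feats, i, axis=0)
        # With a single track there is nothing to pool against.
        pooled[i] = others.max(axis=0) if len(others) else np.zeros_like(track_feats[i])
    return pooled

def match_greedy(scores: np.ndarray, threshold: float = 0.5):
    """Greedy data association: repeatedly take the highest-scoring
    (track, detection) pair until no remaining pair exceeds the
    threshold; each track and detection is matched at most once."""
    scores = scores.astype(float).copy()
    matches = []
    while scores.size and scores.max() > threshold:
        i, j = np.unravel_index(scores.argmax(), scores.shape)
        matches.append((int(i), int(j)))
        scores[i, :] = -np.inf  # track i is taken
        scores[:, j] = -np.inf  # detection j is taken
    return matches
```

In the paper's framing, the pooled context lets the matching score for a track account for other, similar-looking targets in the scene; greedy association then avoids the cost of an optimal assignment solver, which is part of what enables real-time operation.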
Related papers
- Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking [55.13878429987136]
We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
arXiv Detail & Related papers (2023-11-17T08:17:49Z)
- ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking [11.619493960418176]
Multi-Camera Multi-Object Tracking (MC-MOT) utilizes information from multiple views to better handle problems with occlusion and crowded scenes.
Current graph-based methods do not effectively utilize information regarding spatial and temporal consistency.
We propose a novel reconfigurable graph model that first associates all detected objects across cameras spatially before reconfiguring it into a temporal graph.
arXiv Detail & Related papers (2023-08-25T08:02:04Z)
- DIVOTrack: A Novel Dataset and Baseline Method for Cross-View Multi-Object Tracking in DIVerse Open Scenes [74.64897845999677]
We introduce a new cross-view multi-object tracking dataset for DIVerse Open scenes with densely tracked pedestrians.
Our DIVOTrack has fifteen distinct scenarios and 953 cross-view tracks, surpassing all cross-view multi-object tracking datasets currently available.
Furthermore, we provide a novel baseline cross-view tracking method with a unified joint detection and cross-view tracking framework named CrossMOT.
arXiv Detail & Related papers (2023-02-15T14:10:42Z)
- DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion [56.1428110894411]
We propose a large-scale dataset for multi-human tracking, where humans have similar appearance, diverse motion and extreme articulation.
As the dataset contains mostly group dancing videos, we name it "DanceTrack".
We benchmark several state-of-the-art trackers on our dataset and observe a significant performance drop on DanceTrack when compared against existing benchmarks.
arXiv Detail & Related papers (2021-11-29T16:49:06Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and to name them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.