Joint Object Detection and Multi-Object Tracking with Graph Neural
Networks
- URL: http://arxiv.org/abs/2006.13164v3
- Date: Sat, 3 Apr 2021 13:32:03 GMT
- Title: Joint Object Detection and Multi-Object Tracking with Graph Neural
Networks
- Authors: Yongxin Wang and Kris Kitani and Xinshuo Weng
- Abstract summary: We propose a new instance of a joint MOT approach based on Graph Neural Networks (GNNs).
We demonstrate the effectiveness of our GNN-based joint MOT approach and achieve state-of-the-art performance on both detection and MOT tasks.
- Score: 32.1359455541169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection and data association are critical components in multi-object
tracking (MOT) systems. Despite the fact that the two components are dependent
on each other, prior works often design detection and data association modules
separately, which are trained with separate objectives. As a result, one cannot
back-propagate the gradients and optimize the entire MOT system, which leads to
sub-optimal performance. To address this issue, recent works simultaneously
optimize detection and data association modules under a joint MOT framework,
which has shown improved performance in both modules. In this work, we propose
a new instance of a joint MOT approach based on Graph Neural Networks (GNNs). The
key idea is that GNNs can model relations between variable-sized objects in
both the spatial and temporal domains, which is essential for learning
discriminative features for detection and data association. Through extensive
experiments on the MOT15/16/17/20 datasets, we demonstrate the effectiveness of
our GNN-based joint MOT approach and show state-of-the-art performance for both
detection and MOT tasks. Our code is available at:
https://github.com/yongxinw/GSDT
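The key idea above is stated only at a high level, so here is a minimal, hedged sketch (in PyTorch) of what such a GNN layer could look like: tracked objects from frame t-1 and detection candidates from frame t are treated as graph nodes, messages are passed over the temporal edges between them, and the layer reads out both refined node features (for detection) and pairwise affinities (for data association). For brevity only the temporal edges are modeled here, and all class and variable names are illustrative assumptions, not the authors' GSDT implementation (see the repository above for the actual code).

```python
# Illustrative sketch only -- not the authors' GSDT code.
import torch
import torch.nn as nn


class JointMessagePassingLayer(nn.Module):
    """One message-passing step over track/detection nodes (assumed design)."""

    def __init__(self, dim: int = 128):
        super().__init__()
        # Edge MLP: turns a (track, detection) feature pair into a message.
        self.edge_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        # Node MLP: fuses aggregated messages back into each node feature.
        self.node_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        # Affinity head: scores whether a track/detection pair is the same identity.
        self.affinity = nn.Linear(dim, 1)

    def forward(self, track_feats: torch.Tensor, det_feats: torch.Tensor):
        # track_feats: (T, dim) features of tracked objects from frame t-1
        # det_feats:   (D, dim) features of detection candidates from frame t
        T, D = track_feats.size(0), det_feats.size(0)

        # Build messages on the bipartite (temporal) edges between every track and detection.
        pairs = torch.cat(
            [track_feats.unsqueeze(1).expand(T, D, -1),
             det_feats.unsqueeze(0).expand(T, D, -1)],
            dim=-1,
        )                                   # (T, D, 2*dim)
        messages = self.edge_mlp(pairs)     # (T, D, dim)

        # Aggregate messages per node (mean over the other frame's objects),
        # which naturally handles a variable number of objects per frame.
        track_agg = messages.mean(dim=1)    # (T, dim)
        det_agg = messages.mean(dim=0)      # (D, dim)

        # Update node features: refined detection features can feed a detection head,
        # and the pairwise scores feed the data-association step.
        track_out = self.node_mlp(torch.cat([track_feats, track_agg], dim=-1))
        det_out = self.node_mlp(torch.cat([det_feats, det_agg], dim=-1))
        affinity_matrix = self.affinity(messages).squeeze(-1)  # (T, D) association scores
        return track_out, det_out, affinity_matrix


if __name__ == "__main__":
    layer = JointMessagePassingLayer(dim=128)
    tracks, dets = torch.randn(3, 128), torch.randn(5, 128)
    new_tracks, new_dets, aff = layer(tracks, dets)
    print(new_tracks.shape, new_dets.shape, aff.shape)
```

In a full tracker the affinity matrix would typically be converted into assignments by a bipartite matching step (e.g., scipy.optimize.linear_sum_assignment), and the refined detection features would feed the detection head; both of these choices are assumptions made for illustration rather than details taken from the paper.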
Related papers
- STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT).
We use historical embedding features to model the representations of ReID and detection features in sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z)
- UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance [6.577227592760559]
UnsMOT is a novel framework that combines appearance and motion features of objects with geometric information to provide more accurate tracking.
Experimental results show remarkable performance in terms of HOTA, IDF1, and MOTA metrics in comparison with state-of-the-art methods.
arXiv Detail & Related papers (2023-09-03T04:58:12Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- NL-FCOS: Improving FCOS through Non-Local Modules for Object Detection [0.0]
We show that non-local modules combined with an FCOS head (NL-FCOS) are practical and efficient.
We establish state-of-the-art performance in clothing detection and handwritten amount recognition problems.
arXiv Detail & Related papers (2022-03-29T15:00:14Z)
- Online Multiple Object Tracking with Cross-Task Synergy [120.70085565030628]
We propose a novel unified model with synergy between position prediction and embedding association.
The two tasks are linked by temporal-aware target attention and distractor attention, as well as an identity-aware memory aggregation model.
arXiv Detail & Related papers (2021-04-01T10:19:40Z)
- Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking [94.24393546459424]
We introduce Deep Motion Modeling Network (DMM-Net) that can estimate multiple objects' motion parameters to perform joint detection and association.
DMM-Net achieves a PR-MOTA score of 12.80 at 120+ fps on the popular UA-DETRAC challenge, which is both more accurate and orders of magnitude faster.
We also contribute a synthetic large-scale public dataset Omni-MOT for vehicle tracking that provides precise ground-truth annotations.
arXiv Detail & Related papers (2020-08-20T08:05:33Z)
- Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection [91.43066633305662]
A central question in RGB-D salient object detection (SOD) is how to better integrate and utilize cross-modal fusion information.
In this paper, we explore these issues from a new perspective.
We implement a more flexible and efficient form of multi-scale cross-modal feature processing.
arXiv Detail & Related papers (2020-07-13T07:59:55Z)
- A Unified Object Motion and Affinity Model for Online Multi-Object Tracking [127.5229859255719]
We propose a novel MOT framework, named UMA, that unifies the object motion and affinity models into a single network.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.