GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with
Multi-Feature Learning
- URL: http://arxiv.org/abs/2006.07327v1
- Date: Fri, 12 Jun 2020 17:08:14 GMT
- Title: GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with
Multi-Feature Learning
- Authors: Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani
- Abstract summary: 3D Multi-object tracking (MOT) is crucial to autonomous systems.
We propose two techniques to improve the discriminative feature learning for MOT.
Our proposed method achieves state-of-the-art performance on KITTI and nuScenes 3D MOT benchmarks.
- Score: 30.72094639797806
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Multi-object tracking (MOT) is crucial to autonomous systems. Recent work
uses a standard tracking-by-detection pipeline, where feature extraction is
first performed independently for each object in order to compute an affinity
matrix. Then the affinity matrix is passed to the Hungarian algorithm for data
association. A key process of this standard pipeline is to learn discriminative
features for different objects in order to reduce confusion during data
association. In this work, we propose two techniques to improve the
discriminative feature learning for MOT: (1) instead of obtaining features for
each object independently, we propose a novel feature interaction mechanism by
introducing the Graph Neural Network. As a result, the feature of each object is
informed by the features of the other objects, so that it can move towards
objects with similar features (i.e., objects that probably share the same ID)
and away from objects with dissimilar features (i.e., objects that probably
have different IDs), leading to a more discriminative feature for each object; (2)
instead of obtaining the feature from either 2D or 3D space in prior work, we
propose a novel joint feature extractor to learn appearance and motion features
from 2D and 3D space simultaneously. As features from different modalities
often have complementary information, the joint feature can be more
discriminative than the feature from each individual modality. To ensure that the
joint feature extractor does not heavily rely on one modality, we also propose
an ensemble training paradigm. Through extensive evaluation, our proposed
method achieves state-of-the-art performance on KITTI and nuScenes 3D MOT
benchmarks. Our code will be made available at
https://github.com/xinshuoweng/GNN3DMOT
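The pipeline the abstract describes (per-object features refined by GNN message passing, an affinity matrix, then Hungarian data association) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `gnn_feature_interaction` and `associate` are hypothetical names, and the similarity-weighted residual update stands in for the paper's learned GNN layers and joint 2D/3D feature extractor.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def gnn_feature_interaction(feats, num_layers=2, step=0.3):
    """Toy message passing over a fully connected object graph: each feature
    takes a small step towards a similarity-weighted mix of the other
    features, so features of likely-same-ID objects move together."""
    for _ in range(num_layers):
        sim = feats @ feats.T                  # pairwise similarities (graph edges)
        np.fill_diagonal(sim, -np.inf)         # exclude self-edges
        w = np.exp(sim)
        w /= w.sum(axis=1, keepdims=True)      # softmax attention weights
        feats = feats + step * (w @ feats)     # residual message-passing update
        feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return feats

def associate(track_feats, det_feats):
    """Build an affinity matrix and solve data association with the
    Hungarian algorithm (SciPy's linear_sum_assignment)."""
    affinity = track_feats @ det_feats.T       # cosine affinity (unit features)
    rows, cols = linear_sum_assignment(-affinity)  # maximize total affinity
    return [(int(r), int(c)) for r, c in zip(rows, cols)]

# Toy example: three detections are permuted, slightly noisy copies of three tracks.
rng = np.random.default_rng(0)
tracks = rng.normal(size=(3, 8))
tracks /= np.linalg.norm(tracks, axis=1, keepdims=True)
dets = tracks[[2, 0, 1]] + 0.05 * rng.normal(size=(3, 8))
dets /= np.linalg.norm(dets, axis=1, keepdims=True)
print(associate(tracks, gnn_feature_interaction(dets)))  # [(0, 1), (1, 2), (2, 0)]
```

In the paper the GNN operates jointly over track and detection features from both 2D and 3D modalities; here the interaction step only mixes detection features, just to show the mechanism of attraction between similar features before matching.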
Related papers
- ROAM: Robust and Object-Aware Motion Generation Using Neural Pose
Descriptors [73.26004792375556]
This paper shows that robustness and generalisation to novel scene objects in 3D object-aware character synthesis can be achieved by training a motion model with as few as one reference object.
We leverage an implicit feature representation trained on object-only datasets, which encodes an SE(3)-equivariant descriptor field around the object.
We demonstrate substantial improvements in 3D virtual character motion and interaction quality and robustness to scenarios with unseen objects.
arXiv Detail & Related papers (2023-08-24T17:59:51Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization
Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- Learning Feature Aggregation for Deep 3D Morphable Models [57.1266963015401]
We propose an attention-based module to learn mapping matrices for better feature aggregation across hierarchical levels.
Our experiments show that through the end-to-end training of the mapping matrices, we achieve state-of-the-art results on a variety of 3D shape datasets.
arXiv Detail & Related papers (2021-05-05T16:41:00Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object
Detection [39.64891219500416]
3D object detection methods exploit either voxel-based or point-based features to represent 3D objects in a scene.
We introduce in this paper a novel single-stage 3D detection method having the merit of both voxel-based and point-based features.
arXiv Detail & Related papers (2021-04-02T06:34:49Z)
- SA-Det3D: Self-Attention Based Context-Aware 3D Object Detection [9.924083358178239]
We propose two variants of self-attention for contextual modeling in 3D object detection.
We first incorporate the pairwise self-attention mechanism into the current state-of-the-art BEV, voxel and point-based detectors.
Next, we propose a self-attention variant that samples a subset of the most representative features by learning deformations over randomly sampled locations.
arXiv Detail & Related papers (2021-01-07T18:30:32Z)
- Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking
from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z)
- End-to-End 3D Multi-Object Tracking and Trajectory Forecasting [34.68114553744956]
We propose a unified solution for 3D MOT and trajectory forecasting.
We employ a feature interaction technique by introducing Graph Neural Networks.
We also use a diversity sampling function to improve the quality and diversity of our forecasted trajectories.
arXiv Detail & Related papers (2020-08-25T16:54:46Z)
- Graph Neural Networks for 3D Multi-Object Tracking [28.121708602059048]
3D Multi-object tracking (MOT) is crucial to autonomous systems.
Recent work often uses a tracking-by-detection pipeline.
We propose a novel feature interaction mechanism by introducing Graph Neural Networks.
arXiv Detail & Related papers (2020-08-20T17:55:41Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local
Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences arising from its use.