Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation
- URL: http://arxiv.org/abs/2011.12850v1
- Date: Wed, 25 Nov 2020 16:14:40 GMT
- Title: Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation
- Authors: Can Chen, Luca Zanotti Fragonara and Antonios Tsourdos
- Abstract summary: 3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
- Score: 8.854112907350624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous systems need to localize and track surrounding objects in 3D space
for safe motion planning. As a result, 3D multi-object tracking (MOT) plays a
vital role in autonomous navigation. Most MOT methods use a
tracking-by-detection pipeline, which includes object detection and data
association processing. However, many approaches detect objects in 2D RGB
sequences for tracking, which lacks reliability when localizing objects in
3D space. Furthermore, it is still challenging to learn discriminative features
for temporally-consistent detection in different frames, and the affinity
matrix is normally learned from independent object features without considering
the feature interaction between detected objects in the different frames. To
address these problems, we first employ a joint feature extractor to fuse the
2D and 3D appearance features captured from both 2D RGB images and 3D point
clouds respectively, and then propose a novel convolutional operation, named
RelationConv, to better exploit the correlation between each pair of objects in
the adjacent frames, and learn a deep affinity matrix for further data
association. Finally, we provide an extensive evaluation showing that our
proposed model achieves state-of-the-art performance on the KITTI tracking
benchmark.
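The tracking-by-detection association step the abstract describes (learn an affinity matrix between detections in adjacent frames, then match) can be sketched as follows. This is a minimal illustration, not the paper's method: Relation3DMOT learns the affinity with RelationConv over fused 2D/3D features, whereas this sketch substitutes cosine similarity between plain feature vectors and a greedy assignment; the function names and the 0.5 threshold are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def associate(prev_feats, curr_feats, threshold=0.5):
    """Greedy data association from a pairwise affinity matrix.

    prev_feats: feature vectors of tracked objects in frame t-1
    curr_feats: feature vectors of detections in frame t
    Returns a list of (prev_index, curr_index) matches.
    """
    # Affinity matrix: one similarity score per (track, detection) pair.
    # (The paper learns these scores with RelationConv; cosine similarity
    # is a stand-in here.)
    affinity = [[cosine(p, c) for c in curr_feats] for p in prev_feats]

    # Greedily commit the highest-affinity pairs first; each track and each
    # detection is used at most once. Unmatched detections would start new
    # tracks in a full tracker.
    pairs = sorted(
        ((affinity[i][j], i, j)
         for i in range(len(prev_feats))
         for j in range(len(curr_feats))),
        reverse=True)
    used_prev, used_curr, matches = set(), set(), []
    for score, i, j in pairs:
        if score < threshold:
            break
        if i in used_prev or j in used_curr:
            continue
        matches.append((i, j))
        used_prev.add(i)
        used_curr.add(j)
    return matches
```

In practice the assignment is usually solved optimally (e.g. with the Hungarian algorithm) rather than greedily; the key idea shared with the paper is that association quality depends on how discriminative the affinity matrix is.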
Related papers
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- TripletTrack: 3D Object Tracking using Triplet Embeddings and LSTM [0.0]
3D object tracking is a critical task in autonomous driving systems.
In this paper we investigate the use of triplet embeddings in combination with motion representations for 3D object tracking.
arXiv Detail & Related papers (2022-10-28T15:23:50Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving [3.8073142980733]
We propose jointly training 3D detection and 3D tracking from only monocular videos in an end-to-end manner.
Time3D achieves 21.4% AMOTA, 13.6% AMOTP on the nuScenes 3D tracking benchmark, surpassing all published competitors.
arXiv Detail & Related papers (2022-05-30T06:41:10Z)
- Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information.
Our method outperforms the other state-of-the-art approaches by a large margin on the KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z)
- Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding [61.57847727651068]
Temporal sentence grounding aims to localize a target segment in an untrimmed video semantically according to a given sentence query.
Most previous works focus on learning frame-level features of each whole frame in the entire video, and directly match them with the textual information.
We propose a novel Motion- and Appearance-guided 3D Semantic Reasoning Network (MA3SRN), which incorporates optical-flow-guided motion-aware, detection-based appearance-aware, and 3D-aware object-level features.
arXiv Detail & Related papers (2022-03-06T13:57:09Z)
- M3DSSD: Monocular 3D Single Stage Object Detector [82.25793227026443]
We propose a Monocular 3D Single Stage object Detector (M3DSSD) with feature alignment and asymmetric non-local attention.
The proposed M3DSSD achieves significantly better performance than the monocular 3D object detection methods on the KITTI dataset.
arXiv Detail & Related papers (2021-03-24T13:09:11Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- A two-stage data association approach for 3D Multi-object Tracking [0.0]
We adapt a two-stage data association method, which was successful in image-based tracking, to the 3D setting.
Our method outperforms the baseline that uses one-stage bipartite matching for data association, achieving 0.587 AMOTA on the nuScenes validation set.
arXiv Detail & Related papers (2021-01-21T15:50:17Z)
- Graph Neural Networks for 3D Multi-Object Tracking [28.121708602059048]
3D Multi-object tracking (MOT) is crucial to autonomous systems.
Recent work often uses a tracking-by-detection pipeline.
We propose a novel feature interaction mechanism by introducing Graph Neural Networks.
arXiv Detail & Related papers (2020-08-20T17:55:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.