MotionTrack: End-to-End Transformer-based Multi-Object Tracking with
LiDAR-Camera Fusion
- URL: http://arxiv.org/abs/2306.17000v1
- Date: Thu, 29 Jun 2023 15:00:12 GMT
- Title: MotionTrack: End-to-End Transformer-based Multi-Object Tracking with
LiDAR-Camera Fusion
- Authors: Ce Zhang, Chengjie Zhang, Yiluan Guo, Lingji Chen, Michael Happold
- Abstract summary: We propose an end-to-end transformer-based MOT algorithm (MotionTrack) with multi-modality sensor inputs to track objects with multiple classes.
The MotionTrack and its variations achieve better results (AMOTA score at 0.55) on the nuScenes dataset compared with other classical baseline models.
- Score: 13.125168307241765
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Multiple Object Tracking (MOT) is crucial to autonomous vehicle perception.
End-to-end transformer-based algorithms, which detect and track objects
simultaneously, show great potential for the MOT task. However, most existing
methods focus on image-based tracking with a single object category. In this
paper, we propose an end-to-end transformer-based MOT algorithm (MotionTrack)
with multi-modality sensor inputs to track objects with multiple classes. Our
objective is to establish a transformer baseline for the MOT in an autonomous
driving environment. The proposed algorithm consists of a transformer-based
data association (DA) module and a transformer-based query enhancement module
to achieve MOT and Multiple Object Detection (MOD) simultaneously. The
MotionTrack and its variations achieve better results (AMOTA of 0.55) on
the nuScenes dataset than classical baseline models such as AB3DMOT,
CenterTrack, and the probabilistic 3D Kalman filter. In addition, we show
that a modified attention mechanism can be utilized for DA to accomplish
MOT, and that aggregating history features enhances MOD performance.
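The abstract describes the DA module only as "a modified attention mechanism", without giving its equations. As a rough illustration of the general idea (not the authors' implementation), track queries carried over from previous frames can be scored against current-frame detection features with scaled dot-product attention, then matched greedily. The function name, the greedy matcher, and all parameters below are illustrative assumptions:

```python
import numpy as np

def attention_data_association(track_queries, det_features):
    """Score track-detection pairs with scaled dot-product attention
    and greedily assign each track its best-matching detection.

    track_queries: (T, d) embeddings carried over from previous frames.
    det_features:  (D, d) embeddings of current-frame detections.
    Returns a list of (track_idx, det_idx) matches.
    """
    d = track_queries.shape[1]
    # Scaled dot-product affinity, as in standard transformer attention.
    logits = track_queries @ det_features.T / np.sqrt(d)
    # Row-wise softmax turns logits into per-track assignment probabilities.
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)

    matches, used = [], set()
    # Greedy matching: highest-probability pairs first, one detection per track.
    for t, dmax in sorted(enumerate(probs.argmax(axis=1)),
                          key=lambda td: -probs[td[0], td[1]]):
        if dmax not in used:
            matches.append((t, dmax))
            used.add(dmax)
    return matches
```

In practice end-to-end trackers often replace the greedy step with an optimal assignment (e.g. Hungarian matching), but the affinity computation is the attention-style core.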
Related papers
- Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving [0.764971671709743]
The proposed MOT algorithm comprises a three-step association process, an Extended Kalman filter for estimating the motion of each detected dynamic obstacle, and a track management phase.
Unlike most state-of-the-art multi-modal MOT approaches, the proposed algorithm does not rely on maps or knowledge of the ego global pose.
The algorithm is validated both in simulation and with real-world data, with satisfactory results.
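The summary above mentions an Extended Kalman filter for motion estimation but gives no equations. For intuition only, a single predict/update cycle of the linear constant-velocity special case (which the EKF generalizes to nonlinear models via local linearization) can be sketched as follows; the state layout and all noise parameters are assumptions, not taken from the paper:

```python
import numpy as np

def kf_predict_update(x, P, z, dt=0.1, q=0.1, r=0.5):
    """One predict/update cycle of a constant-velocity Kalman filter
    on a 1D track. State x = [position, velocity]; z = measured position."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity motion model
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement.
    y = z - H @ x                           # innovation
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    x = x + K @ y
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

An EKF replaces F and H with the Jacobians of nonlinear motion and measurement functions evaluated at the current estimate.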
arXiv Detail & Related papers (2024-03-06T23:49:16Z) - Strong-TransCenter: Improved Multi-Object Tracking based on Transformers
with Dense Representations [1.2891210250935146]
TransCenter is a transformer-based MOT architecture with dense object queries for accurately tracking all the objects.
This paper improves that tracker with a post-processing mechanism based on the Track-by-Detection paradigm.
Our new tracker shows significant improvements in the IDF1 and HOTA metrics and comparable results on the MOTA metric.
arXiv Detail & Related papers (2022-10-24T19:47:58Z) - InterTrack: Interaction Transformer for 3D Multi-Object Tracking [9.283656931246645]
3D multi-object tracking (MOT) is a key problem for autonomous vehicles.
Our proposed solution, InterTrack, generates discriminative object representations for data association.
We validate our approach on the nuScenes 3D MOT benchmark, where we observe significant improvements.
arXiv Detail & Related papers (2022-08-17T03:24:36Z) - Joint Spatial-Temporal and Appearance Modeling with Transformer for
Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z) - Unified Transformer Tracker for Object Tracking [58.65901124158068]
We present the Unified Transformer Tracker (UTT) to address tracking problems in different scenarios with one paradigm.
A track transformer is developed in our UTT to track the target in both Single Object Tracking (SOT) and Multiple Object Tracking (MOT).
arXiv Detail & Related papers (2022-03-29T01:38:49Z) - Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality
Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT)
It takes both 2D panorama images and 3D point clouds as input and then infers target trajectories using the multimodality data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z) - TransMOT: Spatial-Temporal Graph Transformer for Multiple Object
Tracking [74.82415271960315]
We propose a solution named TransMOT to efficiently model the spatial and temporal interactions among objects in a video.
TransMOT is not only more computationally efficient than the traditional Transformer, but it also achieves better tracking accuracy.
The proposed method is evaluated on multiple benchmark datasets including MOT15, MOT16, MOT17, and MOT20.
arXiv Detail & Related papers (2021-04-01T01:49:05Z) - Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous
Driving [22.693895321632507]
We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules.
We show that our method outperforms current state-of-the-art on the NuScenes Tracking dataset.
arXiv Detail & Related papers (2020-12-26T15:00:54Z) - Simultaneous Detection and Tracking with Motion Modelling for Multiple
Object Tracking [94.24393546459424]
We introduce Deep Motion Modeling Network (DMM-Net) that can estimate multiple objects' motion parameters to perform joint detection and association.
DMM-Net achieves a PR-MOTA score of 12.80 at 120+ fps on the popular UA-DETRAC challenge, delivering better accuracy while running orders of magnitude faster.
We also contribute a synthetic large-scale public dataset Omni-MOT for vehicle tracking that provides precise ground-truth annotations.
arXiv Detail & Related papers (2020-08-20T08:05:33Z) - Dense Scene Multiple Object Tracking with Box-Plane Matching [73.54369833671772]
Multiple Object Tracking (MOT) is an important task in computer vision.
We propose the Box-Plane Matching (BPM) method to improve MOT performance in dense scenes.
With the effectiveness of the three modules, our team achieves the 1st place on the Track-1 leaderboard in the ACM MM Grand Challenge HiEve 2020.
arXiv Detail & Related papers (2020-07-30T16:39:22Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the accuracy of this information and accepts no responsibility for any consequences of its use.