CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object
Tracking with Camera-LiDAR Fusion
- URL: http://arxiv.org/abs/2209.02540v2
- Date: Wed, 7 Sep 2022 02:43:11 GMT
- Title: CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object
Tracking with Camera-LiDAR Fusion
- Authors: Li Wang, Xinyu Zhang, Wenyuan Qin, Xiaoyu Li, Lei Yang, Zhiwei Li, Lei
Zhu, Hong Wang, Jun Li, and Huaping Liu
- Abstract summary: 3D Multi-object tracking (MOT) ensures consistency during continuous dynamic detection.
It can be challenging to accurately track the irregular motion of objects for LiDAR-based methods.
We propose a novel camera-LiDAR fusion 3D MOT framework based on the Combined Appearance-Motion Optimization (CAMO-MOT).
- Score: 34.42289908350286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D Multi-object tracking (MOT) ensures consistency during continuous dynamic
detection, conducive to subsequent motion planning and navigation tasks in
autonomous driving. However, camera-based methods suffer in the case of
occlusions and it can be challenging to accurately track the irregular motion
of objects for LiDAR-based methods. Some fusion methods work well but do not
consider the unreliability of appearance features under occlusion. At the
same time, the false detection problem also significantly affects tracking. As
such, we propose a novel camera-LiDAR fusion 3D MOT framework based on the
Combined Appearance-Motion Optimization (CAMO-MOT), which uses both camera and
LiDAR data and significantly reduces tracking failures caused by occlusion and
false detection. For occlusion problems, we are the first to propose an
occlusion head to select the best object appearance features multiple times
effectively, reducing the influence of occlusions. To decrease the impact of
false detection in tracking, we design a motion cost matrix based on confidence
scores, which improves the positioning and object prediction accuracy in 3D
space. As existing multi-object tracking methods only consider a single
category, we also propose to build a multi-category loss to implement
multi-object tracking in multi-category scenes. A series of validation
experiments are conducted on the KITTI and nuScenes tracking benchmarks. Our
proposed method achieves state-of-the-art performance and the lowest identity
switches (IDS) value (23 for Car and 137 for Pedestrian) among all multi-modal
MOT methods on the KITTI test dataset. Our proposed method also achieves
state-of-the-art performance among all algorithms on the nuScenes test dataset
with 75.3% AMOTA.
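The abstract does not give the form of the confidence-based motion cost matrix, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a cost equal to the 3D center distance plus a penalty for low detection confidence, and it uses a simple greedy matcher in place of whatever assignment step the paper uses. The names `motion_cost_matrix`, `greedy_match`, `alpha`, and `max_cost` are hypothetical.

```python
import math


def motion_cost_matrix(tracks, detections, alpha=1.0):
    """Build a track-by-detection cost matrix.

    tracks: list of predicted (x, y, z) centers.
    detections: list of ((x, y, z), score) with score in [0, 1].
    alpha: hypothetical weight of the confidence penalty (an assumption).
    """
    costs = []
    for track_center in tracks:
        row = []
        for det_center, score in detections:
            distance = math.dist(track_center, det_center)
            # Assumed weighting: low-confidence detections are costlier to match.
            row.append(distance + alpha * (1.0 - score))
        costs.append(row)
    return costs


def greedy_match(costs, max_cost):
    """Greedy association: take the cheapest remaining pair under max_cost."""
    candidates = sorted(
        (cost, i, j)
        for i, row in enumerate(costs)
        for j, cost in enumerate(row)
        if cost <= max_cost
    )
    used_tracks, used_dets, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_tracks and j not in used_dets:
            matches.append((i, j))
            used_tracks.add(i)
            used_dets.add(j)
    return sorted(matches)


# Two tracks; detection 1 is slightly closer to track 0 but has a low score.
tracks = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
detections = [((0.5, 0.0, 0.0), 0.9), ((0.4, 0.0, 0.0), 0.2), ((10.2, 0.0, 0.0), 0.8)]

with_confidence = greedy_match(motion_cost_matrix(tracks, detections), max_cost=2.0)
distance_only = greedy_match(motion_cost_matrix(tracks, detections, alpha=0.0), max_cost=2.0)
```

In this toy case the confidence term steers track 0 toward the high-score detection 0, whereas a distance-only cost (`alpha=0.0`) would pick the nearer but low-score detection 1, which is the kind of likely false detection the abstract says the cost matrix is meant to suppress.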
Related papers
- ConsistencyTrack: A Robust Multi-Object Tracker with a Generation Strategy of Consistency Model [20.259334882471574]
Multi-object tracking (MOT) is a critical technology in computer vision, designed to detect multiple targets in video sequences and assign each target a unique ID per frame.
Existing MOT methods excel at accurately tracking multiple objects in real-time across various scenarios.
We propose ConsistencyTrack, a novel joint detection and tracking (JDT) framework that formulates detection and association as a denoising diffusion process on bounding boxes.
arXiv Detail & Related papers (2024-08-28T05:53:30Z) - You Only Need Two Detectors to Achieve Multi-Modal 3D Multi-Object Tracking [9.20064374262956]
The proposed framework can achieve robust tracking by using only a 2D detector and a 3D detector.
It is proven more accurate than many of the state-of-the-art TBD-based multi-modal tracking methods.
arXiv Detail & Related papers (2023-04-18T02:45:18Z) - ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every
Detection Box [81.45219802386444]
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects across video frames.
We propose a hierarchical data association strategy to mine the true objects in low-score detection boxes.
In 3D scenarios, it is much easier for the tracker to predict object velocities in the world coordinate.
arXiv Detail & Related papers (2023-03-27T15:35:21Z) - An Effective Motion-Centric Paradigm for 3D Single Object Tracking in
Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z) - DetFlowTrack: 3D Multi-object Tracking based on Simultaneous
Optimization of Object Detection and Scene Flow Estimation [23.305159598648924]
We propose a 3D MOT framework based on simultaneous optimization of object detection and scene flow estimation.
For more accurate scene flow labels, especially in the case of motion with rotation, a box-transformation-based scene flow ground truth calculation method is proposed.
Experimental results on the KITTI MOT dataset show competitive results against state-of-the-art methods and robustness under extreme motion with rotation.
arXiv Detail & Related papers (2022-03-04T07:06:47Z) - DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on
Camera-LiDAR Fusion with Deep Association [8.34219107351442]
This paper proposes a robust camera-LiDAR fusion-based MOT method that achieves a good trade-off between accuracy and speed.
Our proposed method presents obvious advantages over the state-of-the-art MOT methods in terms of both tracking accuracy and processing speed.
arXiv Detail & Related papers (2022-02-24T13:36:29Z) - Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z) - PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first depth estimation is performed, a pseudo LiDAR point cloud representation is computed from the depth estimates, and then object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z) - Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking
from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z) - Tracking Road Users using Constraint Programming [79.32806233778511]
We present a constraint programming (CP) approach for the data association phase found in the tracking-by-detection paradigm of the multiple object tracking (MOT) problem.
Our proposed method was tested on a motorized vehicles tracking dataset and produces results that outperform the top methods of the UA-DETRAC benchmark.
arXiv Detail & Related papers (2020-03-10T00:04:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.