CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking
- URL: http://arxiv.org/abs/2107.05150v1
- Date: Sun, 11 Jul 2021 23:56:53 GMT
- Title: CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking
- Authors: Ramin Nabati, Landon Harris, Hairong Qi
- Abstract summary: We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion.
Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association.
We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
- Score: 9.62721286522053
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D multi-object tracking is a crucial component in the perception system of
autonomous driving vehicles. Tracking all dynamic objects around the vehicle is
essential for tasks such as obstacle avoidance and path planning. Autonomous
vehicles are usually equipped with different sensor modalities to improve
accuracy and reliability. While sensor fusion has been widely used in object
detection networks in recent years, most existing multi-object tracking
algorithms either rely on a single input modality, or do not fully exploit the
information provided by multiple sensing modalities. In this work, we propose
an end-to-end network for joint object detection and tracking based on radar
and camera sensor fusion. Our proposed method uses a center-based radar-camera
fusion algorithm for object detection and utilizes a greedy algorithm for
object association. The proposed greedy algorithm uses the depth, velocity and
2D displacement of the detected objects to associate them through time. This
makes our tracking algorithm robust to occluded and overlapping objects, as
the depth and velocity information helps the network distinguish between
them. We evaluate our method on the challenging nuScenes dataset, where it
achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the
benchmark, as well as the baseline LiDAR-based method. Our method is online
with a runtime of 35 ms per image, making it well suited for autonomous
driving applications.
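To make the association step described above concrete, the following is a minimal sketch of a greedy matching procedure driven by depth, velocity, and 2D displacement. It is a hypothetical reconstruction rather than the authors' released code; the cost weights and the gating threshold max_cost are assumed for illustration.

```python
import numpy as np

def greedy_associate(tracks, detections, w_disp=1.0, w_depth=0.5,
                     w_vel=0.5, max_cost=2.0):
    """Greedily match tracks to detections by ascending fused cost.

    Each track/detection is a dict with 'center' (2D image position),
    'depth' (meters), and 'velocity' (m/s), all as numpy arrays or
    scalars. The weights and gating threshold are illustrative
    assumptions, not values from the paper.
    """
    pairs = []
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            cost = (w_disp * np.linalg.norm(t['center'] - d['center'])
                    + w_depth * abs(t['depth'] - d['depth'])
                    + w_vel * np.linalg.norm(t['velocity'] - d['velocity']))
            pairs.append((cost, i, j))
    pairs.sort(key=lambda p: p[0])  # cheapest candidate matches first

    matches, used_t, used_d = [], set(), set()
    for cost, i, j in pairs:
        if cost > max_cost:
            break  # remaining candidates are all gated out
        if i not in used_t and j not in used_d:
            matches.append((i, j))
            used_t.add(i)
            used_d.add(j)
    return matches
```

Because depth and velocity enter the cost alongside image-plane displacement, two objects that overlap in the image but lie at different depths or move at different speeds remain separable, which is the source of the robustness to occlusion claimed above.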
Related papers
- Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System [0.0]
We propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems.
Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance.
Our results show that the proposed approach achieves superior performance over single-sensor solutions and competes directly with other state-of-the-art fusion methods.
arXiv Detail & Related papers (2024-04-25T12:04:31Z)
- LISO: Lidar-only Self-Supervised 3D Object Detection [25.420879730860936]
We introduce a novel self-supervised method for training state-of-the-art lidar object detection networks on unlabeled sequences of lidar point clouds alone.
Under the hood, it uses a state-of-the-art self-supervised lidar scene flow network to generate, track, and iteratively refine pseudo ground truth.
arXiv Detail & Related papers (2024-03-11T18:02:52Z)
- Multi-Object Tracking with Camera-LiDAR Fusion for Autonomous Driving [0.764971671709743]
The proposed MOT algorithm comprises a three-step association process, an Extended Kalman filter for estimating the motion of each detected dynamic obstacle, and a track management phase (a minimal filter sketch appears after this list).
Unlike most state-of-the-art multi-modal MOT approaches, the proposed algorithm does not rely on maps or knowledge of the ego global pose.
The algorithm is validated both in simulation and with real-world data, with satisfactory results.
arXiv Detail & Related papers (2024-03-06T23:49:16Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, fusion for detection can be performed effectively by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion with Deep Association [8.34219107351442]
This paper proposes a robust camera-LiDAR fusion-based MOT method that achieves a good trade-off between accuracy and speed.
Our proposed method shows clear advantages over state-of-the-art MOT methods in terms of both tracking accuracy and processing speed.
arXiv Detail & Related papers (2022-02-24T13:36:29Z)
- High-level camera-LiDAR fusion for 3D object detection with machine learning [0.0]
This paper tackles the 3D object detection problem, which is of vital importance for applications such as autonomous driving.
It uses a Machine Learning pipeline on a combination of monocular camera and LiDAR data to detect vehicles in the surrounding 3D space of a moving platform.
Our results demonstrate an efficient and accurate inference on a validation set, achieving an overall accuracy of 87.1%.
arXiv Detail & Related papers (2021-05-24T01:57:34Z)
- EagerMOT: 3D Multi-Object Tracking via Sensor Fusion [68.8204255655161]
Multi-object tracking (MOT) enables mobile robots to perform well-informed motion planning and navigation by localizing surrounding objects in 3D space and time.
Existing methods rely on depth sensors (e.g., LiDAR) to detect and track targets in 3D space, but only up to a limited sensing range due to the sparsity of the signal.
We propose EagerMOT, a simple tracking formulation that integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics.
arXiv Detail & Related papers (2021-04-29T22:30:29Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a fully trainable Neural Message Passing network for data association.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Towards Autonomous Driving: a Multi-Modal 360$^{\circ}$ Perception Proposal [87.11988786121447]
This paper presents a framework for 3D object detection and tracking for autonomous vehicles.
The solution, based on a novel sensor fusion configuration, provides accurate and reliable road environment detection.
A variety of tests of the system, deployed in an autonomous vehicle, have successfully assessed the suitability of the proposed perception stack.
arXiv Detail & Related papers (2020-08-21T20:36:21Z)
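The Extended Kalman filter referenced in the Camera-LiDAR Fusion entry above is a standard ingredient in these trackers. Below is a minimal sketch of a constant-velocity filter of that kind; the state layout, time step, and noise magnitudes are illustrative assumptions, not values from any paper in this list. With a constant-velocity motion model the EKF reduces to a linear Kalman filter, which is what the sketch implements.

```python
import numpy as np

class ConstantVelocityKF:
    """Kalman filter with a constant-velocity motion model.

    State is [x, y, z, vx, vy, vz]. Noise magnitudes and the time
    step are illustrative assumptions, not values from the papers.
    """

    def __init__(self, xyz, dt=0.1):
        self.x = np.hstack([xyz, np.zeros(3)])             # state mean
        self.P = np.eye(6)                                 # state covariance
        self.F = np.eye(6)                                 # transition model
        self.F[:3, 3:] = dt * np.eye(3)                    # position += velocity * dt
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.Q = 0.01 * np.eye(6)                          # process noise (assumed)
        self.R = 0.1 * np.eye(3)                           # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]  # predicted position, usable for association gating

    def update(self, z):
        y = z - self.H @ self.x                        # innovation
        S = self.H @ self.P @ self.H.T + self.R        # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
```

In a tracker, predict() runs once per frame before association, and update() is called with the matched detection's position; unmatched tracks simply coast on the prediction until the track management phase drops them.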
This list is automatically generated from the titles and abstracts of the papers on this site.