Joint 3D Object Detection and Tracking Using Spatio-Temporal
Representation of Camera Image and LiDAR Point Clouds
- URL: http://arxiv.org/abs/2112.07116v2
- Date: Wed, 15 Dec 2021 16:26:41 GMT
- Title: Joint 3D Object Detection and Tracking Using Spatio-Temporal
Representation of Camera Image and LiDAR Point Clouds
- Authors: Junho Koh, Jaekyum Kim, Jinhyuk Yoo, Yecheol Kim, Dongsuk Kum, Jun Won
Choi
- Abstract summary: We propose a new joint object detection and tracking (JoDT) framework for 3D object detection and tracking based on camera and LiDAR sensors.
The proposed method, referred to as 3D DetecTrack, enables the detector and tracker to cooperate to generate a spatio-temporal representation of the camera and LiDAR data.
- Score: 12.334725127696395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a new joint object detection and tracking (JoDT)
framework for 3D object detection and tracking based on camera and LiDAR
sensors. The proposed method, referred to as 3D DetecTrack, enables the
detector and tracker to cooperate to generate a spatio-temporal representation
of the camera and LiDAR data, with which 3D object detection and tracking are
then performed. The detector constructs the spatio-temporal features via the
weighted temporal aggregation of the spatial features obtained by the camera
and LiDAR fusion. Then, the detector reconfigures the initial detection results
using information from the tracklets maintained up to the previous time step.
Based on the spatio-temporal features generated by the detector, the tracker
associates the detected objects with previously tracked objects using a graph
neural network (GNN). We devise a fully-connected GNN facilitated by a
combination of rule-based edge pruning and attention-based edge gating, which
exploits both spatial and temporal object contexts to improve tracking
performance. The experiments conducted on both KITTI and nuScenes benchmarks
demonstrate that the proposed 3D DetecTrack achieves significant improvements
in both detection and tracking performances over baseline methods and achieves
state-of-the-art performance among existing methods through collaboration
between the detector and tracker.
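The abstract names three mechanisms: weighted temporal aggregation of features, rule-based edge pruning, and attention-based edge gating. The following NumPy sketch illustrates each idea in its simplest form. It is not the authors' implementation: the function names, the softmax weighting, the center-distance pruning rule, and the sigmoid dot-product gate are all assumptions chosen for illustration, and the learned GNN components of 3D DetecTrack are not reproduced.

```python
import numpy as np

def aggregate_temporal(features, scores):
    """Softmax-weighted sum of per-frame features.

    features: (T, C) array, one feature vector per time step (hypothetical layout).
    scores:   (T,) array of unnormalized frame weights (assumed to be given).
    """
    w = np.exp(scores - scores.max())  # subtract max for numerical stability
    w = w / w.sum()
    return w @ features  # shape (C,)

def prune_edges(track_xy, det_xy, max_dist):
    """Rule-based pruning (assumed rule): keep a track-detection edge
    only if the two object centers are within max_dist of each other."""
    diff = track_xy[:, None, :] - det_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # shape (num_tracks, num_dets)
    return dist <= max_dist

def gated_affinity(track_feat, det_feat, mask):
    """Attention-style gate (assumed form): sigmoid of the scaled dot
    product between features, set to zero on pruned edges."""
    logits = track_feat @ det_feat.T / np.sqrt(track_feat.shape[1])
    gate = 1.0 / (1.0 + np.exp(-logits))
    return gate * mask

# Two frames with equal scores are averaged.
fused = aggregate_temporal(np.array([[1.0, 0.0], [0.0, 1.0]]),
                           np.array([0.0, 0.0]))

# One track at the origin, one nearby detection and one distant detection.
mask = prune_edges(np.array([[0.0, 0.0]]),
                   np.array([[1.0, 0.0], [10.0, 0.0]]),
                   max_dist=2.0)
affinity = gated_affinity(np.array([[1.0, 0.0]]),
                          np.array([[1.0, 0.0], [1.0, 0.0]]),
                          mask)
```

In this toy setup the distant detection is removed by the pruning mask before gating, so its affinity is exactly zero regardless of feature similarity, while the nearby detection keeps a gate value between 0 and 1.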
Related papers
- Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns [12.384452095533396]
Integrity monitoring of automated driving systems (ADS) is paramount for ensuring safety.
Despite recent advancements in deep neural network (DNN)-based object detectors, their susceptibility to detection errors remains a significant concern.
arXiv Detail & Related papers (2024-04-11T12:24:47Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We rethink this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- D-Align: Dual Query Co-attention Network for 3D Object Detection Based on Multi-frame Point Cloud Sequence [8.21339007493213]
Conventional 3D object detectors detect objects using a set of points acquired over a fixed duration.
Recent studies have shown that the performance of object detection can be further enhanced by utilizing point cloud sequences.
We propose D-Align, which can effectively produce strong bird's-eye-view (BEV) features by aligning and aggregating the features obtained from a sequence of point sets.
arXiv Detail & Related papers (2022-09-30T20:41:25Z)
- DirectTracker: 3D Multi-Object Tracking Using Direct Image Alignment and Photometric Bundle Adjustment [41.27664827586102]
Direct methods have shown excellent performance in the applications of visual odometry and SLAM.
We propose a framework that effectively combines direct image alignment for the short-term tracking and sliding-window photometric bundle adjustment for 3D object detection.
arXiv Detail & Related papers (2022-09-29T17:40:22Z)
- Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and Tracking [53.64390261936975]
We present Minkowski Tracker, a sparse spatio-temporal R-CNN that jointly solves object detection and tracking problems.
Inspired by region-based CNN (R-CNN), we propose to track motion as a second stage of the object detector R-CNN.
We show in large-scale experiments that the overall performance gain of our method is due to four factors.
arXiv Detail & Related papers (2022-08-22T04:47:40Z)
- Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving [3.8073142980733]
We propose jointly training 3D detection and 3D tracking from only monocular videos in an end-to-end manner.
Time3D achieves 21.4% AMOTA, 13.6% AMOTP on the nuScenes 3D tracking benchmark, surpassing all published competitors.
arXiv Detail & Related papers (2022-05-30T06:41:10Z)
- A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete.
We propose DMT, a Detector-free Motion prediction based 3D Tracking network that totally removes the usage of complicated 3D detectors.
arXiv Detail & Related papers (2022-03-08T17:49:07Z)
- 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation [0.0]
3D-FCT is a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking.
Our proposed method is evaluated on the KITTI tracking dataset where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
arXiv Detail & Related papers (2021-10-06T06:36:29Z)
- Track to Detect and Segment: An Online Multi-Object Tracker [81.15608245513208]
TraDeS is an online joint detection and tracking model, exploiting tracking clues to assist detection end-to-end.
TraDeS infers object tracking offset by a cost volume, which is used to propagate previous object features.
arXiv Detail & Related papers (2021-03-16T02:34:06Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.