VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking
- URL: http://arxiv.org/abs/2408.02263v1
- Date: Mon, 5 Aug 2024 06:38:43 GMT
- Title: VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking
- Authors: Yuxuan Lu, Jiahao Nie, Zhiwei He, Hongjie Gu, Xudong Lv,
- Abstract summary: Current LiDAR point cloud-based 3D single object tracking (SOT) methods typically rely on point-based representation network.
We introduce a novel tracking framework, termed VoxelTrack.
By voxelizing inherently disordered point clouds into 3D voxels, VoxelTrack effectively models precise and robust 3D spatial information.
- Score: 3.517993407670811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current LiDAR point cloud-based 3D single object tracking (SOT) methods typically rely on point-based representation network. Despite demonstrated success, such networks suffer from some fundamental problems: 1) It contains pooling operation to cope with inherently disordered point clouds, hindering the capture of 3D spatial information that is useful for tracking, a regression task. 2) The adopted set abstraction operation hardly handles density-inconsistent point clouds, also preventing 3D spatial information from being modeled. To solve these problems, we introduce a novel tracking framework, termed VoxelTrack. By voxelizing inherently disordered point clouds into 3D voxels and extracting their features via sparse convolution blocks, VoxelTrack effectively models precise and robust 3D spatial information, thereby guiding accurate position prediction for tracked objects. Moreover, VoxelTrack incorporates a dual-stream encoder with cross-iterative feature fusion module to further explore fine-grained 3D spatial information for tracking. Benefiting from accurate 3D spatial information being modeled, our VoxelTrack simplifies tracking pipeline with a single regression loss. Extensive experiments are conducted on three widely-adopted datasets including KITTI, NuScenes and Waymo Open Dataset. The experimental results confirm that VoxelTrack achieves state-of-the-art performance (88.3%, 71.4% and 63.6% mean precision on the three datasets, respectively), and outperforms the existing trackers with a real-time speed of 36 Fps on a single TITAN RTX GPU. The source code and model will be released.
Related papers
- VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking [78.25819070166351]
We propose VoxelNext for fully sparse 3D object detection.
Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.
Our strong sparse convolutional network VoxelNeXt detects and tracks 3D objects through voxel features entirely.
arXiv Detail & Related papers (2023-03-20T17:40:44Z) - CXTrack: Improving 3D Point Cloud Tracking with Contextual Information [59.55870742072618]
3D single object tracking plays an essential role in many applications, such as autonomous driving.
We propose CXTrack, a novel transformer-based network for 3D object tracking.
We show that CXTrack achieves state-of-the-art tracking performance while running at 29 FPS.
arXiv Detail & Related papers (2022-11-12T11:29:01Z) - A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete.
We propose DMT, a Detector-free Motion prediction based 3D Tracking network that totally removes the usage of complicated 3D detectors.
arXiv Detail & Related papers (2022-03-08T17:49:07Z) - DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic
Voxelization [0.0]
We propose a novel two-stage framework for the efficient 3D point cloud object detection.
We parse the raw point cloud data directly in the 3D space yet achieve impressive efficiency and accuracy.
We highlight our KITTI 3D object detection dataset with 75 FPS and on Open dataset with 25 FPS inference speed with satisfactory accuracy.
arXiv Detail & Related papers (2021-07-27T10:07:39Z) - PC-DAN: Point Cloud based Deep Affinity Network for 3D Multi-Object
Tracking (Accepted as an extended abstract in JRDB-ACT Workshop at CVPR21) [68.12101204123422]
A point cloud is a dense compilation of spatial data in 3D coordinates.
We propose a PointNet-based approach for 3D Multi-Object Tracking (MOT)
arXiv Detail & Related papers (2021-06-03T05:36:39Z) - Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z) - Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection [99.16162624992424]
We devise a simple but effective voxel-based framework, named Voxel R-CNN.
By taking full advantage of voxel features in a two stage approach, our method achieves comparable detection accuracy with state-of-the-art point-based models.
Our results show that Voxel R-CNN delivers a higher detection accuracy while maintaining a realtime frame processing rate, emphi.e, at a speed of 25 FPS on an NVIDIA 2080 Ti GPU.
arXiv Detail & Related papers (2020-12-31T17:02:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.