3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud
- URL: http://arxiv.org/abs/2108.05630v1
- Date: Thu, 12 Aug 2021 09:52:28 GMT
- Title: 3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud
- Authors: Zheng Fang, Sifan Zhou, Yubo Cui, Sebastian Scherer
- Abstract summary: We propose a 3D tracking method called 3D-SiamRPN to track a single target object using raw 3D point cloud data.
Experimental results on the KITTI dataset show that our method achieves competitive performance in both Success and Precision.
- Score: 9.513194898261787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D single object tracking is a key problem for autonomous following robots, which must robustly track and accurately localize their target to follow it efficiently. In this paper, we propose a 3D tracking method, the 3D-SiamRPN network, which tracks a single target object using raw 3D point cloud data. The proposed network consists of two subnetworks. The first is a feature embedding subnetwork for point cloud feature extraction and fusion. In this subnetwork, we first use PointNet++ to extract point cloud features from the template and search branches. Then, to fuse the features of the two branches and obtain their similarity, we propose two cross-correlation modules, named pointcloud-wise and point-wise respectively. The second is a region proposal network (RPN) that produces the final 3D bounding box of the target object from the fused features of the cross-correlation modules. In this subnetwork, the regression and classification branches of the RPN yield proposals and scores, from which the final 3D bounding box of the target is selected. Experimental results on the KITTI dataset show that our method is competitive with state-of-the-art methods in both Success and Precision, and runs in real time at 20.8 FPS. Additionally, experimental results on the H3D dataset demonstrate that our method generalizes well, achieving good tracking performance in a new scene without re-training.
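To make the pipeline concrete, below is a minimal PyTorch sketch of the two-subnetwork design described in the abstract. The per-point encoder is a stand-in for PointNet++, and the fusion and head details (feature sizes, the exact form of the pointcloud-wise and point-wise correlations, the 7-parameter box encoding) are illustrative assumptions, not the paper's implementation.

```python
# Minimal, illustrative PyTorch sketch of a 3D-SiamRPN-style tracker.
# Module names, dimensions, and fusion details are assumptions.
import torch
import torch.nn as nn


class SimplePointEncoder(nn.Module):
    """Stand-in for the PointNet++ backbone: a shared per-point MLP."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, feat_dim, 1),
        )

    def forward(self, pts: torch.Tensor) -> torch.Tensor:
        # pts: (B, N, 3) -> features: (B, C, N)
        return self.mlp(pts.transpose(1, 2))


class CrossCorrelation(nn.Module):
    """Assumed stand-ins for the two fusion modules:
    'pointcloud-wise': a global template vector modulates every search point;
    'point-wise': pairwise template/search similarity reweights template features.
    """

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.proj = nn.Conv1d(feat_dim, feat_dim, 1)

    def forward(self, tmpl: torch.Tensor, search: torch.Tensor) -> torch.Tensor:
        # tmpl: (B, C, Nt), search: (B, C, Ns)
        global_tmpl = tmpl.max(dim=2, keepdim=True).values       # (B, C, 1)
        pc_wise = search * global_tmpl                           # broadcast fusion
        sim = torch.einsum('bct,bcs->bts', tmpl, search)         # (B, Nt, Ns)
        pt_wise = torch.einsum('bts,bct->bcs', sim.softmax(1), tmpl)
        return self.proj(pc_wise + pt_wise)                      # (B, C, Ns)


class RPNHead(nn.Module):
    """Classification score and 3D box regression per proposal."""

    def __init__(self, feat_dim: int = 256, box_params: int = 7):
        super().__init__()
        self.cls = nn.Conv1d(feat_dim, 1, 1)           # objectness score
        self.reg = nn.Conv1d(feat_dim, box_params, 1)  # (x, y, z, w, l, h, yaw)

    def forward(self, fused: torch.Tensor):
        return self.cls(fused), self.reg(fused)


class SiamRPN3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = SimplePointEncoder()
        self.xcorr = CrossCorrelation()
        self.rpn = RPNHead()

    def forward(self, template_pts, search_pts):
        fused = self.xcorr(self.encoder(template_pts), self.encoder(search_pts))
        scores, boxes = self.rpn(fused)
        # Select the highest-scoring proposal as the tracked box.
        best = scores.squeeze(1).argmax(dim=1)                   # (B,)
        return boxes.transpose(1, 2)[torch.arange(boxes.size(0)), best]


if __name__ == "__main__":
    net = SiamRPN3D()
    box = net(torch.randn(2, 512, 3), torch.randn(2, 1024, 3))
    print(box.shape)  # torch.Size([2, 7])
```

In a Siamese tracking setup of this kind, the template branch encodes the previously tracked target and the search branch encodes the region of interest in the current frame; the proposal with the highest classification score becomes the new track state.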
Related papers
- PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds [5.524413892353708]
LiDAR-based 3D single object tracking (3D SOT) is a critical issue in robotics and autonomous driving.
We propose PillarTrack, a pillar-based 3D single object tracking framework.
PillarTrack achieves state-of-the-art performance on the KITTI and nuScenes datasets while running at real-time tracking speed.
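For background, "pillar-based" methods collapse the point cloud into vertical columns on a bird's-eye-view grid before 2D processing. A minimal sketch of generic pillarization follows (PointPillars-style, not PillarTrack's actual code; grid extents are arbitrary assumptions):

```python
# Generic pillar/BEV encoding sketch (PointPillars-style); illustrative only.
import torch

def pillarize(points: torch.Tensor, grid: float = 0.16,
              x_max: float = 70.4, y_max: float = 40.0) -> torch.Tensor:
    """Scatter points (N, 3) into a BEV grid; keep the max height per cell.

    A real pillar encoder keeps the points in each cell and runs a small
    PointNet over them; max height is the simplest stand-in.
    """
    W, H = int(2 * x_max / grid), int(2 * y_max / grid)
    xs = ((points[:, 0] + x_max) / grid).long().clamp(0, W - 1)
    ys = ((points[:, 1] + y_max) / grid).long().clamp(0, H - 1)
    bev = torch.full((H, W), float('-inf'))
    bev.view(-1).scatter_reduce_(0, ys * W + xs, points[:, 2], reduce='amax')
    bev[bev == float('-inf')] = 0.0  # empty pillars
    return bev  # (H, W) pseudo-image fed to a 2D backbone
```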
arXiv Detail & Related papers (2024-04-11T06:06:56Z)
- Dynamic Clustering Transformer Network for Point Cloud Segmentation [23.149220817575195]
We propose a novel 3D point cloud representation network called Dynamic Clustering Transformer Network (DCTNet).
It has an encoder-decoder architecture, allowing for both local and global feature learning.
Our method was evaluated on an object-based dataset (ShapeNet), an urban navigation dataset (Toronto-3D), and a multispectral LiDAR dataset.
arXiv Detail & Related papers (2023-05-30T01:11:05Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information, employing multiple frames to detect objects and track them in a single network.
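As a rough illustration of how joint tracking can hinge on an affinity matrix between detections in neighboring frames, here is a generic sketch using cosine similarity and greedy matching; the paper's attention-guided affinity refinement is not reproduced:

```python
# Generic affinity-based association between detections in consecutive frames.
import torch

def associate(feat_t: torch.Tensor, feat_t1: torch.Tensor, thresh: float = 0.5):
    """feat_t: (N, C) detection features at frame t; feat_t1: (M, C) at t+1.
    Returns (i, j) index pairs whose cosine affinity exceeds `thresh`."""
    a = torch.nn.functional.normalize(feat_t, dim=1)
    b = torch.nn.functional.normalize(feat_t1, dim=1)
    affinity = a @ b.T                        # (N, M) cosine similarities
    best = affinity.argmax(dim=1)             # greedy: best match per track
    keep = affinity[torch.arange(len(a)), best] > thresh
    return torch.nonzero(keep).squeeze(1), best[keep]
```

A real tracker would enforce one-to-one matches (e.g., Hungarian assignment) rather than this greedy argmax.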
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates high-quality 3D proposals by leveraging a class-aware local grouping strategy on object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
arXiv Detail & Related papers (2022-04-05T14:42:57Z)
- Group-Free 3D Object Detection via Transformers [26.040378025818416]
We present a simple yet effective method for directly detecting 3D objects from the 3D point cloud.
Our method computes the feature of an object from all the points in the point cloud with the help of the attention mechanism in Transformers (Vaswani et al.).
With few bells and whistles, the proposed method achieves state-of-the-art 3D object detection performance on two widely used benchmarks, ScanNet V2 and SUN RGB-D.
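The key idea, letting each object candidate attend to all points rather than a hand-crafted local group, can be sketched with standard multi-head attention (shapes and dimensions below are illustrative assumptions):

```python
# Object features via cross-attention over all points (illustrative sketch).
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
points = torch.randn(2, 1024, 256)   # (B, N, C) per-point features
queries = torch.randn(2, 64, 256)    # (B, K, C) object candidate features

# Each candidate aggregates evidence from every point, weighted by attention.
obj_feats, weights = attn(query=queries, key=points, value=points)
print(obj_feats.shape)  # torch.Size([2, 64, 256])
```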
arXiv Detail & Related papers (2021-04-01T17:59:36Z)
- Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds [24.85151376535356]
A spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
The proposed method achieves good performance on the ScanNet dataset and Paris-Lille-3D dataset.
arXiv Detail & Related papers (2020-11-27T15:35:12Z)
- Graph Neural Networks for 3D Multi-Object Tracking [28.121708602059048]
3D Multi-object tracking (MOT) is crucial to autonomous systems.
Recent work often uses a tracking-by-detection pipeline.
We propose a novel feature interaction mechanism by introducing Graph Neural Networks.
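One round of message passing over a fully connected graph of object nodes conveys the flavor of such feature interaction; the layer below is an assumed, generic formulation, not the paper's network:

```python
# One round of message passing between object nodes (illustrative sketch).
import torch
import torch.nn as nn

class InteractionLayer(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)   # message from each node pair
        self.upd = nn.Linear(2 * dim, dim)   # node update from pooled messages

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C) node features; graph is fully connected (incl. self-loops)
        n = x.size(0)
        pair = torch.cat([x.unsqueeze(1).expand(n, n, -1),
                          x.unsqueeze(0).expand(n, n, -1)], dim=-1)
        msgs = torch.relu(self.msg(pair)).mean(dim=1)   # pool over neighbors
        return torch.relu(self.upd(torch.cat([x, msgs], dim=-1)))
```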
arXiv Detail & Related papers (2020-08-20T17:55:41Z)
- Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z)
- PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection [76.30585706811993]
We present a novel, high-performance 3D object detection framework named PointVoxel-RCNN (PV-RCNN).
Our proposed method deeply integrates both 3D voxel Convolutional Neural Network (CNN) and PointNet-based set abstraction.
It takes advantage of the efficient learning and high-quality proposals of the 3D voxel CNN and the flexible receptive fields of the PointNet-based networks.
arXiv Detail & Related papers (2019-12-31T06:34:10Z)