End-to-end 2D-3D Registration between Image and LiDAR Point Cloud for
Vehicle Localization
- URL: http://arxiv.org/abs/2306.11346v1
- Date: Tue, 20 Jun 2023 07:28:40 GMT
- Title: End-to-end 2D-3D Registration between Image and LiDAR Point Cloud for
Vehicle Localization
- Authors: Guangming Wang, Yu Zheng, Yanfeng Guo, Zhe Liu, Yixiang Zhu, Wolfram
Burgard, and Hesheng Wang
- Abstract summary: We present I2PNet, a novel end-to-end 2D-3D registration network.
I2PNet directly registers the raw 3D point cloud with the 2D RGB image using differentiable modules with a unique target.
We conduct extensive localization experiments on the KITTI Odometry and nuScenes datasets.
- Score: 45.81385500855306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robot localization using a previously built map is essential for a variety of
tasks including highly accurate navigation and mobile manipulation. A popular
approach to robot localization is based on image-to-point cloud registration,
which combines illumination-invariant LiDAR-based mapping with economical
image-based localization. However, recent works on image-to-point cloud
registration either divide the registration into separate modules or project
the point cloud onto a depth image and register the RGB and depth images. In
this paper, we present I2PNet, a novel end-to-end 2D-3D registration network.
I2PNet directly registers the raw 3D point cloud with the 2D RGB image using
differentiable modules with a unique target. A 2D-3D cost volume module for
differentiable 2D-3D association is proposed to bridge feature extraction and
pose regression. This module implicitly constructs the soft point-to-pixel
correspondence on the intrinsic-independent normalized plane of the pinhole
camera model. Moreover, we introduce an outlier mask prediction module to
filter outliers from the 2D-3D association before pose regression.
Furthermore, we propose a coarse-to-fine 2D-3D registration architecture to
increase localization accuracy. We conduct extensive localization experiments
on the KITTI Odometry and nuScenes datasets. The results demonstrate that
I2PNet outperforms the state of the art by a large margin. In addition, I2PNet
is more efficient than previous works and can perform localization in real
time. Moreover, we extend the application of I2PNet to the
camera-LiDAR online calibration and demonstrate that I2PNet outperforms recent
approaches on the online calibration task.
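Since the abstract only names the modules, a toy sketch may help fix ideas. The snippet below shows one way a differentiable soft point-to-pixel association on the normalized image plane could be computed, with the peakedness of each point's association standing in for the outlier mask. All function names, the cosine-similarity cost, and the temperature softmax are illustrative assumptions, not I2PNet's actual design.

```python
import numpy as np

def normalized_plane(points_cam):
    """Project camera-frame 3D points onto the intrinsic-independent
    normalized plane (x/z, y/z) of the pinhole model."""
    return points_cam[:, :2] / points_cam[:, 2:3]

def soft_correspondence(point_feats, pixel_feats, pixel_coords_norm,
                        temperature=0.1):
    """Build a soft point-to-pixel association from a feature cost volume.

    point_feats:       (N, C) per-point descriptors
    pixel_feats:       (M, C) per-pixel descriptors
    pixel_coords_norm: (M, 2) pixel locations on the normalized plane
    Returns (N, 2) soft-matched coordinates and (N,) crude inlier weights.
    """
    pf = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    xf = pixel_feats / np.linalg.norm(pixel_feats, axis=1, keepdims=True)
    cost = pf @ xf.T                          # (N, M) cosine-similarity cost volume
    attn = np.exp((cost - cost.max(axis=1, keepdims=True)) / temperature)
    attn /= attn.sum(axis=1, keepdims=True)   # softmax over pixels for each point
    matched = attn @ pixel_coords_norm        # expected pixel location per point
    inlier_w = attn.max(axis=1)               # peaked distribution -> likely inlier
    return matched, inlier_w
```

The inlier-weighted residuals between `matched` and the points' own normalized-plane projections would then be regressed into a pose, coarse to fine, in the spirit of the architecture the abstract describes.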
Related papers
- EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale
Visual Localization [44.05930316729542]
We propose EP2P-Loc, a novel large-scale visual localization method for 3D point clouds.
To increase the number of inliers, we propose a simple algorithm to remove invisible 3D points in the image.
For the first time in this task, we employ a differentiable PnP for end-to-end training.
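As a rough illustration of what removing invisible 3D points could mean in practice, the sketch below keeps only points that project inside the image and survive a coarse z-buffer test; it is a generic stand-in under assumed names, not EP2P-Loc's actual algorithm.

```python
import numpy as np

def remove_invisible_points(points_cam, K, height, width, cell=4):
    """Keep 3D points plausibly visible in the image: in front of the
    camera, inside the frame, and nearest in depth within each small
    pixel cell (a coarse z-buffer that drops occluded points)."""
    z = points_cam[:, 2]
    idx = np.flatnonzero(z > 0)               # points in front of the camera
    uvw = points_cam[idx] @ K.T               # pinhole projection
    u, v = uvw[:, 0] / uvw[:, 2], uvw[:, 1] / uvw[:, 2]
    inb = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    idx, u, v = idx[inb], u[inb], v[inb]
    ncols = width // cell + 1
    cells = (v // cell).astype(int) * ncols + (u // cell).astype(int)
    order = np.argsort(z[idx])                # nearest points first
    _, first = np.unique(cells[order], return_index=True)
    return idx[order[first]]                  # indices of retained points
```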
arXiv Detail & Related papers (2023-09-14T07:06:36Z)
- CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence [51.91791056908387]
We propose the first feature-based dense correspondence framework for addressing the image-to-point cloud registration problem, dubbed CorrI2P.
Specifically, given a pair of a 2D image and a 3D point cloud, we first transform them into high-dimensional feature space and then feed the features into a symmetric overlapping region detector to determine the region where the image and point cloud overlap.
arXiv Detail & Related papers (2022-07-12T11:49:31Z)
- Multi-Modality Task Cascade for 3D Object Detection [22.131228757850373]
Many methods train two models in isolation and use simple feature concatenation to represent 3D sensor data.
We propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions.
We show that including a 2D network between two stages of 3D modules significantly improves both 2D and 3D task performance.
arXiv Detail & Related papers (2021-07-08T17:55:01Z)
- 3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656]
We present a new approach that enables us to leverage 3D features extracted from large-scale 3D data repositories to enhance 2D features extracted from RGB images.
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during training.
Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration.
Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
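A minimal sketch of the first (distillation) step, assuming per-pixel 2D features already paired with 3D features from the pretrained network; the names and the plain MSE form are assumptions, and the two-stage dimension normalization and adversarial terms are not reproduced here.

```python
import numpy as np

def distill_3d_to_2d_loss(feat2d, feat3d, eps=1e-8):
    """Mean squared error between L2-normalized 2D features and the
    paired 3D features a pretrained 3D network produced for the same
    pixels, pushing the 2D network toward simulated 3D features."""
    f2 = feat2d / (np.linalg.norm(feat2d, axis=-1, keepdims=True) + eps)
    f3 = feat3d / (np.linalg.norm(feat3d, axis=-1, keepdims=True) + eps)
    return float(np.mean(np.sum((f2 - f3) ** 2, axis=-1)))
```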
arXiv Detail & Related papers (2021-04-06T02:22:24Z)
- ParaNet: Deep Regular Representation for 3D Point Clouds [62.81379889095186]
ParaNet is a novel end-to-end deep learning framework for representing 3D point clouds.
It converts an irregular 3D point cloud into a regular 2D color image, named the point geometry image (PGI).
In contrast to conventional regular representation modalities based on multi-view projection and voxelization, the proposed representation is differentiable and reversible.
arXiv Detail & Related papers (2020-12-05T13:19:55Z)
- End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [62.34374949726333]
Pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras.
PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs.
We introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end.
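The change of representation at the heart of pseudo-LiDAR, back-projecting a depth map through the pinhole model into a point cloud, fits in a few lines; this is a minimal illustration of that conversion, not the paper's differentiable CoR modules.

```python
import numpy as np

def depth_to_pseudo_lidar(depth, K):
    """Back-project a dense depth map into a camera-frame point cloud.

    depth: (H, W) metric depth map; K: (3, 3) camera intrinsics.
    Returns (H*W, 3) 3D points."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T           # rays with unit depth (z = 1)
    return rays * depth.reshape(-1, 1)        # scale each ray by its depth
```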
arXiv Detail & Related papers (2020-04-07T02:18:38Z)
- ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module: adaptive zooming.
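A crude picture of the zooming idea (an illustrative sketch only, far simpler than ZoomNet's learned pipeline): crop the detected box from each stereo view and resize it to a fixed resolution so that small, distant objects receive more pixels before disparity estimation.

```python
import numpy as np

def adaptive_zoom(img, box, out_hw=(192, 192)):
    """Crop a detected 2D box and nearest-neighbor resize it to a fixed
    resolution. Note that zooming rescales disparity, so any depth
    decoding must account for the zoom factor."""
    x0, y0, x1, y1 = (int(round(c)) for c in box)
    patch = img[y0:y1, x0:x1]
    h, w = patch.shape[:2]
    ys = np.arange(out_hw[0]) * h // out_hw[0]   # source row per output row
    xs = np.arange(out_hw[1]) * w // out_hw[1]   # source col per output col
    return patch[ys][:, xs]
```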
arXiv Detail & Related papers (2020-03-01T17:18:08Z)