I2P-Rec: Recognizing Images on Large-scale Point Cloud Maps through
Bird's Eye View Projections
- URL: http://arxiv.org/abs/2303.01043v2
- Date: Tue, 15 Aug 2023 03:53:36 GMT
- Title: I2P-Rec: Recognizing Images on Large-scale Point Cloud Maps through
Bird's Eye View Projections
- Authors: Shuhang Zheng, Yixuan Li, Zhu Yu, Beinan Yu, Si-Yuan Cao, Minhang
Wang, Jintao Xu, Rui Ai, Weihao Gu, Lun Luo, Hui-Liang Shen
- Abstract summary: Place recognition is an important technique for autonomous cars to achieve full autonomy.
We propose the I2P-Rec method to solve the problem by transforming the cross-modal data into the same modality.
With only a small set of training data, I2P-Rec achieves Top-1% recall rates of over 80% and 90% when localizing monocular and stereo images, respectively, on point cloud maps.
- Score: 18.7557037030769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Place recognition is an important technique for autonomous cars to achieve
full autonomy since it can provide an initial guess to online localization
algorithms. Although current methods based on images or point clouds have
achieved satisfactory performance, localizing the images on a large-scale point
cloud map remains a fairly unexplored problem. This cross-modal matching task
is challenging due to the difficulty in extracting consistent descriptors from
images and point clouds. In this paper, we propose the I2P-Rec method to solve
the problem by transforming the cross-modal data into the same modality.
Specifically, we leverage the recent success of depth estimation networks to
recover point clouds from images. We then project the point clouds into Bird's
Eye View (BEV) images. Using the BEV image as an intermediate representation,
we extract global features with a Convolutional Neural Network followed by a
NetVLAD layer to perform matching. The experimental results evaluated on the
KITTI dataset show that, with only a small set of training data, I2P-Rec
achieves Top-1\% recall rates of over 80\% and 90\% when localizing monocular
and stereo images, respectively, on point cloud maps. We further evaluate
I2P-Rec on a 1 km trajectory dataset collected by an autonomous logistics car
and show that I2P-Rec can generalize well to previously unseen environments.
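The BEV intermediate representation described in the abstract can be sketched as a simple density-grid projection of a point cloud. The grid extents, resolution, and density normalization below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def point_cloud_to_bev(points, x_range=(-40.0, 40.0), y_range=(-40.0, 40.0),
                       resolution=0.25):
    """Project an (N, 3) point cloud onto a 2D bird's-eye-view density image.

    Each grid cell counts the points falling into it, normalized to [0, 1].
    Extents and resolution are illustrative defaults (assumptions), not the
    configuration used in the paper.
    """
    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)

    # Keep only points inside the grid extents.
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Discretize x/y coordinates into cell indices.
    xi = ((pts[:, 0] - x_range[0]) / resolution).astype(np.int64)
    yi = ((pts[:, 1] - y_range[0]) / resolution).astype(np.int64)

    bev = np.zeros((h, w), dtype=np.float32)
    np.add.at(bev, (xi, yi), 1.0)   # unbuffered per-cell point counts
    if bev.max() > 0:
        bev /= bev.max()            # normalize density to [0, 1]
    return bev
```

A CNN followed by a NetVLAD layer would then consume such BEV images to produce the global descriptors used for matching.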
Related papers
- PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training [90.06520673092702]
We present PointRegGPT, boosting 3D point cloud registration using generative point-cloud pairs for training.
To our knowledge, this is the first generative approach that explores realistic data generation for indoor point cloud registration.
arXiv Detail & Related papers (2024-07-19T06:29:57Z)
- ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition [16.799067323119644]
We introduce a fast and lightweight framework to encode images and point clouds into place-distinctive descriptors.
We propose an effective Field of View (FoV) transformation module to convert point clouds into an analogous modality as images.
We also design a non-negative factorization-based encoder to extract mutually consistent semantic features between point clouds and images.
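One common way to realize such an FoV-style transformation is a spherical (range-image) projection of the LiDAR point cloud. The sketch below uses generic LiDAR-style resolutions and FoV bounds as assumptions; it is not ModaLink's actual module:

```python
import numpy as np

def point_cloud_to_range_image(points, h=64, w=900,
                               fov_up_deg=3.0, fov_down_deg=-25.0):
    """Spherically project an (N, 3) point cloud to an h x w range image.

    Vertical/horizontal resolutions and FoV bounds are generic LiDAR-style
    assumptions for illustration.
    """
    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    r = np.maximum(np.linalg.norm(points, axis=1), 1e-6)
    yaw = np.arctan2(points[:, 1], points[:, 0])   # horizontal angle
    pitch = np.arcsin(points[:, 2] / r)            # vertical angle

    # Map angles to pixel coordinates.
    u = (0.5 * (1.0 - yaw / np.pi) * w).astype(np.int64) % w
    v = (1.0 - (pitch - fov_down) / fov) * (h - 1)
    v = np.clip(np.round(v), 0, h - 1).astype(np.int64)

    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r                                  # last write per pixel wins
    return img
```

The resulting 2D range image can then be fed to the same kind of image encoder as a camera frame.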
arXiv Detail & Related papers (2024-03-27T17:01:10Z) - HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation [106.09886920774002]
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network.
Our method achieves consistent improvements over the baseline trained from scratch and significantly outperforms the existing schemes.
arXiv Detail & Related papers (2024-03-18T14:18:08Z) - Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif).
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z) - CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration [9.57539651520755]
CoFiI2P is a novel I2P registration network that extracts correspondences in a coarse-to-fine manner.
In the coarse matching phase, a novel I2P transformer module is employed to capture both homogeneous and heterogeneous global information.
In the fine matching module, point/pixel pairs are established with the guidance of super-point/super-pixel correspondences.
arXiv Detail & Related papers (2023-09-26T04:32:38Z) - Object Re-Identification from Point Clouds [3.6308236424346694]
We provide the first large-scale study of object ReID from point clouds and establish its performance relative to image ReID.
To our knowledge, we are the first to study object re-identification from real point cloud observations.
arXiv Detail & Related papers (2023-05-17T13:43:03Z) - Point2Vec for Self-Supervised Representation Learning on Point Clouds [66.53955515020053]
We extend data2vec to the point cloud domain and report encouraging results on several downstream tasks.
We propose point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds.
arXiv Detail & Related papers (2023-03-29T10:08:29Z) - BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View
Images [20.30997801125592]
We explore the potential of a different representation in place recognition, i.e. bird's eye view (BEV) images.
A simple VGGNet trained on BEV images achieves comparable performance with the state-of-the-art place recognition methods in scenes of slight viewpoint changes.
We develop a method to estimate the position of the query cloud, extending the usage of place recognition.
arXiv Detail & Related papers (2023-02-28T05:37:45Z) - SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for
Day-Night Place Recognition [31.714928102950594]
Place Recognition is a crucial capability for mobile robot localization and navigation.
Recent VPR methods based on "sequential representations" have shown promising results.
We compare a 3D point cloud based method with image sequence based methods.
arXiv Detail & Related papers (2021-06-22T02:05:32Z) - DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
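The classification formulation can be illustrated with a standard pinhole-camera frustum check: label each 3D point by whether it projects inside the image bounds. This is a generic geometric sketch under assumed intrinsics, not DeepI2P's network:

```python
import numpy as np

def points_in_frustum(points_cam, K, width, height):
    """Label each 3D point (camera frame) as inside/outside the image frustum.

    points_cam: (N, 3) points already expressed in the camera coordinate frame
    K:          (3, 3) pinhole intrinsic matrix (assumed known)
    Returns a boolean (N,) mask; True means the point projects into the image.
    """
    z = points_cam[:, 2]
    in_front = z > 1e-6   # points behind the camera never project

    # Perspective projection: pixel = K @ [x/z, y/z, 1]
    uv = (K @ (points_cam / np.maximum(z, 1e-6)[:, None]).T).T
    u, v = uv[:, 0], uv[:, 1]

    return in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
```

In the paper's setting, such inside/outside labels are predicted by a network rather than computed from a known pose, and the pose is then recovered by inverse camera projection optimization.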
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.