P2-Net: Joint Description and Detection of Local Features for Pixel and
Point Matching
- URL: http://arxiv.org/abs/2103.01055v1
- Date: Mon, 1 Mar 2021 14:59:40 GMT
- Title: P2-Net: Joint Description and Detection of Local Features for Pixel and
Point Matching
- Authors: Bing Wang, Changhao Chen, Zhaopeng Cui, Jie Qin, Chris Xiaoxuan Lu,
Zhengdi Yu, Peijun Zhao, Zhen Dong, Fan Zhu, Niki Trigoni, Andrew Markham
- Abstract summary: This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds.
An ultra-wide reception mechanism, in combination with a novel loss function, is designed to mitigate the intrinsic information variations between pixel and point local regions.
- Score: 78.18641868402901
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately describing and detecting 2D and 3D keypoints is crucial to
establishing correspondences across images and point clouds. Despite a plethora
of learning-based 2D or 3D local feature descriptors and detectors having been
proposed, the derivation of a shared descriptor and joint keypoint detector
that directly matches pixels and points remains under-explored by the
community. This work takes the initiative to establish fine-grained
correspondences between 2D images and 3D point clouds. In order to directly
match pixels and points, a dual fully convolutional framework is presented that
maps 2D and 3D inputs into a shared latent representation space to
simultaneously describe and detect keypoints. Furthermore, an ultra-wide
reception mechanism, in combination with a novel loss function, is designed to
mitigate the intrinsic information variations between pixel and point local
regions. Extensive experimental results demonstrate that our framework shows
competitive performance in fine-grained matching between images and point
clouds and achieves state-of-the-art results for the task of indoor visual
localization. Our source code will be available at [no-name-for-blind-review].
Related papers
- Learning to Produce Semi-dense Correspondences for Visual Localization [11.415451542216559]
This study addresses the challenge of performing visual localization in demanding conditions such as night-time scenarios, adverse weather, and seasonal changes.
We propose a novel method that extracts reliable semi-dense 2D-3D matching points based on dense keypoint matches.
The network utilizes both geometric and visual cues to effectively infer 3D coordinates for unobserved keypoints from the observed ones.
arXiv Detail & Related papers (2024-02-13T10:40:10Z) - Differentiable Registration of Images and LiDAR Point Clouds with
VoxelPoint-to-Pixel Matching [58.10418136917358]
Cross-modality registration between 2D images from cameras and 3D point clouds from LiDARs is a crucial task in computer vision and robotics.
Previous methods estimate 2D-3D correspondences by matching point and pixel patterns learned by neural networks.
We learn a structured cross-modality latent space to represent pixel features and 3D features via a differentiable probabilistic PnP solver.
arXiv Detail & Related papers (2023-12-07T05:46:10Z) - EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale
Visual Localization [44.05930316729542]
We propose EP2P-Loc, a novel large-scale visual localization method for 3D point clouds.
To increase the number of inliers, we propose a simple algorithm to remove invisible 3D points in the image.
For the first time in this task, we employ a differentiable PnP for end-to-end training.
arXiv Detail & Related papers (2023-09-14T07:06:36Z) - Bridged Transformer for Vision and Point Cloud 3D Object Detection [92.86856146086316]
Bridged Transformer (BrT) is an end-to-end architecture for 3D object detection.
BrT learns to identify 3D and 2D object bounding boxes from both points and image patches.
We experimentally show that BrT surpasses state-of-the-art methods on SUN RGB-D and ScanNetV2 datasets.
arXiv Detail & Related papers (2022-10-04T05:44:22Z) - CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence [51.91791056908387]
We propose the first feature-based dense correspondence framework for addressing the image-to-point cloud registration problem, dubbed CorrI2P.
Specifically, given a pair of a 2D image and a 3D point cloud, we first transform them into high-dimensional feature space and feed the features into a symmetric overlapping region detector to determine the region where the image and point cloud overlap.
arXiv Detail & Related papers (2022-07-12T11:49:31Z) - Cross-Modality 3D Object Detection [63.29935886648709]
We present a novel two-stage multi-modal fusion network for 3D object detection.
The whole architecture facilitates two-stage fusion.
Our experiments on the KITTI dataset show that the proposed multi-stage fusion helps the network to learn better representations.
arXiv Detail & Related papers (2020-08-16T11:01:20Z) - EPNet: Enhancing Point Features with Image Semantics for 3D Object
Detection [60.097873683615695]
We aim to address two critical issues in the 3D detection task: the exploitation of multiple sensors and the inconsistency between localization and classification confidence.
We propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner, without any image annotations; a sketch of this point-wise fusion idea follows the related-papers list below.
We design an end-to-end learnable framework named EPNet to integrate these two components.
arXiv Detail & Related papers (2020-07-17T09:33:05Z)
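For concreteness, here is a minimal PyTorch-style sketch of the point-wise fusion idea referenced in the EPNet summary above: LiDAR points are projected into the image with the camera intrinsics, per-pixel image features are bilinearly sampled at the projected locations, and the result is concatenated with the point features. The pinhole projection convention, the simple visibility mask (the same kind of in-image check mentioned in the EP2P-Loc summary for removing invisible points), the feature dimensions, and the concatenation-based fusion are all assumptions for illustration, not the exact module of any paper listed here.

```python
# Sketch of point-wise fusion of image features with point features, in the
# spirit of EPNet's fusion module. The projection, visibility mask, feature
# sizes, and plain concatenation are illustrative assumptions.
import torch
import torch.nn.functional as F

def project_points(points_cam, K):
    """Project 3D points in the camera frame to pixel coordinates.
    points_cam: (N, 3), K: (3, 3) intrinsics. Returns pixels (N, 2) and depth (N,)."""
    uvw = points_cam @ K.T
    depth = uvw[:, 2]
    pix = uvw[:, :2] / depth.clamp(min=1e-6).unsqueeze(1)
    return pix, depth

def sample_image_features(img_feat, pix, H, W):
    """Bilinearly sample per-pixel features at projected locations.
    img_feat: (1, C, H, W), pix: (N, 2) in pixel coordinates. Returns (N, C)."""
    grid = torch.stack([pix[:, 0] / (W - 1) * 2 - 1,     # x to [-1, 1]
                        pix[:, 1] / (H - 1) * 2 - 1],    # y to [-1, 1]
                       dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(img_feat, grid, align_corners=True)  # (1, C, 1, N)
    return sampled.view(img_feat.shape[1], -1).T                  # (N, C)

# Toy usage with assumed shapes: 2048 points, a 64-channel image feature map.
H, W = 120, 160
K = torch.tensor([[100.0, 0.0, W / 2], [0.0, 100.0, H / 2], [0.0, 0.0, 1.0]])
points_cam = torch.rand(2048, 3) + torch.tensor([0.0, 0.0, 2.0])  # roughly in front of camera
point_feat = torch.rand(2048, 128)
img_feat = torch.rand(1, 64, H, W)

pix, depth = project_points(points_cam, K)
# Basic visibility filter: keep points in front of the camera and inside the image.
visible = (depth > 0) & (pix[:, 0] >= 0) & (pix[:, 0] < W) & (pix[:, 1] >= 0) & (pix[:, 1] < H)
img_at_points = sample_image_features(img_feat, pix[visible], H, W)
fused = torch.cat([point_feat[visible], img_at_points], dim=1)  # point-wise fused features
```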