GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for
Multi-Modal 3D Object Detection
- URL: http://arxiv.org/abs/2310.08261v1
- Date: Thu, 12 Oct 2023 12:06:31 GMT
- Title: GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for
Multi-Modal 3D Object Detection
- Authors: Ziying Song, Haiyue Wei, Lin Bai, Lei Yang, Caiyan Jia
- Abstract summary: LiDAR and cameras are complementary sensors for 3D object detection in autonomous driving.
We present GraphAlign, a more accurate feature alignment strategy for 3D object detection by graph matching.
- Score: 7.743525134435137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LiDAR and cameras are complementary sensors for 3D object detection in
autonomous driving. However, it is challenging to explore the unnatural
interaction between point clouds and images, and the critical factor is how to
conduct feature alignment of heterogeneous modalities. Currently, many methods
achieve feature alignment by projection calibration only, without accounting
for coordinate conversion errors between sensors, leading
to sub-optimal performance. In this paper, we present GraphAlign, a more
accurate feature alignment strategy for 3D object detection by graph matching.
Specifically, we fuse image features from a semantic segmentation encoder in
the image branch and point cloud features from a 3D Sparse CNN in the LiDAR
branch. To save computation, we construct the nearest neighbor relationship by
calculating Euclidean distance within the subspaces into which the point cloud
features are divided. Through the projection calibration between the image and
point cloud, we project the nearest neighbors of point cloud features onto the
image features. Then by matching the nearest neighbors with a single point
cloud to multiple images, we search for a more appropriate feature alignment.
In addition, we provide a self-attention module to enhance the weights of
significant relations to fine-tune the feature alignment between heterogeneous
modalities. Extensive experiments on the nuScenes benchmark demonstrate the
effectiveness and efficiency of GraphAlign.
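The neighbor construction and projection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the uniform grid used as "subspaces", the function names `knn_within_subspaces` and `project_to_image`, and the pinhole intrinsics `fx`, `fy`, `cx`, `cy` are all assumptions made for illustration, and the points are assumed to be already expressed in the camera frame (the LiDAR-to-camera extrinsic transform is omitted).

```python
import math
from collections import defaultdict


def knn_within_subspaces(points, cell_size, k):
    """Return, for each point index, up to k nearest neighbors (Euclidean),
    searching only among points that fall in the same grid cell.

    The uniform grid is a hypothetical stand-in for the paper's subspaces."""
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(points):
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        cells[key].append(idx)

    neighbors = {}
    for members in cells.values():
        for i in members:
            # Restricting the search to one cell avoids an all-pairs comparison.
            others = [j for j in members if j != i]
            others.sort(key=lambda j: math.dist(points[i], points[j]))
            neighbors[i] = others[:k]
    return neighbors


def project_to_image(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates.

    Returns None for points at or behind the camera plane."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    pts = [(0.1, 0.1, 5.0), (0.2, 0.1, 5.1), (0.9, 0.8, 5.5), (3.5, 0.2, 6.0)]
    nbrs = knn_within_subspaces(pts, cell_size=2.0, k=1)
    for i, js in sorted(nbrs.items()):
        pixels = [project_to_image(pts[j], 100.0, 100.0, 50.0, 40.0) for j in js]
        print(i, js, pixels)
```

In this sketch a point alone in its cell simply gets an empty neighbor list; a real pipeline would need a policy for such cases, and the matching and self-attention stages mentioned in the abstract are not modeled here.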
Related papers
- SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks [14.548198408544032]
We treat 3D scene graph alignment as a partial graph-matching problem and propose to solve it with a graph neural network.
We reuse the geometric features learned by a point cloud registration method and associate the clustered point-level geometric features with the node-level semantic feature.
We propose a point-matching rescoring method that uses the node-wise alignment of the 3D scene graph to reweight the matching candidates from a pre-trained point cloud registration method.
arXiv Detail & Related papers (2024-03-28T15:01:58Z)
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
arXiv Detail & Related papers (2024-01-23T02:41:06Z)
- LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment [63.83894701779067]
We propose LCPS, the first LiDAR-Camera Panoptic network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ by about 6.9% over the LiDAR-only baseline on the NuScenes dataset.
arXiv Detail & Related papers (2023-08-03T10:57:58Z)
- LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals [9.201550006194994]
Learnable matchers often underperform when there exist only small regions of co-visibility between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
arXiv Detail & Related papers (2023-03-22T17:46:27Z)
- Differentiable Uncalibrated Imaging [25.67247660827913]
We propose a differentiable imaging framework to address uncertainty in measurement coordinates such as sensor locations and projection angles.
We apply implicit neural networks, also known as neural fields, which are naturally differentiable with respect to the input coordinates.
Differentiability is key as it allows us to jointly fit a measurement representation, optimize over the uncertain measurement coordinates, and perform image reconstruction which in turn ensures consistent calibration.
arXiv Detail & Related papers (2022-11-18T22:48:09Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- CorrI2P: Deep Image-to-Point Cloud Registration via Dense Correspondence [51.91791056908387]
We propose the first feature-based dense correspondence framework for addressing the image-to-point cloud registration problem, dubbed CorrI2P.
Specifically, given a pair of a 2D image and a 3D point cloud, we first transform them into a high-dimensional feature space and feed the features into a symmetric overlapping region detector to determine the region where the image and point cloud overlap.
arXiv Detail & Related papers (2022-07-12T11:49:31Z)
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud or image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z)
- Angle Based Feature Learning in GNN for 3D Object Detection using Point Cloud [4.3012765978447565]
We present new feature encoding methods for detection of 3D objects in point clouds.
We used a graph neural network (GNN) for detection of 3D objects, namely cars, pedestrians, and cyclists.
arXiv Detail & Related papers (2021-08-02T10:56:02Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
- RoIFusion: 3D Object Detection from LiDAR and Vision [7.878027048763662]
We propose a novel fusion algorithm by projecting a set of 3D Regions of Interest (RoIs) from the point clouds to the 2D RoIs of the corresponding images.
Our approach achieves state-of-the-art performance on the challenging KITTI 3D object detection benchmark.
arXiv Detail & Related papers (2020-09-09T20:23:27Z)