PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised
RGB-D Point Cloud Registration
- URL: http://arxiv.org/abs/2308.04782v1
- Date: Wed, 9 Aug 2023 08:13:46 GMT
- Title: PointMBF: A Multi-scale Bidirectional Fusion Network for Unsupervised
RGB-D Point Cloud Registration
- Authors: Mingzhi Yuan, Kexue Fu, Zhihao Li, Yucong Meng, Manning Wang
- Abstract summary: We propose a network implementing multi-scale bidirectional fusion between RGB images and point clouds generated from depth images.
Our method achieves new state-of-the-art performance.
- Score: 6.030097207369754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud registration is a task to estimate the rigid transformation
between two unaligned scans, which plays an important role in many computer
vision applications. Previous learning-based works commonly focus on supervised
registration, which has limitations in practice. Recently, with the advance of
inexpensive RGB-D sensors, several learning-based works have utilized RGB-D data
to achieve unsupervised registration. However, most existing unsupervised
methods follow a cascaded design or fuse RGB-D data in a unidirectional manner,
which do not fully exploit the complementary information in the RGB-D data. To
leverage the complementary information more effectively, we propose a network
implementing multi-scale bidirectional fusion between RGB images and point
clouds generated from depth images. By bidirectionally fusing visual and
geometric features at multiple scales, more distinctive deep features for
correspondence estimation can be obtained, making our registration more
accurate. Extensive experiments on ScanNet and 3DMatch demonstrate that our
method achieves new state-of-the-art performance. Code will be released at
https://github.com/phdymz/PointMBF
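To make the fusion idea concrete, below is a minimal PyTorch sketch of one bidirectional fusion block, assuming each 3D point has already been associated with a pixel index through the camera projection; the class name, layer choices, and shapes are illustrative assumptions, not the authors' released implementation.
```python
# Minimal sketch of one bidirectional fusion block (not the authors' code).
# Assumes each 3D point already maps to a pixel index via camera projection.
import torch
import torch.nn as nn


class BidirectionalFusionBlock(nn.Module):
    def __init__(self, img_dim: int, pts_dim: int):
        super().__init__()
        # Project each modality's features into the other's channel space.
        self.img_to_pts = nn.Linear(img_dim, pts_dim)
        self.pts_to_img = nn.Linear(pts_dim, img_dim)

    def forward(self, img_feat, pts_feat, pix_idx):
        # img_feat: (B, H*W, C_img)  flattened per-pixel features
        # pts_feat: (B, N, C_pts)    per-point features
        # pix_idx:  (B, N) long      pixel index each point projects to
        idx = pix_idx.unsqueeze(-1).expand(-1, -1, img_feat.size(-1))

        # 2D -> 3D: gather the image feature at each point's pixel.
        gathered = torch.gather(img_feat, 1, idx)        # (B, N, C_img)
        pts_out = pts_feat + self.img_to_pts(gathered)   # fuse into points

        # 3D -> 2D: scatter point features back onto their pixels.
        proj = self.pts_to_img(pts_feat)                 # (B, N, C_img)
        img_out = img_feat.scatter_add(1, idx, proj)     # fuse into image
        return img_out, pts_out
```
In a full network, a block like this would sit at several encoder stages of both branches, which is roughly what the multi-scale design suggests.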
Related papers
- DFormer: Rethinking RGBD Representation Learning for Semantic
Segmentation [76.81628995237058]
DFormer is a novel framework to learn transferable representations for RGB-D segmentation tasks.
It pretrains the backbone using image-depth pairs from ImageNet-1K.
DFormer achieves new state-of-the-art performance on two popular RGB-D tasks.
arXiv Detail & Related papers (2023-09-18T11:09:11Z)
- Improving RGB-D Point Cloud Registration by Learning Multi-scale Local
Linear Transformation [38.64501645574878]
Point cloud registration aims at estimating the geometric transformation between two point cloud scans.
Recent point cloud registration methods have tried to exploit RGB-D data to achieve more accurate correspondences.
We propose a new Geometry-Aware Visual Feature Extractor (GAVE) that employs multi-scale local linear transformation.
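As a rough illustration only (GAVE's actual design is not specified in this summary), the sketch below conditions a per-pixel linear (affine) transform of the visual features on geometric features derived from depth; every layer and name is an assumption.
```python
# Hedged sketch: depth-conditioned local linear transform of visual features,
# loosely in the spirit of GAVE (not the paper's implementation).
import torch.nn as nn


class LocalLinearTransform(nn.Module):
    def __init__(self, vis_dim: int, geo_dim: int):
        super().__init__()
        # Predict a per-pixel scale and bias for the visual features
        # from the geometric (depth-derived) features.
        self.to_scale = nn.Conv2d(geo_dim, vis_dim, kernel_size=1)
        self.to_bias = nn.Conv2d(geo_dim, vis_dim, kernel_size=1)

    def forward(self, vis_feat, geo_feat):
        # vis_feat: (B, C_v, H, W), geo_feat: (B, C_g, H, W)
        return vis_feat * self.to_scale(geo_feat) + self.to_bias(geo_feat)


# Applied independently at each scale of a feature pyramid:
# fused = [LocalLinearTransform(cv, cg)(v, g) for (v, g) in pyramid]
```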
arXiv Detail & Related papers (2022-08-31T14:36:09Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Modality-Guided Subnetwork for Salient Object Detection [5.491692465987937]
Most RGB-D networks require multi-modal inputs and feed them separately through a two-stream design.
We present in this paper a novel fusion design named modality-guided subnetwork (MGSnet).
It has the following superior designs: 1) the model works for both RGB and RGB-D data, dynamically estimating depth when it is not available.
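A minimal sketch of this depth-fallback idea, with placeholder layers standing in for MGSnet's actual subnetworks:
```python
# Illustrative only: use the provided depth map if given, otherwise
# estimate one from RGB. The layers are placeholders, not MGSnet.
import torch
import torch.nn as nn


class DepthAwareModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.depth_estimator = nn.Conv2d(3, 1, 3, padding=1)  # placeholder
        self.fuse = nn.Conv2d(4, 16, 3, padding=1)            # RGB + depth

    def forward(self, rgb, depth=None):
        if depth is None:
            # RGB-only input: fall back to estimated depth.
            depth = self.depth_estimator(rgb)
        return self.fuse(torch.cat([rgb, depth], dim=1))
```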
arXiv Detail & Related papers (2021-10-10T20:59:11Z)
- RGB-D Saliency Detection via Cascaded Mutual Information Minimization [122.8879596830581]
Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning.
We introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
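Exact mutual information between deep features is intractable; purely as a loose stand-in for the paper's estimator, the sketch below penalizes cross-correlation between the two modalities' features as a redundancy-reduction proxy:
```python
# Loose illustration of a redundancy-reduction penalty between RGB and
# depth features; a proxy, not the paper's mutual-information estimator.
import torch


def redundancy_penalty(f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
    # f_rgb, f_depth: (B, D) pooled feature vectors from each branch.
    f_rgb = (f_rgb - f_rgb.mean(0)) / (f_rgb.std(0) + 1e-6)
    f_depth = (f_depth - f_depth.mean(0)) / (f_depth.std(0) + 1e-6)
    # Cross-correlation matrix between the two modalities' dimensions.
    c = f_rgb.T @ f_depth / f_rgb.size(0)  # (D, D)
    # Driving all cross-correlations toward zero discourages the two
    # branches from encoding the same (redundant) information.
    return (c ** 2).sum()
```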
arXiv Detail & Related papers (2021-09-15T12:31:27Z)
- Self-Supervised Representation Learning for RGB-D Salient Object
Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets for pre-training, enabling the network to capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
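A minimal sketch of the cross-modal auto-encoder pretext task: predict depth from RGB, so no manual labels are needed. The tiny encoder/decoder here are placeholders, not the paper's network:
```python
# Sketch of a cross-modal auto-encoder pretext task (illustrative only).
import torch
import torch.nn as nn


class CrossModalAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_enc = nn.Conv2d(3, 16, 3, padding=1)    # placeholder encoder
        self.depth_dec = nn.Conv2d(16, 1, 3, padding=1)  # placeholder decoder

    def forward(self, rgb):
        return self.depth_dec(torch.relu(self.rgb_enc(rgb)))


model = CrossModalAE()
rgb = torch.randn(2, 3, 64, 64)
depth = torch.randn(2, 1, 64, 64)
# Pre-training signal: reconstruct depth from RGB (no manual labels needed).
loss = nn.functional.l1_loss(model(rgb), depth)
```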
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
- Cascade Graph Neural Networks for RGB-D Salient Object Detection [41.57218490671026]
We study the problem of salient object detection (SOD) for RGB-D images using both color and depth information.
We introduce Cascade Graph Neural Networks (Cas-Gnn), a unified framework capable of comprehensively distilling and reasoning about the mutual benefits between these two data sources.
Cas-Gnn achieves significantly better performance than all existing RGB-D SOD approaches on several widely-used benchmarks.
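The graph design is not detailed in this summary; purely as a schematic, cross-modality reasoning can be pictured as message passing between RGB-node and depth-node embeddings:
```python
# Very schematic cross-modality message passing (not Cas-Gnn's actual graph).
import torch.nn as nn


class CrossModalMessage(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.msg_r2d = nn.Linear(dim, dim)  # messages RGB -> depth nodes
        self.msg_d2r = nn.Linear(dim, dim)  # messages depth -> RGB nodes

    def forward(self, rgb_nodes, depth_nodes):
        # rgb_nodes, depth_nodes: (B, N, D) node embeddings per modality.
        # Each node set is updated with a pooled message from the other.
        rgb_new = rgb_nodes + self.msg_d2r(depth_nodes).mean(1, keepdim=True)
        depth_new = depth_nodes + self.msg_r2d(rgb_nodes).mean(1, keepdim=True)
        return rgb_new, depth_new
```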
arXiv Detail & Related papers (2020-08-07T10:59:04Z)
- Bi-directional Cross-Modality Feature Propagation with
Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGBD images for providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
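A rough sketch of gated cross-modality recalibration in the spirit of the separation-and-aggregation idea; the gate and layers here are assumptions, not the paper's SA-Gate:
```python
# Rough sketch of gated cross-modality recalibration (illustrative only).
import torch
import torch.nn as nn


class GatedRecalibration(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # A shared gate predicted from both modalities decides, per pixel,
        # how much to trust each recalibrated representation.
        self.gate = nn.Sequential(nn.Conv2d(2 * dim, 1, 1), nn.Sigmoid())
        self.recal_rgb = nn.Conv2d(dim, dim, 1)
        self.recal_depth = nn.Conv2d(dim, dim, 1)

    def forward(self, rgb, depth):
        # rgb, depth: (B, D, H, W) feature maps.
        g = self.gate(torch.cat([rgb, depth], dim=1))  # (B, 1, H, W)
        r = self.recal_rgb(rgb)
        d = self.recal_depth(depth)
        # Aggregate: spatially-varying convex combination of the two.
        return g * r + (1 - g) * d
```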
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt at realizing a unified depth-aware framework that uses only RGB information as input for inference.
It not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also surpasses RGB-D-based methods on five benchmarks by a large margin.
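A minimal sketch of the underlying trick, treating depth as auxiliary supervision during training so that inference needs RGB alone; all layers are placeholders, not the paper's architecture:
```python
# Sketch: depth used only as auxiliary training supervision (illustrative).
import torch
import torch.nn as nn

backbone = nn.Conv2d(3, 16, 3, padding=1)
saliency_head = nn.Conv2d(16, 1, 1)
depth_head = nn.Conv2d(16, 1, 1)  # auxiliary head, dropped at inference

rgb = torch.randn(2, 3, 64, 64)
gt_sal = torch.rand(2, 1, 64, 64)
gt_depth = torch.rand(2, 1, 64, 64)

feat = torch.relu(backbone(rgb))
# Training: saliency loss + auxiliary depth loss make features depth-aware.
loss = (nn.functional.binary_cross_entropy_with_logits(saliency_head(feat), gt_sal)
        + nn.functional.l1_loss(depth_head(feat), gt_depth))
# Inference: saliency_head(backbone(rgb)) only; no depth input required.
```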
arXiv Detail & Related papers (2020-05-30T13:40:03Z)