GMF: General Multimodal Fusion Framework for Correspondence Outlier
Rejection
- URL: http://arxiv.org/abs/2211.00207v1
- Date: Tue, 1 Nov 2022 01:18:46 GMT
- Title: GMF: General Multimodal Fusion Framework for Correspondence Outlier
Rejection
- Authors: Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao
- Abstract summary: We propose General Multimodal Fusion to learn to reject the correspondence outliers.
Our GMF achieves wide generalization ability and consistently improves the point cloud registration accuracy.
- Score: 36.35090386001373
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rejecting correspondence outliers boosts the correspondence
quality, which is a critical step in achieving high point cloud registration
accuracy. The current state-of-the-art correspondence outlier rejection methods
only utilize the structure features of the correspondences. However, texture
information is critical for rejecting correspondence outliers in our human
vision system. In this paper, we propose General Multimodal Fusion (GMF) to
learn to reject the correspondence outliers by leveraging both the structure
and texture information. Specifically, two cross-attention-based fusion layers
are proposed to fuse the texture information from paired images and structure
information from point correspondences. Moreover, we propose a convolutional
position encoding layer to enhance the difference between tokens and enable the
encoded features to attend to neighbor information. Our position encoding
layer lets the cross-attention operation integrate both local and global
information. Experiments on multiple datasets (3DMatch, 3DLoMatch, KITTI) and
recent state-of-the-art models (3DRegNet, DGR, PointDSC) prove that our GMF
achieves wide generalization ability and consistently improves the point cloud
registration accuracy. Furthermore, several ablation studies demonstrate the
robustness of the proposed GMF to different loss functions, lighting conditions,
and noise. The code is available at https://github.com/XiaoshuiHuang/GMF.
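The fusion idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that): the function names, the single attention head, the random weights, and the 1D depthwise convolution along the token axis are all assumptions chosen to keep the example small. It shows structure tokens (from point correspondences) querying texture tokens (from image patches) after a convolutional position encoding has mixed local neighbor information into each token.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def conv_position_encoding(tokens, kernel):
    """Add a depthwise 1D convolution of the tokens to the tokens themselves.

    tokens: (N, D) array; kernel: (K, D) array with odd K.
    Each output token mixes in its K nearest neighbors along the token
    axis (zero padded). Hypothetical stand-in for the paper's layer.
    """
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)))
    local = np.zeros_like(tokens)
    for offset in range(k):
        local += padded[offset:offset + tokens.shape[0]] * kernel[offset]
    return tokens + local


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention with a residual connection."""
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # (N_q, N_ctx), each row sums to 1
    return queries + weights @ v, weights


rng = np.random.default_rng(0)
dim = 8
structure = rng.normal(size=(5, dim))    # 5 correspondence tokens (assumed)
texture = rng.normal(size=(12, dim))     # 12 image patch tokens (assumed)
kernel = rng.normal(size=(3, dim)) * 0.1
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

# Local information enters through the convolution, global information
# through attention over every texture token.
structure_pe = conv_position_encoding(structure, kernel)
texture_pe = conv_position_encoding(texture, kernel)
fused, attn = cross_attention(structure_pe, texture_pe, w_q, w_k, w_v)
print(fused.shape, attn.shape)  # (5, 8) (5, 12)
```

Swapping the roles of the two token sets (texture as queries, structure as context) would give a second fusion direction. Whether that matches the two fusion layers in the paper is not stated in the abstract.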
Related papers
- DAGLFNet: Deep Attention-Guided Global-Local Feature Fusion for Pseudo-Image Point Cloud Segmentation [6.418552842518015]
We propose DAGLFNet, a pseudo-image-based representation method to extract discriminative features from point clouds.
The method balances high performance with real-time capability, demonstrating great potential for LiDAR-based real-time applications.
arXiv Detail & Related papers (2025-10-12T06:35:03Z) - CORE-ReID: Comprehensive Optimization and Refinement through Ensemble fusion in Domain Adaptation for person re-identification [0.0]
This study introduces a novel framework, "Comprehensive Optimization and Refinement through Ensemble Fusion in Domain Adaptation for Person Re-identification".
The framework utilizes CycleGAN to generate diverse data that harmonizes differences in image characteristics from different camera sources in the pre-training stage.
In the fine-tuning stage, based on a pair of teacher-student networks, the framework integrates multi-view features for multi-level clustering to derive diverse pseudo labels.
arXiv Detail & Related papers (2025-08-05T04:25:03Z) - PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on more challenging high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives.
To compensate for the lack of HRSOD dataset, we thoughtfully collect a large-scale high resolution salient object detection dataset, called UHRSD.
All the images are finely annotated in pixel-level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z) - A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion [41.34335755315773]
This paper focuses on how to model correlation-driven decomposing features and reason high-level graph representation.
We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy.
Experiments demonstrate the competitive results compared with state-of-the-art methods in visible/infrared image fusion and medical image fusion tasks.
arXiv Detail & Related papers (2024-06-11T09:32:40Z) - UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
We propose UGMAE, a unified framework for graph masked autoencoders.
We first develop an adaptive feature mask generator to account for the unique significance of nodes.
We then design a ranking-based structure reconstruction objective joint with feature reconstruction to capture holistic graph information.
arXiv Detail & Related papers (2024-02-12T19:39:26Z) - Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z) - Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version) [13.286247750893681]
3D point cloud registration is a fundamental problem in computer vision and robotics.
We propose a novel deep graph matching-based framework for point cloud registration.
arXiv Detail & Related papers (2022-11-09T06:05:25Z) - Camouflaged Object Detection via Context-aware Cross-level Fusion [10.942917945534678]
Camouflaged object detection (COD) aims to identify the objects that conceal themselves in natural scenes.
We propose a novel Context-aware Cross-level Fusion Network (C2F-Net), which fuses context-aware cross-level features.
C2F-Net is an effective COD model and outperforms state-of-the-art (SOTA) models remarkably.
arXiv Detail & Related papers (2022-07-27T08:34:16Z) - Robust Partial-to-Partial Point Cloud Registration in a Full Range [12.86951061306046]
We propose Graph Matching Consensus Network (GMCNet), which estimates pose-invariant correspondences for full-range Partial-to-Partial point cloud Registration (PPR).
GMCNet encodes point descriptors for each point cloud individually without using cross-contextual information or ground truth correspondences for training.
arXiv Detail & Related papers (2021-11-30T17:56:24Z) - Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across various levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z) - High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)