Related papers: GMF: General Multimodal Fusion Framework for Correspondence Outlier Rejection

URL: http://arxiv.org/abs/2211.00207v1
Date: Tue, 1 Nov 2022 01:18:46 GMT
Title: GMF: General Multimodal Fusion Framework for Correspondence Outlier Rejection
Authors: Xiaoshui Huang, Wentao Qu, Yifan Zuo, Yuming Fang, Xiaowei Zhao
Abstract summary: We propose General Multimodal Fusion to learn to reject the correspondence outliers. Our GMF achieves wide generalization ability and consistently improves the point cloud registration accuracy.
Score: 36.35090386001373
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Rejecting correspondence outliers boosts correspondence quality, which is a critical step in achieving high point cloud registration accuracy. The current state-of-the-art correspondence outlier rejection methods only utilize the structure features of the correspondences. However, texture information is critical for rejecting correspondence outliers in the human vision system. In this paper, we propose General Multimodal Fusion (GMF) to learn to reject the correspondence outliers by leveraging both the structure and texture information. Specifically, two cross-attention-based fusion layers are proposed to fuse the texture information from paired images and structure information from point correspondences. Moreover, we propose a convolutional position encoding layer to enhance the difference between tokens and enable the encoded features to pay attention to neighbor information. Our position encoding layer makes the cross-attention operation integrate both local and global information. Experiments on multiple datasets (3DMatch, 3DLoMatch, KITTI) and recent state-of-the-art models (3DRegNet, DGR, PointDSC) prove that our GMF achieves wide generalization ability and consistently improves the point cloud registration accuracy. Furthermore, several ablation studies demonstrate the robustness of the proposed GMF to different loss functions, lighting conditions and noise. The code is available at https://github.com/XiaoshuiHuang/GMF.
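
As a rough illustration of the fusion idea described in the abstract, the following Python sketch (NumPy only) applies a convolutional position encoding to structure tokens from point correspondences and then lets them attend to texture tokens from image features through one single-head cross-attention layer. It is not the authors' implementation: the function names, the 1D depthwise convolution over token order, the residual connections, the shared feature width, and the random weights are all assumptions made for illustration. The abstract mentions two fusion layers; only one direction is shown here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def conv_position_encoding(tokens, kernel):
    """Add a depthwise 1D convolution over the token axis as a residual.

    tokens: (N, D) array of token features.
    kernel: (K,) array with odd K, shared across all D channels (assumption).
    Each token is mixed with its neighbors in token order, then added back.
    """
    pad = kernel.shape[0] // 2
    padded = np.pad(tokens, ((pad, pad), (0, 0)))
    mixed = np.zeros(tokens.shape, dtype=float)
    for offset in range(kernel.shape[0]):
        mixed += kernel[offset] * padded[offset:offset + tokens.shape[0]]
    return tokens + mixed


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: queries attend to context tokens.

    queries: (N, D) structure tokens, context: (M, D) texture tokens.
    Returns the fused (N, D) tokens and the (N, M) attention weights.
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return queries + weights @ v, weights


# Hypothetical sizes: 6 correspondences, 10 image tokens, feature width 8.
rng = np.random.default_rng(0)
structure_tokens = rng.normal(size=(6, 8))
texture_tokens = rng.normal(size=(10, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))

encoded = conv_position_encoding(structure_tokens, np.array([0.25, 0.5, 0.25]))
fused_tokens, attention = cross_attention(encoded, texture_tokens, w_q, w_k, w_v)
print(fused_tokens.shape, attention.shape)
```

In this sketch the convolution supplies local (neighbor) context before attention supplies global context, which mirrors the local-plus-global claim in the abstract only loosely.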

Related papers

DAGLFNet: Deep Attention-Guided Global-Local Feature Fusion for Pseudo-Image Point Cloud Segmentation [6.418552842518015]
We propose DAGLFNet, a pseudo-image-based representation method to extract discriminative features from point clouds. The method balances high performance with real-time capability, demonstrating great potential for LiDAR-based real-time applications.
arXiv Detail & Related papers (2025-10-12T06:35:03Z)
CORE-ReID: Comprehensive Optimization and Refinement through Ensemble fusion in Domain Adaptation for person re-identification [0.0]
This study introduces a novel framework, "Comprehensive Optimization and Refinement through Ensemble Fusion in Domain Adaptation for Person Re-identification". The framework utilizes CycleGAN to generate diverse data that harmonizes differences in image characteristics from different camera sources in the pre-training stage. In the fine-tuning stage, based on a pair of teacher-student networks, the framework integrates multi-view features for multi-level clustering to derive diverse pseudo labels.
arXiv Detail & Related papers (2025-08-05T04:25:03Z)
PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network [24.54269823691119]
We present an advanced study on the more challenging task of high-resolution salient object detection (HRSOD) from both dataset and network framework perspectives. To compensate for the lack of HRSOD datasets, we thoughtfully collect a large-scale high-resolution salient object detection dataset, called UHRSD. All the images are finely annotated at the pixel level, far exceeding previous low-resolution SOD datasets.
arXiv Detail & Related papers (2024-08-02T09:31:21Z)
A Semantic-Aware and Multi-Guided Network for Infrared-Visible Image Fusion [41.34335755315773]
This paper focuses on how to model correlation-driven decomposing features and reason about high-level graph representations. We propose a three-branch encoder-decoder architecture along with corresponding fusion layers as the fusion strategy. Experiments demonstrate competitive results compared with state-of-the-art methods in visible/infrared image fusion and medical image fusion tasks.
arXiv Detail & Related papers (2024-06-11T09:32:40Z)
UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
We propose UGMAE, a unified framework for graph masked autoencoders. We first develop an adaptive feature mask generator to account for the unique significance of nodes. We then design a ranking-based structure reconstruction objective, used jointly with feature reconstruction, to capture holistic graph information.
arXiv Detail & Related papers (2024-02-12T19:39:26Z)
Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs. Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features. Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version) [13.286247750893681]
3D point cloud registration is a fundamental problem in computer vision and robotics. We propose a novel deep graph matching-based framework for point cloud registration.
arXiv Detail & Related papers (2022-11-09T06:05:25Z)
Camouflaged Object Detection via Context-aware Cross-level Fusion [10.942917945534678]
Camouflaged object detection (COD) aims to identify the objects that conceal themselves in natural scenes. We propose a novel Context-aware Cross-level Fusion Network (C2F-Net), which fuses context-aware cross-level features. C2F-Net is an effective COD model and outperforms state-of-the-art (SOTA) models remarkably.
arXiv Detail & Related papers (2022-07-27T08:34:16Z)
Robust Partial-to-Partial Point Cloud Registration in a Full Range [12.86951061306046]
We propose Graph Matching Consensus Network (GMCNet), which estimates pose-invariant correspondences for full-range Partial-to-Partial point cloud Registration (PPR). GMCNet encodes point descriptors for each point cloud individually, without using cross-contextual information or ground truth correspondences for training.
arXiv Detail & Related papers (2021-11-30T17:56:24Z)
Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation. We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds. We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across various levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z)
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework that learns high-order relation and topology information for discriminative features and robust alignment. Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.