MGCA-Net: Multi-Graph Contextual Attention Network for Two-View Correspondence Learning
- URL: http://arxiv.org/abs/2512.23369v1
- Date: Mon, 29 Dec 2025 10:58:40 GMT
- Title: MGCA-Net: Multi-Graph Contextual Attention Network for Two-View Correspondence Learning
- Authors: Shuyuan Lin, Mengtin Lo, Haosheng Chen, Yanjie Liang, Qiangqiang Wu,
- Abstract summary: Two-view correspondence learning is a key task in computer vision. We propose a Multi-Graph Contextual Attention Network (MGCA-Net). We show that MGCA-Net significantly outperforms existing SOTA methods in the outlier rejection and camera pose estimation tasks.
- Score: 18.21720738180151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-view correspondence learning is a key task in computer vision, which aims to establish reliable matching relationships for applications such as camera pose estimation and 3D reconstruction. However, existing methods have limitations in local geometric modeling and cross-stage information optimization, which make it difficult to accurately capture the geometric constraints of matched pairs and thus reduce the robustness of the model. To address these challenges, we propose a Multi-Graph Contextual Attention Network (MGCA-Net), which consists of a Contextual Geometric Attention (CGA) module and a Cross-Stage Multi-Graph Consensus (CSMGC) module. Specifically, CGA dynamically integrates spatial position and feature information via an adaptive attention mechanism and enhances the capability to capture both local and global geometric relationships. Meanwhile, CSMGC establishes geometric consensus via a cross-stage sparse graph network, ensuring the consistency of geometric information across different stages. Experimental results on two representative datasets, YFCC100M and SUN3D, show that MGCA-Net significantly outperforms existing SOTA methods in the outlier rejection and camera pose estimation tasks. Source code is available at http://www.linshuyuan.com.
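For readers unfamiliar with the outlier rejection task named in the abstract, the following is a minimal background sketch, not the MGCA-Net method. It labels putative correspondences as inliers or outliers by their Sampson (first-order epipolar) distance to a known essential matrix. The function names, the example matrix (pure translation along x with identity rotation), the sample matches, and the threshold value are all illustrative assumptions and do not come from the paper.

```python
# Background sketch only: geometric inlier labeling for two-view matches.
# This is NOT the MGCA-Net architecture; all names and values are hypothetical.

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def sampson_distance(E, p1, p2):
    """Sampson distance of a match (p1 in view 1, p2 in view 2).

    Points are 2D normalized image coordinates. The denominator can be
    zero for degenerate inputs; this sketch does not guard against that.
    """
    x1 = [p1[0], p1[1], 1.0]
    x2 = [p2[0], p2[1], 1.0]
    Ex1 = mat_vec(E, x1)
    Etx2 = mat_vec(transpose(E), x2)
    residual = sum(a * b for a, b in zip(x2, Ex1))  # x2^T E x1
    denom = Ex1[0] ** 2 + Ex1[1] ** 2 + Etx2[0] ** 2 + Etx2[1] ** 2
    return residual ** 2 / denom


def label_inliers(E, matches, threshold=1e-4):
    """Mark a match as an inlier when its Sampson distance is below threshold.

    The threshold is an assumed illustrative value, not taken from the paper.
    """
    return [sampson_distance(E, p1, p2) < threshold for p1, p2 in matches]


# Essential matrix E = [t]_x R for R = identity and t = (1, 0, 0):
# the epipolar constraint then reduces to equal y-coordinates in both views.
E = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

matches = [
    ((0.1, 0.2), (0.3, 0.2)),    # same y: consistent with E
    ((0.0, -0.1), (0.5, -0.1)),  # same y: consistent with E
    ((0.1, 0.2), (0.3, 0.6)),    # different y: violates the constraint
]

labels = label_inliers(E, matches)
print(labels)  # [True, True, False]
```

In a learning-based setting, a network would predict such inlier labels (or weights) without access to the true essential matrix, and the pose would then be estimated from the retained matches.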
Related papers
- Multivariate Time Series Forecasting with Hybrid Euclidean-SPD Manifold Graph Neural Networks [31.893767537160258]
We propose a graph neural network-based model that captures data geometry within a hybrid Euclidean-Riemannian framework. HSMGNN achieves up to a 13.8 percent improvement over state-of-the-art baselines in forecasting accuracy.
arXiv Detail & Related papers (2025-12-16T02:42:03Z)
- GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection [1.1106255381859969]
We propose GraphFusion3D, a unified framework combining multi-modal fusion with advanced feature learning. Our approach introduces the Adaptive Cross-Modal Transformer (ACMT), which adaptively integrates image features into point representations to enrich both geometric and semantic information. For proposal refinement, we introduce the Graph Reasoning Module (GRM), a novel mechanism that models neighborhood relationships to simultaneously capture local geometric structures and global semantic context.
arXiv Detail & Related papers (2025-12-02T18:05:02Z)
- 3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding [18.76513756741288]
Current approaches mainly capture cross-task relations in the 2D image space, often leading to unstructured features lacking 3D-awareness. We propose to address this problem by integrating correlations across views, i.e., cost volume, as geometric consistency in the MTL network. Specifically, we introduce a lightweight Cross-view Module (CvM), shared across tasks, to exchange information across views and capture cross-view correlations.
arXiv Detail & Related papers (2025-11-25T18:59:34Z)
- Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph [45.115555973941255]
Relative pose estimation is crucial for various computer vision applications, including robotics and autonomous driving.
We propose a Geometric Correspondence Graph neural network that integrates point features with extra structured line segments.
This integration of matched points and line segments further exploits the geometry constraints and enhances model performance across different environments.
arXiv Detail & Related papers (2024-08-28T12:33:26Z)
- Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on some large datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation.
We propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across various levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z)
- Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on-par or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.