Related papers: SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching

URL: http://arxiv.org/abs/2508.02278v1
Date: Mon, 04 Aug 2025 10:46:53 GMT
Title: SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching
Authors: Xiangzeng Liu, Chi Wang, Guanglu Shi, Xiaodong Zhang, Qiguang Miao, Miao Fan,
Abstract summary: We introduce the Semantic and Geometric-aware Descriptor Network (SGAD), which fundamentally rethinks area-based matching. SGAD generates highly discriminative area descriptors that enable direct matching without complex graph optimization. We further improve the performance of area matching through a novel supervision strategy that decomposes the area matching task into classification and ranking subtasks.
Score: 16.683203139962153
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Local feature matching remains a fundamental challenge in computer vision. Recent Area to Point Matching (A2PM) methods have improved matching accuracy. However, existing research based on this framework relies on inefficient pixel-level comparisons and complex graph matching that limit scalability. In this work, we introduce the Semantic and Geometric-aware Descriptor Network (SGAD), which fundamentally rethinks area-based matching by generating highly discriminative area descriptors that enable direct matching without complex graph optimization. This approach significantly improves both accuracy and efficiency of area matching. We further improve the performance of area matching through a novel supervision strategy that decomposes the area matching task into classification and ranking subtasks. Finally, we introduce the Hierarchical Containment Redundancy Filter (HCRF) to eliminate overlapping areas by analyzing containment graphs. SGAD demonstrates remarkable performance gains, reducing runtime by 60x (0.82s vs. 60.23s) compared to MESA. Extensive evaluations show consistent improvements across multiple point matchers: SGAD+LoFTR reduces runtime compared to DKM, while achieving higher accuracy (0.82s vs. 1.51s, 65.98 vs. 61.11) in outdoor pose estimation, and SGAD+ROMA delivers +7.39% AUC@5° in indoor pose estimation, establishing a new state-of-the-art.
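Illustrative sketch (not from the paper): the abstract says area descriptors enable "direct matching without complex graph optimization" but does not say how descriptors are compared. The example below assumes one common choice, mutual nearest neighbours under cosine similarity, to show what descriptor-level matching can look like. The function names, the `min_similarity` threshold, and the toy descriptors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero descriptor vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mutual_nearest_matches(desc_a, desc_b, min_similarity=0.5):
    """Match areas whose descriptors are each other's nearest neighbour.

    desc_a, desc_b: lists of descriptor vectors, one per candidate area.
    min_similarity: hypothetical acceptance threshold (an assumption).
    Returns a list of (index_in_a, index_in_b, similarity) tuples.
    """
    # Full similarity matrix: sim[i][j] compares area i of A with area j of B.
    sim = [[cosine_similarity(a, b) for b in desc_b] for a in desc_a]

    # Best partner in B for every area in A, and vice versa.
    best_b_for_a = [max(range(len(desc_b)), key=lambda j: sim[i][j])
                    for i in range(len(desc_a))]
    best_a_for_b = [max(range(len(desc_a)), key=lambda i: sim[i][j])
                    for j in range(len(desc_b))]

    matches = []
    for i, j in enumerate(best_b_for_a):
        # Keep the pair only if the choice is mutual and similar enough.
        if best_a_for_b[j] == i and sim[i][j] >= min_similarity:
            matches.append((i, j, sim[i][j]))
    return matches


if __name__ == "__main__":
    # Toy 2-D descriptors standing in for learned area descriptors.
    areas_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    areas_b = [[0.0, 2.0], [3.0, 0.0]]
    for i, j, s in mutual_nearest_matches(areas_a, areas_b):
        print(f"area A{i} <-> area B{j} (similarity {s:.2f})")
```

The mutual check discards one-sided matches, so each area takes part in at most one pair and no graph optimization step is needed in this simplified setting.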

Related papers

Enhancing point cloud analysis via neighbor aggregation correction based on cross-stage structure correlation [22.48120946682699]
Point cloud analysis is a cornerstone of many downstream tasks, among which aggregating local structures is the basis for understanding point cloud data. We propose the Point Distribution Set Abstraction module (PDSA) that utilizes the correlation in the high-dimensional space to correct the feature distribution during aggregation. PDSA distinguishes the point correlation based on a lightweight cross-stage structural descriptor, and enhances structural homogeneity.
arXiv Detail & Related papers (2025-06-18T06:08:17Z)
MESA: Effective Matching Redundancy Reduction by Semantic Area Segmentation [22.288675638485618]
We propose MESA and DMESA as novel feature matching methods. MESA establishes implicit-semantic area matching prior to point matching, based on advanced image understanding of SAM. With less repetitive computation, DMESA showcases a speed improvement of nearly five times compared to MESA.
arXiv Detail & Related papers (2024-08-01T04:39:36Z)
AffineGlue: Joint Matching and Robust Estimation [74.04609046690913]
We propose AffineGlue, a method for joint two-view feature matching and robust estimation. AffineGlue selects potential matches from one-to-many correspondences to estimate minimal models. Guided matching is then used to find matches consistent with the model, suffering less from the ambiguities of one-to-one matches.
arXiv Detail & Related papers (2023-07-28T08:05:36Z)
Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching [16.16319526547664]
We set the initial search space for point matching as the matched image areas containing prominent semantic, named semantic area matches. This search space favors point matching by salient features and alleviates the accuracy limitation in recent Transformer-based matching methods. We propose a hierarchical feature matching framework: Area to Point Matching (A2PM), to first find semantic area matches between images and later perform point matching on area matches.
arXiv Detail & Related papers (2023-04-29T08:16:12Z)
Adaptive Spot-Guided Transformer for Consistent Local Feature Matching [64.30749838423922]
We propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching. ASTR models the local consistency and scale variations in a unified coarse-to-fine architecture.
arXiv Detail & Related papers (2023-03-29T12:28:01Z)
Federated Minimax Optimization: Improved Convergence Analyses and Algorithms [32.062312674333775]
We consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications such as GANs. We provide a novel and tighter algorithm analysis that improves the convergence and communication guarantees in the existing literature.
arXiv Detail & Related papers (2022-03-09T16:21:31Z)
Guide Local Feature Matching by Overlap Estimation [9.387323456222823]
We introduce a novel Overlap Estimation method conditioned on image pairs with TRansformer, named OETR. OETR performs overlap estimation in a two-step process of feature correlation and then overlap regression. Experiments show that OETR can boost state-of-the-art local feature matching performance substantially.
arXiv Detail & Related papers (2022-02-18T07:11:36Z)
Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time. We show that the communication complexity is better than that of non-local methods when the heterogeneity of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency. This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the above approximation. In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z)
Higher Performance Visual Tracking with Dual-Modal Localization [106.91097443275035]
Visual Object Tracking (VOT) has synchronous needs for both robustness and accuracy. We propose a dual-modal framework for target localization, consisting of robust localization that suppresses distractors via ONR and accurate localization that attends precisely to the target center via OFC.
arXiv Detail & Related papers (2021-03-18T08:47:56Z)
Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences. We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline. Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.