Robust Scene Coordinate Regression via Geometrically-Consistent Global Descriptors
- URL: http://arxiv.org/abs/2512.17226v1
- Date: Fri, 19 Dec 2025 04:24:03 GMT
- Title: Robust Scene Coordinate Regression via Geometrically-Consistent Global Descriptors
- Authors: Son Tung Nguyen, Tobias Fischer, Alejandro Fontan, Michael Milford
- Abstract summary: We propose an aggregator module that learns global descriptors consistent with both geometrical structure and visual similarity. This corrects erroneous associations caused by unreliable overlap scores. Experiments on challenging benchmarks show substantial localization gains in large-scale environments.
- Score: 52.57327385675752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent learning-based visual localization methods use global descriptors to disambiguate visually similar places, but existing approaches often derive these descriptors from geometric cues alone (e.g., covisibility graphs), limiting their discriminative power and reducing robustness in the presence of noisy geometric constraints. We propose an aggregator module that learns global descriptors consistent with both geometrical structure and visual similarity, ensuring that images are close in descriptor space only when they are visually similar and spatially connected. This corrects erroneous associations caused by unreliable overlap scores. Using a batch-mining strategy based solely on the overlap scores and a modified contrastive loss, our method trains without manual place labels and generalizes across diverse environments. Experiments on challenging benchmarks show substantial localization gains in large-scale environments while preserving computational and memory efficiency. Code is available at https://github.com/sontung/robust_scr.
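The idea that two images should be close in descriptor space only when they are both spatially connected and visually similar can be illustrated with a toy contrastive loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `gated_contrastive_loss`, the `visual_sim` input, and the thresholds and margin are all hypothetical and chosen only for illustration.

```python
import numpy as np


def gated_contrastive_loss(desc, overlap, visual_sim,
                           overlap_thresh=0.2, sim_thresh=0.5, margin=0.5):
    """Toy contrastive loss over one batch of global descriptors.

    desc:       (N, D) descriptors, one row per image.
    overlap:    (N, N) geometric overlap scores in [0, 1] (may be noisy).
    visual_sim: (N, N) visual similarity scores in [0, 1].
    All thresholds and the margin are illustrative assumptions.
    """
    # L2-normalize so Euclidean distance is a function of cosine similarity.
    desc = desc / np.linalg.norm(desc, axis=1, keepdims=True)
    cos = desc @ desc.T
    dist = np.sqrt(np.maximum(2.0 - 2.0 * cos, 0.0))

    off_diag = ~np.eye(len(desc), dtype=bool)
    connected = overlap > overlap_thresh

    # Positives must be geometrically connected AND visually similar.
    pos = connected & (visual_sim > sim_thresh) & off_diag
    # Negatives are pairs with no meaningful overlap.
    neg = ~connected & off_diag
    # Connected but visually dissimilar pairs are ignored, so a spurious
    # overlap score does not pull unrelated images together.

    pos_term = np.sum(dist[pos] ** 2)
    neg_term = np.sum(np.maximum(margin - dist[neg], 0.0) ** 2)
    n_pairs = max(int(pos.sum() + neg.sum()), 1)
    return float((pos_term + neg_term) / n_pairs)


# Images 0 and 1 show the same place; image 2 is a different place whose
# overlap score with image 0 is erroneously high.
desc = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
overlap = np.array([[1.0, 0.9, 0.9], [0.9, 1.0, 0.0], [0.9, 0.0, 1.0]])
visual_sim = np.array([[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]])
loss_gated = gated_contrastive_loss(desc, overlap, visual_sim)
loss_ungated = gated_contrastive_loss(desc, overlap, np.ones((3, 3)))
```

In this toy batch the gated loss ignores the spurious pair (0, 2), whereas treating every connected pair as a positive (the `loss_ungated` call) penalizes the descriptors for keeping the two different places apart.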
Related papers
- REGRACE: A Robust and Efficient Graph-based Re-localization Algorithm using Consistency Evaluation [23.41000678070751]
Loop closures are essential for correcting odometry drift and creating consistent maps. Current methods using dense point clouds for accurate place recognition do not scale well due to computationally expensive scan-to-scan comparisons. We introduce REGRACE, a novel approach that addresses these challenges of scalability and perspective difference in re-localization.
arXiv Detail & Related papers (2025-03-05T15:32:38Z)
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [52.57327385675752]
Direct 2D-3D matching requires significantly less memory but suffers from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator. We achieve performance close to hierarchical methods while using 43% less memory and running 1.6 times faster.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
- Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching [0.0]
We propose a new technique, based on graph Laplacian eigenmaps, to match point clouds by taking into account fine local structures.
To deal with the order and sign ambiguity of Laplacian eigenmaps, we introduce a new operator, called Coupled Laplacian.
We show that the similarity between those aligned high-dimensional spaces provides a locally meaningful score to match shapes.
arXiv Detail & Related papers (2024-02-27T10:10:12Z)
- High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data annotation cost by surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs: importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: their performance is improved and they become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- DenseGAP: Graph-Structured Dense Correspondence Learning with Anchor Points [15.953570826460869]
Establishing dense correspondence between two images is a fundamental computer vision problem.
We introduce DenseGAP, a new solution for efficient Dense correspondence learning with a Graph-structured neural network conditioned on Anchor Points.
Our method advances the state-of-the-art of correspondence learning on most benchmarks.
arXiv Detail & Related papers (2021-12-13T18:59:30Z)
- Viewpoint Invariant Dense Matching for Visual Geolocalization [15.8038460597256]
We propose a novel method for image matching based on dense local features and tailored for visual geolocalization.
Our method, called GeoWarp, directly embeds invariance to viewpoint shifts in the process of extracting dense features.
GeoWarp is implemented efficiently as a re-ranking method that can be easily embedded into pre-existing visual geolocalization pipelines.
arXiv Detail & Related papers (2021-09-20T20:17:38Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)