Gromov Wasserstein Optimal Transport for Semantic Correspondences
- URL: http://arxiv.org/abs/2602.03105v1
- Date: Tue, 03 Feb 2026 04:59:28 GMT
- Title: Gromov Wasserstein Optimal Transport for Semantic Correspondences
- Authors: Francis Snelgar, Stephen Gould, Ming Xu, Liang Zheng, Akshay Asthana,
- Abstract summary: We show that we can significantly boost the performance of the DINOv2 baseline, and be competitive and sometimes surpassing state-of-the-art methods.<n>We replace the standard nearest neighbours matching with an optimal transport algorithm that includes a Gromov Wasserstein spatial smoothness prior.
- Score: 38.64509144392513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Establishing correspondences between image pairs is a long studied problem in computer vision. With recent large-scale foundation models showing strong zero-shot performance on downstream tasks including classification and segmentation, there has been interest in using the internal feature maps of these models for the semantic correspondence task. Recent works observe that features from DINOv2 and Stable Diffusion (SD) are complementary, the former producing accurate but sparse correspondences, while the latter produces spatially consistent correspondences. As a result, current state-of-the-art methods for semantic correspondence involve combining features from both models in an ensemble. While the performance of these methods is impressive, they are computationally expensive, requiring evaluating feature maps from large-scale foundation models. In this work we take a different approach, instead replacing SD features with a superior matching algorithm which is imbued with the desirable spatial consistency property. Specifically, we replace the standard nearest neighbours matching with an optimal transport algorithm that includes a Gromov Wasserstein spatial smoothness prior. We show that we can significantly boost the performance of the DINOv2 baseline, and be competitive and sometimes surpassing state-of-the-art methods using Stable Diffusion features, while being 5--10x more efficient. We make code available at https://github.com/fsnelgar/semantic_matching_gwot .
Related papers
- DiffuMatch: Category-Agnostic Spectral Diffusion Priors for Robust Non-rigid Shape Matching [53.39693288324375]
We show that both in-network regularization and functional map training can be replaced with data-driven methods.<n>We first train a generative model of functional maps in the spectral domain using score-based generative modeling.<n>We then exploit the resulting model to promote the structural properties of ground truth functional maps on new shape collections.
arXiv Detail & Related papers (2025-07-31T16:44:54Z) - A Top-down Graph-based Tool for Modeling Classical Semantic Maps: A Crosslinguistic Case Study of Supplementary Adverbs [50.982315553104975]
Semantic map models (SMMs) construct a network-like conceptual space from cross-linguistic instances or forms.<n>Most SMMs are manually built by human experts using bottom-up procedures.<n>We propose a novel graph-based algorithm that automatically generates conceptual spaces and SMMs in a top-down manner.
arXiv Detail & Related papers (2024-12-02T12:06:41Z) - Leveraging Semantic Cues from Foundation Vision Models for Enhanced Local Feature Correspondence [12.602194710071116]
This paper presents a new method that uses semantic cues from foundation vision model features to enhance local feature matching.
We present adapted versions of six existing descriptors, with an average increase in performance of 29% in camera localization.
arXiv Detail & Related papers (2024-10-12T13:45:26Z) - Zero-Shot Image Feature Consensus with Deep Functional Maps [20.988872402347756]
We show that a better correspondence strategy is available, which directly imposes structure on the correspondence field: the functional map.
We demonstrate that our technique yields correspondences that are not only smoother but also more accurate, with the possibility of better reflecting the knowledge embedded in the large-scale vision models that we are studying.
arXiv Detail & Related papers (2024-03-18T17:59:47Z) - Adaptive Assignment for Geometry Aware Local Feature Matching [22.818457285745733]
detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.
We introduce AdaMatcher, which accomplishes the feature correlation and co-visible area estimation through an elaborate feature interaction module.
AdaMatcher then performs adaptive assignment on patch-level matching while estimating the scales between images, and finally refines the co-visible matches through scale alignment and sub-pixel regression module.
arXiv Detail & Related papers (2022-07-18T08:22:18Z) - Multiway Non-rigid Point Cloud Registration via Learned Functional Map
Synchronization [105.14877281665011]
We present SyNoRiM, a novel way to register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds.
We demonstrate via extensive experiments that our method achieves a state-of-the-art performance in registration accuracy.
arXiv Detail & Related papers (2021-11-25T02:37:59Z) - Temporally-Consistent Surface Reconstruction using Metrically-Consistent
Atlases [131.50372468579067]
We propose a method for unsupervised reconstruction of a temporally-consistent sequence of surfaces from a sequence of time-evolving point clouds.
We represent the reconstructed surfaces as atlases computed by a neural network, which enables us to establish correspondences between frames.
Our approach outperforms state-of-the-art ones on several challenging datasets.
arXiv Detail & Related papers (2021-11-12T17:48:25Z) - Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.