iMatching: Imperative Correspondence Learning
- URL: http://arxiv.org/abs/2312.02141v1
- Date: Mon, 4 Dec 2023 18:58:20 GMT
- Title: iMatching: Imperative Correspondence Learning
- Authors: Zitong Zhan, Dasong Gao, Yun-Jou Lin, Youjie Xia, Chen Wang
- Abstract summary: We introduce a new self-supervised scheme, imperative learning (IL), for training feature correspondence.
It enables correspondence learning on arbitrary uninterrupted videos without any camera pose or depth labels.
We demonstrate superior performance on tasks including feature matching and pose estimation.
- Score: 5.974164730742711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning feature correspondence is a foundational task in computer vision,
holding immense importance for downstream applications such as visual odometry
and 3D reconstruction. Despite recent progress in data-driven models, feature
correspondence learning is still limited by the lack of accurate per-pixel
correspondence labels. To overcome this difficulty, we introduce a new
self-supervised scheme, imperative learning (IL), for training feature
correspondence. It enables correspondence learning on arbitrary uninterrupted
videos without any camera pose or depth labels, heralding a new era for
self-supervised correspondence learning. Specifically, we formulate the
problem of correspondence learning as a bilevel optimization, which takes the
reprojection error from bundle adjustment as a supervisory signal for the
model. To avoid large memory and computation overhead, we leverage the
stationary point to effectively back-propagate the implicit gradients through
bundle adjustment. Through extensive experiments, we demonstrate superior
performance on tasks including feature matching and pose estimation, in which
we obtain an average 30% accuracy gain over the state-of-the-art matching
models.
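The bilevel formulation above can be illustrated with a toy sketch. This is an assumption-laden analogue, not the authors' implementation: the quadratic inner problem stands in for bundle adjustment, the outer loss for the reprojection-error supervision, and `theta` for the matching network's parameters. The point of the sketch is the stationary-point trick: at the inner optimum, the implicit function theorem gives the gradient of the outer loss with respect to `theta` without unrolling the inner solver.

```python
import numpy as np

lam = 0.5                        # damping weight in the inner objective (hypothetical)
target = np.array([1.0, -2.0])   # stand-in for the reprojection-error target

def inner_objective_grad(x, theta):
    """Gradient of the inner objective g(x, theta) = 0.5||x - theta||^2 + 0.5*lam*||x||^2."""
    return (x - theta) + lam * x

def solve_inner(theta, steps=200, lr=0.5):
    """Solve the inner problem by gradient descent (stands in for a BA solver)."""
    x = np.zeros_like(theta)
    for _ in range(steps):
        x -= lr * inner_objective_grad(x, theta)
    return x

def outer_loss(x):
    return 0.5 * np.sum((x - target) ** 2)

def implicit_grad(theta):
    """Back-propagate through the inner solve via the implicit function theorem.

    At the stationary point, F(x*, theta) = 0 with F = inner_objective_grad, so
    dx*/dtheta = -(dF/dx)^{-1} dF/dtheta. Here dF/dx = (1 + lam) * I and
    dF/dtheta = -I, both known in closed form, so no solver unrolling is needed.
    """
    x_star = solve_inner(theta)
    dL_dx = x_star - target
    dx_dtheta = 1.0 / (1.0 + lam)   # scalar multiple of the identity in this toy case
    return dL_dx * dx_dtheta

theta = np.array([0.3, 0.7])
g_implicit = implicit_grad(theta)

# Sanity check against a finite-difference gradient that re-runs the inner solver.
eps = 1e-5
g_fd = np.zeros_like(theta)
for i in range(len(theta)):
    tp, tm = theta.copy(), theta.copy()
    tp[i] += eps
    tm[i] -= eps
    g_fd[i] = (outer_loss(solve_inner(tp)) - outer_loss(solve_inner(tm))) / (2 * eps)

print(np.allclose(g_implicit, g_fd, atol=1e-4))  # True
```

The memory saving the abstract alludes to comes from exactly this pattern: the implicit gradient needs only the converged inner solution, not the intermediate solver iterates.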
Related papers
- Gradient Boosting Mapping for Dimensionality Reduction and Feature Extraction [2.778647101651566]
A fundamental problem in supervised learning is to find a good set of features or distance measures.
We propose a supervised dimensionality reduction method, where the outputs of weak learners define the embedding.
We show that the embedding coordinates provide better features for the supervised learning task.
arXiv Detail & Related papers (2024-05-14T10:23:57Z) - Learning Cross-view Visual Geo-localization without Ground Truth [48.51859322439286]
Cross-View Geo-Localization (CVGL) involves determining the geographical location of a query image by matching it with a corresponding GPS-tagged reference image.
Current state-of-the-art methods rely on training models with labeled paired images, incurring substantial annotation costs and training burdens.
We investigate the adaptation of frozen models for CVGL without requiring ground truth pair labels.
arXiv Detail & Related papers (2024-03-19T13:01:57Z) - Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps [39.00415825387414]
We propose a new approach for semantic correspondence estimation that supplements discriminative features with 3D understanding via a weak geometric spherical prior.
Compared to more involved 3D pipelines, our model only requires weak viewpoint information, and the simplicity of our spherical representation enables us to inject informative geometric priors into the model during training.
We present results on the challenging SPair-71k dataset, where our approach demonstrates the capability to distinguish between symmetric views and repeated parts across many object categories.
arXiv Detail & Related papers (2023-12-20T17:35:24Z) - Match me if you can: Semantic Correspondence Learning with Unpaired Images [82.05105090432025]
We propose a simple yet effective method that performs training with unlabeled pairs to complement both limited image pairs and sparse point pairs.
Using a simple teacher-student framework, we offer reliable pseudo correspondences to the student network via machine supervision.
Our models outperform the milestone baselines, including state-of-the-art methods on semantic correspondence benchmarks.
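The teacher-student scheme described above follows a generic pseudo-labelling pattern, sketched below under stated assumptions: the teacher interface and confidence threshold are hypothetical, and the paper's actual "machine supervision" differs in detail. The idea is simply that confident teacher matches on unlabeled pairs become pseudo ground truth for the student.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher_predict(n_points=100):
    """Stand-in for a frozen teacher matcher: candidate matches plus confidences.
    (Hypothetical interface, for illustration only.)"""
    matches = rng.uniform(0, 256, size=(n_points, 4))   # rows of (x1, y1, x2, y2)
    confidence = rng.uniform(0, 1, size=n_points)
    return matches, confidence

def select_pseudo_labels(matches, confidence, tau=0.7):
    """Keep only high-confidence teacher matches as pseudo ground truth."""
    return matches[confidence >= tau]

def student_loss(student_targets, pseudo_labels):
    """Supervise the student's predicted target points with the teacher's."""
    return float(np.mean((student_targets - pseudo_labels[:, 2:]) ** 2))

matches, conf = teacher_predict()
pseudo = select_pseudo_labels(matches, conf)
# A student that reproduces the teacher on the kept points incurs zero loss.
loss = student_loss(pseudo[:, 2:], pseudo)
print(len(pseudo), loss)
```

The threshold `tau` trades pseudo-label coverage against noise; real methods typically add geometric or cycle-consistency filtering on top of raw confidence.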
arXiv Detail & Related papers (2023-11-30T13:22:15Z) - Q-REG: End-to-End Trainable Point Cloud Registration with Surface Curvature [81.25511385257344]
We present a novel solution, Q-REG, which utilizes rich geometric information to estimate the rigid pose from a single correspondence.
Q-REG allows us to formalize the robust estimation as an exhaustive search, thereby enabling end-to-end training.
We demonstrate in the experiments that Q-REG is agnostic to the correspondence matching method and provides consistent improvement both when used only in inference and in end-to-end training.
arXiv Detail & Related papers (2023-09-27T20:58:53Z) - S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning [70.72037296392642]
We propose a novel semi-supervised framework that allows us to learn contact from monocular images.
Specifically, we leverage visual and geometric consistency constraints in large-scale datasets for generating pseudo-labels.
We show benefits from using a contact map that governs hand-object interactions to produce more accurate reconstructions.
arXiv Detail & Related papers (2022-08-01T14:05:23Z) - Self-Supervised 3D Hand Pose Estimation from monocular RGB via Contrastive Learning [50.007445752513625]
We propose a new self-supervised method for the structured regression task of 3D hand pose estimation.
We experimentally investigate the impact of invariant and equivariant contrastive objectives.
We show that a standard ResNet-152, trained on additional unlabeled data, attains an improvement of 7.6% in PA-EPE on FreiHAND.
arXiv Detail & Related papers (2021-06-10T17:48:57Z) - Warp Consistency for Unsupervised Learning of Dense Correspondences [116.56251250853488]
A key challenge in learning dense correspondences is the lack of ground-truth matches for real image pairs.
We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression.
Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS.
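The warp-consistency objective can be illustrated with a toy version of its bipath constraint. This is my reading of the idea, heavily simplified: a known synthetic warp `W` is applied to the second image, and the predicted direct flow must agree with the predicted real-pair flow composed with `W`. Restricting the sketch to pure translations makes flow composition exact addition, avoiding grid sampling; the real method composes dense flows.

```python
import numpy as np

def bipath_loss(flow_direct, flow_real_pair, synthetic_warp):
    """L2 penalty between the direct flow and the composed (bipath) flow.

    Flows are H x W x 2 displacement fields. For translation-only warps,
    composing flows reduces to elementwise addition (an approximation
    chosen to keep this sketch exact and short).
    """
    composed = flow_real_pair + synthetic_warp
    return float(np.mean((flow_direct - composed) ** 2))

w = np.full((4, 4, 2), [1.0, -0.5])       # known synthetic warp W (translation)
f_real = np.full((4, 4, 2), [0.25, 0.75]) # predicted flow for the real pair
f_good = f_real + w                       # consistent direct prediction
f_bad = f_good + 0.5                      # inconsistent direct prediction

print(bipath_loss(f_good, f_real, w))     # 0.0
print(bipath_loss(f_bad, f_real, w))      # 0.25
```

Because `W` is sampled and therefore known exactly, this objective supervises real image pairs without any ground-truth matches, which is the point made in the summary above.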
arXiv Detail & Related papers (2021-04-07T17:58:22Z) - Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.