Semi-supervised Dense Keypoints Using Unlabeled Multiview Images
- URL: http://arxiv.org/abs/2109.09299v2
- Date: Tue, 20 Feb 2024 01:17:26 GMT
- Title: Semi-supervised Dense Keypoints Using Unlabeled Multiview Images
- Authors: Zhixuan Yu, Haozheng Yu, Long Sha, Sujoy Ganguly, Hyun Soo Park
- Abstract summary: This paper presents a new end-to-end semi-supervised framework to learn a dense keypoint detector using unlabeled multiview images.
A key challenge lies in finding the exact correspondences between the dense keypoints in multiple views.
We derive a new probabilistic epipolar constraint that encodes two desired properties: soft correspondence and geometric consistency.
- Score: 22.449168666514677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a new end-to-end semi-supervised framework to learn a
dense keypoint detector using unlabeled multiview images. A key challenge lies
in finding the exact correspondences between the dense keypoints in multiple
views since the inverse of the keypoint mapping can be neither analytically
derived nor differentiated. This limits applying existing multiview supervision
approaches used to learn sparse keypoints that rely on the exact
correspondences. To address this challenge, we derive a new probabilistic
epipolar constraint that encodes two desired properties. (1) Soft
correspondence: we define a matchability, which measures the likelihood of a
point matching the corresponding point in the other image, thus relaxing the
requirement of exact correspondences. (2) Geometric consistency: every
point in the continuous correspondence fields must satisfy the multiview
consistency collectively. We formulate a probabilistic epipolar constraint
using a weighted average of epipolar errors through the matchability, thereby
generalizing the point-to-point geometric error to the field-to-field geometric
error. This generalization facilitates learning a geometrically coherent dense
keypoint detection model by utilizing a large number of unlabeled multiview
images. Additionally, to prevent degenerate cases, we employ a
distillation-based regularization by using a pretrained model. Finally, we
design a new neural network architecture, made of twin networks, that
effectively minimizes the probabilistic epipolar errors of all possible
correspondences between the images of two views by building affinity matrices. Our
method shows superior performance compared to existing methods, including
non-differentiable bootstrapping, in terms of keypoint accuracy, multiview
consistency, and 3D reconstruction accuracy.
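As a rough illustration of the probabilistic epipolar constraint described in the abstract, the NumPy sketch below computes a matchability-weighted average of point-to-epipolar-line distances between two views. It is not the authors' implementation. The function names, the softmax normalization used to turn affinity scores into matchability weights, and the assumption of a known fundamental matrix `F` are all choices made for this example.

```python
import numpy as np


def epipolar_distance(F, x1, x2):
    """Distance from every x2[j] to the epipolar line of every x1[i].

    F  : (3, 3) fundamental matrix mapping view-1 points to view-2 lines
         (assumed known from camera calibration).
    x1 : (N, 2) pixel coordinates in view 1.
    x2 : (M, 2) pixel coordinates in view 2.
    Returns an (N, M) array of point-to-line distances.
    """
    x1h = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous, (N, 3)
    x2h = np.hstack([x2, np.ones((len(x2), 1))])  # homogeneous, (M, 3)
    lines = x1h @ F.T                             # row i is F @ x1h[i]
    num = np.abs(lines @ x2h.T)                   # |x2h[j] . line_i|, (N, M)
    den = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    return num / np.maximum(den, 1e-12)


def soft_epipolar_error(F, x1, x2, affinity):
    """Matchability-weighted average of epipolar errors.

    affinity : (N, M) hypothetical similarity scores between points of the
               two views. A row-wise softmax (an assumption of this sketch)
               converts them into matchability weights that sum to 1.
    """
    shifted = affinity - affinity.max(axis=1, keepdims=True)  # stable softmax
    weights = np.exp(shifted)
    weights /= weights.sum(axis=1, keepdims=True)
    distances = epipolar_distance(F, x1, x2)
    # Weighted average over candidates, then mean over view-1 points.
    return float((weights * distances).sum(axis=1).mean())


# Toy rectified stereo pair: epipolar lines are horizontal, so the
# distance between two points reduces to |y1 - y2|.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
x1 = np.array([[10.0, 5.0], [20.0, 8.0]])
x2 = np.array([[3.0, 5.0], [7.0, 8.0]])

confident = soft_epipolar_error(F, x1, x2, 50.0 * np.eye(2))   # correct matches favored
uniform = soft_epipolar_error(F, x1, x2, np.zeros((2, 2)))     # no preference
```

With affinities that strongly favor the geometrically correct matches, the error is close to zero. With uniform affinities, every candidate contributes equally and the error rises. Because the error is a smooth function of the affinities, it could in principle serve as a training signal without hard correspondences, which is the idea behind the soft correspondence described in the abstract.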
Related papers
- Probabilistically Aligned View-unaligned Clustering with Adaptive Template Selection [32.10307592690486]
Cross-view correspondence (CVC) between instances of the same target from different views is a crucial prerequisite for effortlessly deriving a consistent representation.
We propose to integrate the permutation derivation procedure into the bipartite graph paradigm for view-unaligned clustering.
Specifically, we learn consistent anchors and view-specific graphs by the bipartite graph, and derive permutations applied to the unaligned graphs.
arXiv Detail & Related papers (2024-09-23T10:30:09Z) - Disentangled Representation Learning with the Gromov-Monge Gap [65.73194652234848]
Learning disentangled representations from unlabelled data is a fundamental challenge in machine learning.
We introduce a novel approach to disentangled representation learning based on quadratic optimal transport.
We demonstrate the effectiveness of our approach for quantifying disentanglement across four standard benchmarks.
arXiv Detail & Related papers (2024-07-10T16:51:32Z) - Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention-based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
arXiv Detail & Related papers (2023-07-04T02:50:44Z) - A Probabilistic Relaxation of the Two-Stage Object Pose Estimation Paradigm [0.0]
We propose a matching-free probabilistic formulation for object pose estimation.
It enables unified and concurrent optimization of both visual correspondence and geometric alignment.
It can represent different plausible modes of the entire distribution of likely poses.
arXiv Detail & Related papers (2023-06-01T16:50:40Z) - Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold [3.6052935394000234]
The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that, with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z) - Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences [118.6018141306409]
We propose Probabilistic Warp Consistency, a weakly-supervised learning objective for semantic matching.
We first construct an image triplet by applying a known warp to one of the images in a pair depicting different instances of the same object class.
Our objective also brings substantial improvements in the strongly-supervised regime, when combined with keypoint annotations.
arXiv Detail & Related papers (2022-03-08T18:55:11Z) - Warp Consistency for Unsupervised Learning of Dense Correspondences [116.56251250853488]
The key challenge in learning dense correspondences is the lack of ground-truth matches for real image pairs.
We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression.
Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS.
arXiv Detail & Related papers (2021-04-07T17:58:22Z) - Isometric Multi-Shape Matching [50.86135294068138]
Finding correspondences between shapes is a fundamental problem in computer vision and graphics.
While isometries are often studied in shape correspondence problems, they have not been considered explicitly in the multi-matching setting.
We present a suitable optimisation algorithm for solving our formulation and provide a convergence and complexity analysis.
arXiv Detail & Related papers (2020-12-04T15:58:34Z) - Self-Calibration Supported Robust Projective Structure-from-Motion [80.15392629310507]
We propose a unified Structure-from-Motion (SfM) method, in which the matching process is supported by self-calibration constraints.
We show experimental results demonstrating robust multiview matching and accurate camera calibration by exploiting these constraints.
arXiv Detail & Related papers (2020-07-04T08:47:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.