DDM-NET: End-to-end learning of keypoint feature Detection, Description
and Matching for 3D localization
- URL: http://arxiv.org/abs/2212.04575v1
- Date: Thu, 8 Dec 2022 21:43:56 GMT
- Title: DDM-NET: End-to-end learning of keypoint feature Detection, Description
and Matching for 3D localization
- Authors: Xiangyu Xu, Li Guan, Enrique Dunn, Haoxiang Li, Guang Hua
- Abstract summary: We propose an end-to-end framework that jointly learns keypoint detection, descriptor representation and cross-frame matching.
We design a self-supervised image warping correspondence loss for both feature detection and matching.
We also propose a new loss to robustly handle both definite inlier/outlier matches and less-certain matches.
- Score: 34.66510265193038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose an end-to-end framework that jointly learns
keypoint detection, descriptor representation and cross-frame matching for the
task of image-based 3D localization. Prior art has tackled each of these
components individually, purportedly aiming to alleviate difficulties in
effectively training a holistic network. We design a self-supervised image warping
correspondence loss for both feature detection and matching, a
weakly-supervised epipolar constraints loss on relative camera pose learning,
and a directional matching scheme that detects keypoint features in a source
image and performs coarse-to-fine correspondence search on the target image. We
leverage this framework to enforce cycle consistency in our matching module. In
addition, we propose a new loss to robustly handle both definite inlier/outlier
matches and less-certain matches. The integration of these learning mechanisms
enables end-to-end training of a single network performing all three
localization components. Benchmarking our approach on public datasets shows that
such an end-to-end framework yields more accurate localization, outperforming
both traditional methods and state-of-the-art weakly supervised methods.
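To make the two supervision signals above concrete, the sketch below illustrates (in plain NumPy, not the authors' implementation; all function names are hypothetical) how a relative camera pose can weakly supervise matching via the epipolar constraint, scored with the standard Sampson distance, and how a mutual-nearest-neighbour test enforces the kind of cycle consistency the matching module relies on.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(K, R, t):
    """F = K^{-T} [t]_x R K^{-1}: two-view epipolar geometry built from a
    relative pose (R, t) and intrinsics K -- the weak supervision signal."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv

def sampson_epipolar_loss(pts1, pts2, F):
    """Mean Sampson distance of putative matches; pts1/pts2 are (N, 2) pixel
    coordinates in the source and target image. Zero for perfect matches."""
    x1 = np.hstack([pts1, np.ones((len(pts1), 1))])
    x2 = np.hstack([pts2, np.ones((len(pts2), 1))])
    Fx1 = x1 @ F.T        # rows are F @ x1_i
    Ftx2 = x2 @ F         # rows are F.T @ x2_i
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return float(np.mean(num / (den + 1e-12)))

def cycle_consistency_mask(nn_12, nn_21):
    """Mutual nearest-neighbour test: match i -> nn_12[i] survives only if
    matching back from the target returns to i (nn_21[nn_12[i]] == i)."""
    return nn_21[nn_12] == np.arange(len(nn_12))
```

With exact correspondences projected from a common 3D scene, `sampson_epipolar_loss` is numerically zero under the fundamental matrix built from the true relative pose, and grows with pixel error, so it can drive both detection and matching without ground-truth correspondence labels.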
Related papers
- PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant
Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances the robustness, with gains of 15.3% mIOU, compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z)
- Self-Supervised Image-to-Point Distillation via Semantically Tolerant
Contrastive Loss [18.485918870427327]
We propose a novel semantically tolerant image-to-point contrastive loss that takes into consideration the semantic distance between positive and negative image regions.
Our method consistently outperforms state-of-the-art 2D-to-3D representation learning frameworks across a wide range of 2D self-supervised pretrained models.
arXiv Detail & Related papers (2023-01-12T19:58:54Z)
- Shared Coupling-bridge for Weakly Supervised Local Feature Learning [0.7366405857677226]
This paper focuses on promoting the currently popular sparse local feature learning with camera pose supervision.
It proposes a Shared Coupling-bridge scheme with four light-weight yet effective improvements for weakly-supervised local feature learning.
It often achieves state-of-the-art performance on classic image matching and visual localization.
arXiv Detail & Related papers (2022-12-14T05:47:52Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D
Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Sim2Real Object-Centric Keypoint Detection and Description [40.58367357980036]
Keypoint detection and description play a central role in computer vision.
We propose the object-centric formulation, which requires further identifying which object each interest point belongs to.
We develop a sim2real contrastive learning mechanism that can generalize the model trained in simulation to real-world applications.
arXiv Detail & Related papers (2022-02-01T15:00:20Z)
- P2-Net: Joint Description and Detection of Local Features for Pixel and
Point Matching [78.18641868402901]
This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds.
An ultra-wide reception mechanism in combination with a novel loss function are designed to mitigate the intrinsic information variations between pixel and point local regions.
arXiv Detail & Related papers (2021-03-01T14:59:40Z)
- Patch2Pix: Epipolar-Guided Pixel-Level Correspondences [38.38520763114715]
We present Patch2Pix, a novel refinement network that refines match proposals by regressing pixel-level matches from the local regions defined by those proposals.
We show that our refinement network significantly improves the performance of correspondence networks on image matching, homography estimation, and localization tasks.
arXiv Detail & Related papers (2020-12-03T13:44:02Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
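The key insight of this entry can be sketched in a few lines (a minimal NumPy illustration under the assumption that poses are 4x4 SE(3) matrices; function names are hypothetical, not from the paper): localizing the same query through different reference images should produce identical absolute poses, so the disagreement between those estimates is itself a training signal.

```python
import numpy as np

def pose_discrepancy(T_a, T_b):
    """Rotation angle (radians) and translation distance between two 4x4 SE(3) poses."""
    dT = np.linalg.inv(T_a) @ T_b
    cos_angle = (np.trace(dT[:3, :3]) - 1.0) / 2.0
    rot_err = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    trans_err = np.linalg.norm(dT[:3, 3])
    return rot_err, trans_err

def transform_consistency_loss(T_map_refs, T_ref_queries):
    """Each reference i yields an absolute query pose estimate
    T_map_query_i = T_map_ref_i @ T_ref_query_i; the loss is the mean
    pairwise disagreement between these estimates (zero when consistent)."""
    estimates = [T_mr @ T_rq for T_mr, T_rq in zip(T_map_refs, T_ref_queries)]
    loss, n = 0.0, 0
    for i in range(len(estimates)):
        for j in range(i + 1, len(estimates)):
            rot_err, trans_err = pose_discrepancy(estimates[i], estimates[j])
            loss += rot_err + trans_err
            n += 1
    return loss / max(n, 1)
```

Because the loss only compares the network's own pose estimates against each other, no ground-truth correspondences or absolute poses are needed, which is what makes the approach self-supervised.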
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
- EPNet: Enhancing Point Features with Image Semantics for 3D Object
Detection [60.097873683615695]
We address two critical issues in the 3D detection task, including the exploitation of multiple sensors.
We propose a novel fusion module to enhance the point features with semantic image features in a point-wise manner without any image annotations.
We design an end-to-end learnable framework named EPNet to integrate these two components.
arXiv Detail & Related papers (2020-07-17T09:33:05Z)
- High-Order Information Matters: Learning Relation and Topology for
Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.