Shared Coupling-bridge for Weakly Supervised Local Feature Learning
- URL: http://arxiv.org/abs/2212.07047v1
- Date: Wed, 14 Dec 2022 05:47:52 GMT
- Title: Shared Coupling-bridge for Weakly Supervised Local Feature Learning
- Authors: Jiayuan Sun, Jiewen Zhu, Luping Ji
- Abstract summary: This paper focuses on promoting the currently popular sparse local feature learning with camera pose supervision.
It proposes a Shared Coupling-bridge scheme with four light-weight yet effective improvements for weakly-supervised local feature learning.
It often achieves state-of-the-art performance on classic image matching and visual localization.
- Score: 0.7366405857677226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse local feature extraction is widely regarded as important
in typical vision tasks such as simultaneous localization and mapping, image
matching and 3D reconstruction. It still has several deficiencies, mainly in
the discriminative power of extracted local descriptors, the localization
accuracy of detected keypoints, and the efficiency of local feature learning.
This paper focuses on improving the currently popular sparse local feature
learning with camera pose supervision. To this end, it proposes a Shared
Coupling-bridge scheme with four light-weight yet effective improvements for
weakly-supervised local feature (SCFeat) learning: i) the
Feature-Fusion-ResUNet Backbone (F2R-Backbone) for local descriptor learning,
ii) a shared coupling-bridge normalization to improve the decoupled training
of the description network and the detection network, iii) an improved
detection network with a peakiness measurement to detect keypoints, and iv) the
fundamental matrix error as a reward factor to further optimize feature
detection training. Extensive experiments show that the SCFeat improvements
are effective. SCFeat often achieves state-of-the-art performance on classic
image matching and visual localization, and it remains competitive on 3D
reconstruction. The source code is available at
https://github.com/sunjiayuanro/SCFeat.git.
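As a rough illustration of item iv), the sketch below scores a candidate match by its error under a fundamental matrix and turns that error into a reward. The abstract does not say which fundamental-matrix error SCFeat uses or how the reward is shaped, so the Sampson distance, the binary reward, and every name here (`sampson_error`, `epipolar_reward`, `threshold`, `F_translation`) are assumptions made for illustration, not the authors' implementation.

```python
# Sketch only: the Sampson distance is one common epipolar error. The abstract
# does not specify the error or the reward shape used by SCFeat.

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a length-3 vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def sampson_error(F, p1, p2):
    """Sampson distance of the match p1 <-> p2 under x2^T F x1 = 0.

    p1 and p2 are (x, y) pixel coordinates in image 1 and image 2.
    """
    x1 = [p1[0], p1[1], 1.0]
    x2 = [p2[0], p2[1], 1.0]
    Fx1 = mat_vec(F, x1)               # epipolar line of p1 in image 2
    Ftx2 = mat_vec(transpose(F), x2)   # epipolar line of p2 in image 1
    residual = sum(x2[i] * Fx1[i] for i in range(3))
    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return residual ** 2 / denom if denom > 0 else float("inf")


def epipolar_reward(F, p1, p2, threshold=1.0):
    """Hypothetical binary reward: 1 if the match agrees with F, else 0."""
    return 1.0 if sampson_error(F, p1, p2) <= threshold else 0.0


# Toy geometry: identical cameras related by a pure horizontal translation,
# so F is the cross-product matrix of t = (1, 0, 0) and matches must share a row.
F_translation = [[0.0, 0.0, 0.0],
                 [0.0, 0.0, -1.0],
                 [0.0, 1.0, 0.0]]

good = epipolar_reward(F_translation, (10.0, 5.0), (20.0, 5.0))  # same row
bad = epipolar_reward(F_translation, (10.0, 5.0), (20.0, 9.0))   # 4 rows off
```

In this toy setup a match on the same image row has zero error and is rewarded, while a match four rows off is not. Under camera pose supervision, the fundamental matrix can be derived from the known relative pose and intrinsics, so no ground-truth correspondences are needed to compute such a score.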
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment [53.72721476803585]
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks.
We propose a top-down approach that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions.
A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower level features.
arXiv Detail & Related papers (2023-08-06T09:08:37Z)
- DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z)
- Improving Transformer-based Image Matching by Cascaded Capturing Spatially Informative Keypoints [44.90917854990362]
We propose a transformer-based cascade matching model, the Cascade feature Matching TRansformer (CasMTR).
We use a simple yet effective Non-Maximum Suppression (NMS) post-process to filter keypoints through the confidence map.
CasMTR achieves state-of-the-art performance in indoor and outdoor pose estimation as well as visual localization.
arXiv Detail & Related papers (2023-03-06T04:32:34Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization [34.66510265193038]
We propose an end-to-end framework that jointly learns keypoint detection, descriptor representation and cross-frame matching.
We design a self-supervised image warping correspondence loss for both feature detection and matching.
We also propose a new loss to robustly handle both definite inlier/outlier matches and less-certain matches.
arXiv Detail & Related papers (2022-12-08T21:43:56Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Guide Local Feature Matching by Overlap Estimation [9.387323456222823]
We introduce a novel Overlap Estimation method conditioned on image pairs with TRansformer, named OETR.
OETR performs overlap estimation in a two-step process of feature correlation and then overlap regression.
Experiments show that OETR can boost state-of-the-art local feature matching performance substantially.
arXiv Detail & Related papers (2022-02-18T07:11:36Z)
- LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition [31.105598103211825]
We show that an additional training signal (local consistency loss) can guide the network to learning local features which are consistent across revisits.
We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
arXiv Detail & Related papers (2021-09-17T03:32:43Z)
- PGL: Prior-Guided Local Self-supervised Learning for 3D Medical Image Segmentation [87.50205728818601]
We propose a Prior-Guided Local (PGL) self-supervised model that learns the region-wise local consistency in the latent feature space.
Our PGL model learns the distinctive representations of local regions, and hence is able to retain structural information.
arXiv Detail & Related papers (2020-11-25T11:03:11Z)
- ASLFeat: Learning Local Features of Accurate Shape and Localization [42.70030492742363]
We present ASLFeat, with three light-weight yet effective modifications to mitigate the above issues.
First, we resort to deformable convolutional networks to densely estimate and apply local transformation.
Second, we take advantage of the inherent feature hierarchy to restore spatial resolution and low-level details for accurate keypoint localization.
arXiv Detail & Related papers (2020-03-23T04:03:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.