ResMatch: Residual Attention Learning for Local Feature Matching
- URL: http://arxiv.org/abs/2307.05180v1
- Date: Tue, 11 Jul 2023 11:32:12 GMT
- Title: ResMatch: Residual Attention Learning for Local Feature Matching
- Authors: Yuxin Deng and Jiayi Ma
- Abstract summary: We rethink cross- and self-attention from the viewpoint of traditional feature matching and filtering.
We inject the similarity of descriptors and relative positions into the cross- and self-attention scores, respectively.
We mine intra- and inter-neighbors according to the similarity of descriptors and relative positions.
- Score: 51.07496081296863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attention-based graph neural networks have made great progress in
feature matching learning. However, the literature offers little insight into
how the attention mechanism works for feature matching. In this paper, we
rethink cross- and self-attention from the viewpoint of traditional feature
matching and filtering. To facilitate the learning of matching and filtering,
we inject the similarity of descriptors and of relative positions into the
cross- and self-attention scores, respectively. In this way, attention can
focus on learning residual matching and filtering functions with reference to
the basic functions that measure visual and spatial correlation. Moreover, we
mine intra- and inter-neighbors according to the similarity of descriptors and
relative positions. Sparse attention for each point can then be performed only
within its neighborhoods, yielding higher computational efficiency. Feature
matching networks equipped with our full and sparse residual attention
learning strategies are termed ResMatch and sResMatch, respectively. Extensive
experiments, including feature matching, pose estimation and visual
localization, confirm the superiority of our networks.
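The score-injection idea in the abstract can be sketched in code. The following is a minimal NumPy illustration, not the authors' implementation: the function names, the simple additive bias, and the top-k neighbor mining are assumptions, and ResMatch's exact score formulation and neighborhood definition may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; -inf entries become exactly 0.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def residual_cross_attention(q, k, v, desc_a, desc_b):
    """Cross-attention whose score is biased by raw descriptor similarity.

    q, k, v: learned projections, shapes (n_a, d), (n_b, d), (n_b, d).
    desc_a, desc_b: L2-normalized local descriptors, (n_a, c) and (n_b, c).
    The injected similarity acts as the basic matching function, so the
    learned q-k term only has to model a residual on top of it.
    """
    d = q.shape[-1]
    learned = q @ k.T / np.sqrt(d)   # standard scaled dot-product score
    prior = desc_a @ desc_b.T        # visual-correlation prior, injected additively
    return softmax(learned + prior, axis=-1) @ v

def knn_sparse_mask(prior, num_nb):
    """Additive mask keeping only each point's top-num_nb neighbors by similarity.

    Mined neighbors get bias 0; all other entries get -inf, so after softmax
    attention is restricted to the neighborhood (the sparse, sResMatch-style
    variant described in the abstract).
    """
    mask = np.full_like(prior, -np.inf)
    nb_idx = np.argsort(-prior, axis=-1)[:, :num_nb]
    np.put_along_axis(mask, nb_idx, 0.0, axis=-1)
    return mask
```

For self-attention, the same scheme would inject a relative-position similarity instead of a descriptor similarity; how the paper weights or normalizes the injected term is not specified in the abstract.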
Related papers
- Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural
Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention-based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
arXiv Detail & Related papers (2023-07-04T02:50:44Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of
Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Kinship Verification Based on Cross-Generation Feature Interaction
Learning [53.62256887837659]
Kinship verification from facial images has been recognized as an emerging yet challenging technique in computer vision applications.
We propose a novel cross-generation feature interaction learning (CFIL) framework for robust kinship verification.
arXiv Detail & Related papers (2021-09-07T01:50:50Z)
- Learning to Match Features with Seeded Graph Matching Network [35.70116378238535]
We propose Seeded Graph Matching Network, a graph neural network with sparse structure to reduce redundant connectivity and learn compact representation.
Experiments show that our method reduces computational and memory complexity significantly compared with typical attention-based networks.
arXiv Detail & Related papers (2021-08-19T16:25:23Z)
- Toward Understanding the Feature Learning Process of Self-supervised
Contrastive Learning [43.504548777955854]
We study how contrastive learning learns the feature representations for neural networks by analyzing its feature learning process.
We prove that contrastive learning using ReLU networks learns the desired sparse features if proper augmentations are adopted.
arXiv Detail & Related papers (2021-05-31T16:42:09Z)
- Variational Structured Attention Networks for Deep Visual Representation
Learning [49.80498066480928]
We propose a unified deep framework to jointly learn both spatial attention maps and channel attention in a principled manner.
Specifically, we integrate the estimation and the interaction of the attentions within a probabilistic representation learning framework.
We implement the inference rules within the neural network, thus allowing for end-to-end learning of the probabilistic and the CNN front-end parameters.
arXiv Detail & Related papers (2021-03-05T07:37:24Z)
- One Point is All You Need: Directional Attention Point for Feature
Learning [51.44837108615402]
We present a novel attention-based mechanism for learning enhanced point features for tasks such as point cloud classification and segmentation.
We show that our attention mechanism can be easily incorporated into state-of-the-art point cloud classification and segmentation networks.
arXiv Detail & Related papers (2020-12-11T11:45:39Z)
- Attention improves concentration when learning node embeddings [1.2233362977312945]
Given nodes labelled with search query text, we want to predict links to related queries that share products.
Experiments with a range of deep neural architectures show that simple feedforward networks with an attention mechanism perform best for learning embeddings.
We propose an analytically tractable model of query generation, AttEST, that views both products and the query text as vectors embedded in a latent space.
arXiv Detail & Related papers (2020-06-11T21:21:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.