TransforMatcher: Match-to-Match Attention for Semantic Correspondence
- URL: http://arxiv.org/abs/2205.11634v1
- Date: Mon, 23 May 2022 21:02:01 GMT
- Title: TransforMatcher: Match-to-Match Attention for Semantic Correspondence
- Authors: Seungwook Kim, Juhong Min, Minsu Cho
- Abstract summary: We introduce a strong semantic image matching learner, dubbed TransforMatcher, which builds on the success of transformer networks in vision domains.
Unlike existing convolution- or attention-based schemes for correspondence, TransforMatcher performs global match-to-match attention for precise match localization and dynamic refinement.
In experiments, TransforMatcher sets a new state of the art on SPair-71k while performing on par with existing SOTA methods on the PF-PASCAL dataset.
- Score: 48.25709192748133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Establishing correspondences between images remains a challenging task,
especially under large appearance changes due to different viewpoints or
intra-class variations. In this work, we introduce a strong semantic image
matching learner, dubbed TransforMatcher, which builds on the success of
transformer networks in vision domains. Unlike existing convolution- or
attention-based schemes for correspondence, TransforMatcher performs global
match-to-match attention for precise match localization and dynamic refinement.
To handle a large number of matches in a dense correlation map, we develop a
light-weight attention architecture to consider the global match-to-match
interactions. We also propose to utilize a multi-channel correlation map for
refinement, treating the multi-level scores as features instead of a single
score to fully exploit the richer layer-wise semantics. In experiments,
TransforMatcher sets a new state of the art on SPair-71k while performing on
par with existing SOTA methods on the PF-PASCAL dataset.
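The two ideas above that benefit most from a concrete rendering are (i) treating every entry of the dense correlation map as a token whose features are its multi-level correlation scores, and (ii) running attention between matches rather than between image features. Below is a minimal PyTorch sketch of that flow; the embedding size, head count, and the use of naive full self-attention are illustrative assumptions, not the authors' light-weight architecture.

```python
# Hypothetical sketch of match-to-match attention over a multi-channel
# correlation map; NOT the paper's implementation. Naive self-attention
# over all matches is quadratic in Hs*Ws*Ht*Wt, so this only runs on
# tiny feature maps; the paper's light-weight attention is what scales.
import torch
import torch.nn as nn
import torch.nn.functional as F

def multi_level_correlation(src_feats, trg_feats):
    """src_feats, trg_feats: lists of L maps, each (B, C_l, H, W).
    Returns (B, L, H*W, H*W): one cosine-correlation channel per layer,
    so each match carries L scores as features instead of a single score."""
    channels = []
    for fs, ft in zip(src_feats, trg_feats):
        fs = F.normalize(fs.flatten(2), dim=1)         # (B, C_l, HW)
        ft = F.normalize(ft.flatten(2), dim=1)
        channels.append(torch.einsum('bci,bcj->bij', fs, ft))
    return torch.stack(channels, dim=1)

class MatchToMatchAttention(nn.Module):
    def __init__(self, num_levels, dim=32, heads=4):
        super().__init__()
        self.embed = nn.Linear(num_levels, dim)        # match scores -> token
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)                 # token -> refined score

    def forward(self, corr):                           # (B, L, HWs, HWt)
        b, _, n_s, n_t = corr.shape
        tokens = self.embed(corr.flatten(2).transpose(1, 2))  # (B, Ns*Nt, dim)
        tokens, _ = self.attn(tokens, tokens, tokens)  # global match-to-match
        return self.score(tokens).view(b, n_s, n_t)    # refined correlation map
```

Even 16x16 feature maps already yield 65,536 match tokens, which is why a transformer over matches needs the light-weight attention design the abstract refers to.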
Related papers
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework that exploits the rich information in saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
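As a rough illustration of what learning "pixel-level affinities from the saliency and segmentation feature maps" can look like, here is a hedged PyTorch sketch: features from the two tasks are fused, a row-normalized pairwise affinity matrix is built, and it propagates label scores between pixels. Fusion by concatenation and a single refinement step are assumptions for illustration, not AuxSegNet+'s exact mechanism.

```python
# Hedged sketch of cross-task affinity refinement; the fusion and the
# single propagation step are illustrative, not the paper's formulation.
import torch
import torch.nn.functional as F

def affinity_refine(seg_feat, sal_feat, seg_logits):
    """seg_feat, sal_feat: (B, C, H, W); seg_logits: (B, K, H, W)."""
    feat = torch.cat([seg_feat, sal_feat], dim=1)      # cross-task features
    b, _, h, w = feat.shape
    f = F.normalize(feat.flatten(2), dim=1)            # (B, C', HW)
    affinity = torch.softmax(torch.einsum('bci,bcj->bij', f, f), dim=-1)
    refined = torch.einsum('bij,bkj->bki', affinity, seg_logits.flatten(2))
    return refined.reshape(b, -1, h, w)                # affinity-smoothed logits
```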
- OAMatcher: An Overlapping Areas-based Network for Accurate Local Feature Matching [9.006654114778073]
We propose OAMatcher, a detector-free method that imitates human behavior to generate dense and accurate matches.
OAMatcher predicts overlapping areas to promote effective and clean global context aggregation.
Comprehensive experiments demonstrate that OAMatcher outperforms the state-of-the-art methods on several benchmarks.
arXiv Detail & Related papers (2023-02-12T03:32:45Z)
- FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network [48.912196729711624]
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in a query image with only a few annotated support images.
We propose a Feature-Enhanced Context-Aware Network (FECANet) to suppress the matching noise caused by inter-class local similarity.
In addition, we propose a novel correlation reconstruction module that encodes extra correspondence relations between foreground and background, as well as multi-scale context semantic features.
arXiv Detail & Related papers (2023-01-19T16:31:13Z)
- DeepMatcher: A Deep Transformer-based Network for Robust and Accurate Local Feature Matching [9.662752427139496]
We propose a deep Transformer-based network built upon our investigation of local feature matching in detector-free methods.
DeepMatcher captures features that are more human-intuitive and easier to match.
We show that DeepMatcher significantly outperforms the state-of-the-art methods on several benchmarks.
arXiv Detail & Related papers (2023-01-08T07:15:09Z)
- Global-and-Local Collaborative Learning for Co-Salient Object Detection [162.62642867056385]
The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images.
We propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module.
The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model trained on a small dataset (about 3k images) still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
arXiv Detail & Related papers (2022-04-19T14:32:41Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
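For readers unfamiliar with the image-level contrastive term mentioned above, a generic InfoNCE loss is sketched below; the temperature value and the two-view batch construction are standard assumptions, not necessarily the paper's exact multi-level formulation.

```python
# Generic InfoNCE loss: matching views attract, all others in the batch repel.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.07):
    """z1, z2: (B, D) embeddings of two views; row i of z1 pairs with row i of z2."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                 # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)
```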
- GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network [176.3781969089004]
The feature correlation layer serves as a key neural network module in computer vision problems that involve dense correspondences between image pairs.
We propose GOCor, a fully differentiable dense matching module that acts as a direct replacement for the feature correlation layer.
Our approach significantly outperforms the feature correlation layer for the tasks of geometric matching, optical flow, and dense semantic matching.
arXiv Detail & Related papers (2020-09-16T17:33:01Z)
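To make the GOCor entry concrete: the standard feature correlation layer it replaces is just a dense dot-product between all feature locations of two images, as in the sketch below. GOCor's actual contribution, an internal optimization over the reference features that disambiguates repetitive patterns, is deliberately omitted here.

```python
# Plain feature correlation layer (the module GOCor is a drop-in
# replacement for); GOCor's internal optimization loop is omitted.
import torch

def correlation_layer(f_ref, f_query):
    """f_ref, f_query: (B, C, H, W) -> cost volume (B, H*W, H, W)."""
    b, _, h, w = f_ref.shape
    corr = torch.einsum('bci,bcj->bij',
                        f_ref.flatten(2), f_query.flatten(2))
    return corr.reshape(b, h * w, h, w)   # one score map per reference pixel
```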
- Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z)
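A hedged sketch of the layer-selection idea behind Dynamic Hyperpixel Flow: a small gating network scores the candidate backbone layers, and only the most relevant few are upsampled and stacked into a hypercolumn. The gating network, the hard top-k rule (the paper learns the selection end-to-end, so a differentiable relaxation would be needed in practice), and sharing one selection across the batch are all illustrative assumptions.

```python
# Illustrative layer selection for hypercolumn composition; the gate and
# hard top-k are stand-ins for the paper's learned, per-pair selection.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HypercolumnComposer(nn.Module):
    def __init__(self, num_layers, k=3):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(num_layers, num_layers)  # layer relevance scores

    def forward(self, feats):
        """feats: list of (B, C_l, H_l, W_l) maps from a frozen backbone."""
        # Global-average-pool each layer to a scalar per sample, then score.
        pooled = torch.stack([f.mean(dim=(1, 2, 3)) for f in feats], dim=1)
        scores = self.gate(pooled)                     # (B, num_layers)
        # Simplification: share the first sample's selection batch-wide.
        top = scores[0].topk(self.k).indices.tolist()
        h, w = feats[0].shape[-2:]
        chosen = [F.interpolate(feats[i], size=(h, w),
                                mode='bilinear', align_corners=False)
                  for i in top]
        return torch.cat(chosen, dim=1)                # composed hypercolumn
```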
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.