Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
- URL: http://arxiv.org/abs/2303.16624v1
- Date: Wed, 29 Mar 2023 12:28:01 GMT
- Title: Adaptive Spot-Guided Transformer for Consistent Local Feature Matching
- Authors: Jiahuan Yu, Jiahao Chang, Jianfeng He, Tianzhu Zhang, Feng Wu
- Abstract summary: We propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching.
ASTR models the local consistency and scale variations in a unified coarse-to-fine architecture.
- Score: 64.30749838423922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local feature matching aims at finding correspondences between a pair of
images. Although current detector-free methods leverage Transformer
architecture to obtain an impressive performance, few works consider
maintaining local consistency. Meanwhile, most methods struggle with large
scale variations. To deal with the above issues, we propose Adaptive
Spot-Guided Transformer (ASTR) for local feature matching, which jointly models
the local consistency and scale variations in a unified coarse-to-fine
architecture. The proposed ASTR enjoys several merits. First, we design a
spot-guided aggregation module to avoid interfering with irrelevant areas
during feature aggregation. Second, we design an adaptive scaling module to
adjust the size of grids according to the calculated depth information at the
fine stage. Extensive experimental results on five standard benchmarks demonstrate
that our ASTR performs favorably against state-of-the-art methods. Our code
will be released on https://astr2023.github.io.
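As a rough illustration of the two ideas in the abstract, a minimal sketch follows. This is not the authors' implementation: the 1-D key layout, the plain depth ratio used for grid scaling, and all function names are assumptions made for clarity. Spot-guided aggregation restricts each query's attention to a neighborhood around its current best ("spot") match so irrelevant areas do not interfere, and the fine-stage crop is rescaled by relative depth so both crops cover a similar physical extent:

```python
import numpy as np

def spot_guided_attention(q, k, v, spots, radius=2):
    """Attend only to keys within `radius` of each query's spot match.

    q, k, v: (N, d) feature arrays; spots: (N,) index of each query's
    current best-matching key. Keys are assumed to lie on a 1-D
    sequence for simplicity (real images use 2-D neighborhoods).
    """
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                      # (N, N) attention logits
    key_idx = np.arange(k.shape[0])
    # Mask out keys outside each query's spot neighborhood.
    mask = np.abs(key_idx[None, :] - spots[:, None]) > radius
    scores[mask] = -np.inf
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v                                 # (N, d) aggregated features

def adaptive_grid_size(base_size, depth_a, depth_b):
    """Scale the fine-stage crop on image B by the estimated depth ratio,
    so the two crops cover roughly the same physical extent."""
    scale = depth_a / depth_b
    return max(1, int(round(base_size * scale)))
```

With `radius=0` each query aggregates only its spot key, which is the degenerate case of the masking above; larger radii trade locality for robustness to an imperfect spot estimate.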
Related papers
- Multi-Level Aggregation and Recursive Alignment Architecture for Efficient Parallel Inference Segmentation Network [18.47001817385548]
We propose a parallel inference network customized for semantic segmentation tasks.
We employ a shallow backbone to ensure real-time speed, and propose three core components to compensate for the reduced model capacity to improve accuracy.
Our framework shows a better balance between speed and accuracy than state-of-the-art real-time methods on Cityscapes and CamVid datasets.
arXiv Detail & Related papers (2024-02-03T22:51:17Z)
- Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation [51.10389829070684]
Domain gap can cause discrepancies in self-attention.
Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain.
We propose adaptation on attention maps with cross-domain attention layers.
arXiv Detail & Related papers (2022-11-27T02:40:33Z)
- LCPFormer: Towards Effective 3D Point Cloud Analysis via Local Context Propagation in Transformers [60.51925353387151]
We propose a novel module named Local Context Propagation (LCP) to exploit the message passing between neighboring local regions.
We use the overlap points of adjacent local regions as intermediaries, then re-weight the features of these shared points from different local regions before passing them to the next layers.
The proposed method is applicable to different tasks and outperforms various transformer-based methods in benchmarks including 3D shape classification and dense prediction tasks.
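The propagation step described above can be sketched as follows. This is a hypothetical illustration, not the LCPFormer implementation: plain averaging stands in for the learned re-weighting of shared-point features, and the data layout is assumed.

```python
import numpy as np

def propagate_shared_points(region_feats, shared_ids):
    """For each point shared by several local regions, blend its
    per-region features (averaging here stands in for learned
    re-weighting) and write the blended feature back into every
    region that contains the point, passing messages across regions.

    region_feats: dict region_name -> dict point_id -> feature vector
    shared_ids: iterable of point ids lying in region overlaps
    """
    for pid in shared_ids:
        owners = [r for r, pts in region_feats.items() if pid in pts]
        if len(owners) < 2:
            continue  # point is not actually shared
        blended = np.mean([region_feats[r][pid] for r in owners], axis=0)
        for r in owners:
            region_feats[r][pid] = blended
    return region_feats
```

After this step, every region sees a consistent feature at each overlap point, so the next transformer layer receives context from its neighbors without any global attention.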
arXiv Detail & Related papers (2022-10-23T15:43:01Z)
- Guide Local Feature Matching by Overlap Estimation [9.387323456222823]
We introduce a novel Overlap Estimation method conditioned on image pairs with TRansformer, named OETR.
OETR performs overlap estimation in a two-step process of feature correlation and then overlap regression.
Experiments show that OETR can boost state-of-the-art local feature matching performance substantially.
arXiv Detail & Related papers (2022-02-18T07:11:36Z)
- Cost Aggregation Is All You Need for Few-Shot Segmentation [28.23753949369226]
We introduce Volumetric Aggregation with Transformers (VAT) to tackle the few-shot segmentation task.
VAT uses both convolutions and transformers to efficiently handle high dimensional correlation maps between query and support.
We find that the proposed method attains state-of-the-art performance even on standard benchmarks for the semantic correspondence task.
arXiv Detail & Related papers (2021-12-22T06:18:51Z)
- I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors [64.93963042395976]
Implicit Instance-Invariant Network (I3Net) is tailored for adapting one-stage detectors.
I3Net implicitly learns instance-invariant features via exploiting the natural characteristics of deep features in different layers.
Experiments reveal that I3Net exceeds the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2021-03-25T11:14:36Z)
- Align Deep Features for Oriented Object Detection [40.28244152216309]
We propose a single-shot Alignment Network (S$2$A-Net) consisting of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM).
The FAM can generate high-quality anchors with an Anchor Refinement Network and adaptively align the convolutional features according to the anchor boxes with a novel Alignment Convolution.
The ODM first adopts active rotating filters to encode the orientation information and then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy.
arXiv Detail & Related papers (2020-08-21T09:55:13Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
- Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection [86.69077525494106]
Unsupervised domain adaptation (UDA) has achieved unprecedented success in improving the cross-domain robustness of object detection models.
Existing UDA methods largely ignore the instantaneous data distribution during model learning, which could deteriorate the feature representation given large domain shift.
We propose a Self-Guided Adaptation (SGA) model, targeted at aligning feature representations and transferring object detection models across domains.
arXiv Detail & Related papers (2020-03-19T13:30:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.