Semantic Correspondence with Transformers
- URL: http://arxiv.org/abs/2106.02520v1
- Date: Fri, 4 Jun 2021 14:39:03 GMT
- Title: Semantic Correspondence with Transformers
- Authors: Seokju Cho, Sunghwan Hong, Sangryul Jeon, Yunsung Lee, Kwanghoon Sohn
and Seungryong Kim
- Abstract summary: We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps and multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
- Score: 68.37049687360705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel cost aggregation network, called Cost Aggregation with
Transformers (CATs), to find dense correspondences between semantically similar
images with additional challenges posed by large intra-class appearance and
geometric variations. Compared to previous hand-crafted or CNN-based methods
addressing the cost aggregation stage, which either lack robustness to severe
deformations or inherit the limitation of CNNs that fail to discriminate
incorrect matches due to limited receptive fields, CATs explore global
consensus among the initial correlation maps with the help of architectural
designs that allow us to exploit the full potential of the self-attention mechanism.
Specifically, we include appearance affinity modelling to disambiguate the
initial correlation maps and multi-level aggregation to benefit from
hierarchical feature representations within a Transformer-based aggregator, and
combine with swapping self-attention and residual connections not only to
enforce consistent matching, but also to ease the learning process. We conduct
experiments to demonstrate the effectiveness of the proposed model over the
latest methods and provide extensive ablation studies. Code and trained models
will be made available at https://github.com/SunghwanHong/CATs.
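The pipeline the abstract describes (build an initial correlation map, refine it with self-attention, apply the aggregator to the swapped map as well, and keep residual connections) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the self-attention layer uses identity projections in place of learned weights, and appearance affinity modelling and multi-level aggregation are omitted.

```python
import numpy as np


def correlation_map(feat_src, feat_tgt, eps=1e-8):
    """Cosine-similarity correlation between two (H, W, C) feature maps.

    Returns an (H*W, H*W) matrix: row i scores source position i
    against every target position.
    """
    src = feat_src.reshape(-1, feat_src.shape[-1])
    tgt = feat_tgt.reshape(-1, feat_tgt.shape[-1])
    src = src / (np.linalg.norm(src, axis=1, keepdims=True) + eps)
    tgt = tgt / (np.linalg.norm(tgt, axis=1, keepdims=True) + eps)
    return src @ tgt.T


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    """Single-head self-attention with identity projections.

    Assumption: a stand-in for a learned Transformer layer; every row
    (token) attends to all other rows, giving a global receptive field.
    """
    d = tokens.shape[-1]
    attn = softmax(tokens @ tokens.T / np.sqrt(d), axis=-1)
    return attn @ tokens


def aggregate_with_swapping(corr):
    """Refine the correlation map, then its transpose, with residuals."""
    refined = corr + self_attention(corr)                # source-to-target view
    refined_t = refined.T + self_attention(refined.T)    # swapped view
    return refined_t.T


# Random features stand in for backbone outputs (illustrative only).
rng = np.random.default_rng(0)
feat_src = rng.standard_normal((4, 4, 16))
feat_tgt = rng.standard_normal((4, 4, 16))

corr = correlation_map(feat_src, feat_tgt)
refined = aggregate_with_swapping(corr)
matches = refined.argmax(axis=1)  # best target index for each source position
```

Processing both the map and its transpose with the same aggregator is one simple way to encourage matches that agree in both directions, while the residual connections keep the initial correlation scores available to the refined output.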
Related papers
- On Layer-wise Representation Similarity: Application for Multi-Exit Models with a Single Classifier [20.17288970927518]
We study the similarity of representations between the hidden layers of individual transformers.
We propose an aligned training approach to enhance the similarity between internal representations.
arXiv Detail & Related papers (2024-06-20T16:41:09Z)
- Prototype-based Embedding Network for Scene Graph Generation [105.97836135784794]
Current Scene Graph Generation (SGG) methods explore contextual information to predict relationships among entity pairs.
Due to the diverse visual appearance of numerous possible subject-object combinations, there is a large intra-class variation within each predicate category.
Prototype-based Embedding Network (PE-Net) models entities/predicates with prototype-aligned compact and distinctive representations.
Prototype-guided Learning (PL) is introduced to help PE-Net efficiently learn such entity-predicate matching, and Prototype Regularization (PR) is devised to relieve the ambiguous entity-predicate matching.
arXiv Detail & Related papers (2023-03-13T13:30:59Z)
- FECANet: Boosting Few-Shot Semantic Segmentation with Feature-Enhanced Context-Aware Network [48.912196729711624]
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in a query image with only a few annotated support images.
We propose a Feature-Enhanced Context-Aware Network (FECANet) to suppress the matching noise caused by inter-class local similarity.
In addition, we propose a novel correlation reconstruction module that encodes extra correspondence relations between foreground and background and multi-scale context semantic features.
arXiv Detail & Related papers (2023-01-19T16:31:13Z)
- Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC).
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- CATs++: Boosting Cost Aggregation with Convolutions and Transformers [31.22435282922934]
We introduce Cost Aggregation with Transformers (CATs), which explores global consensus among the initial correlation maps.
Also, to alleviate some of the limitations that CATs may face, i.e., high computational costs induced by the use of a standard transformer, we propose CATs++.
Our proposed methods outperform the previous state-of-the-art methods by large margins, setting a new state-of-the-art for all the benchmarks.
arXiv Detail & Related papers (2022-02-14T15:54:58Z)
- Weakly supervised segmentation with cross-modality equivariant constraints [7.757293476741071]
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation.
We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs.
Our approach outperforms relevant recent literature under the same learning conditions.
arXiv Detail & Related papers (2021-04-06T13:14:20Z)
- Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
Extensive experiments on the PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to a new state of the art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.