Learning Accurate Template Matching with Differentiable Coarse-to-Fine
Correspondence Refinement
- URL: http://arxiv.org/abs/2303.08438v1
- Date: Wed, 15 Mar 2023 08:24:10 GMT
- Title: Learning Accurate Template Matching with Differentiable Coarse-to-Fine
Correspondence Refinement
- Authors: Zhirui Gao, Renjiao Yi, Zheng Qin, Yunfan Ye, Chenyang Zhu, and Kai Xu
- Abstract summary: We propose an accurate template matching method based on differentiable coarse-to-fine correspondence refinement.
An initial warp is estimated using coarse correspondences based on novel structure-aware information provided by transformers.
Our method is significantly better than state-of-the-art methods and baselines, providing good generalization ability and visually plausible results even on unseen real data.
- Score: 28.00275083733545
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Template matching is a fundamental task in computer vision and has been
studied for decades. It plays an essential role in the manufacturing industry for
estimating the poses of different parts, facilitating downstream tasks such as
robotic grasping. Existing methods fail when the template and source images
have different modalities, cluttered backgrounds or weak textures. They also
rarely consider geometric transformations via homographies, which commonly
exist even for planar industrial parts. To tackle the challenges, we propose an
accurate template matching method based on differentiable coarse-to-fine
correspondence refinement. We use an edge-aware module to overcome the domain
gap between the mask template and the grayscale image, allowing robust
matching. An initial warp is estimated using coarse correspondences based on
novel structure-aware information provided by transformers. This initial
alignment is passed to a refinement network using references and aligned images
to obtain sub-pixel-level correspondences, which are used to compute the final
geometric transformation. Extensive evaluation shows that our method is
significantly better than state-of-the-art methods and baselines, providing
good generalization ability and visually plausible results even on unseen real
data.
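As a rough illustration of the coarse-to-fine, estimate-then-refine structure described above, the sketch below uses classical ORB features and OpenCV's RANSAC homography as stand-ins for the paper's learned edge-aware features, transformer-based coarse correspondences, and differentiable refinement. It is a minimal hypothetical example of the two-stage structure only, not the authors' pipeline.

```python
# Hypothetical illustration only: classical ORB features and OpenCV's RANSAC
# homography stand in for the learned edge-aware features, transformer-based
# coarse correspondences, and differentiable refinement described in the paper.
import cv2
import numpy as np

def match_homography(img_a, img_b, n_features):
    """Detect ORB keypoints in two grayscale uint8 images, match them with
    mutual nearest neighbours, and fit a RANSAC homography (or return None)."""
    orb = cv2.ORB_create(nfeatures=n_features)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # mutual NN
    matches = matcher.match(des_a, des_b)
    if len(matches) < 4:
        return None
    src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H

def coarse_to_fine_match(template, source, scale=4):
    """Coarse stage on downsampled images gives an initial homography; the
    fine stage re-matches the warped template against the source and
    composes the residual correction with the initial estimate."""
    th, tw = template.shape[:2]
    sh, sw = source.shape[:2]
    t_small = cv2.resize(template, (tw // scale, th // scale))
    s_small = cv2.resize(source, (sw // scale, sh // scale))
    H_coarse = match_homography(t_small, s_small, n_features=500)
    if H_coarse is None:
        return None
    # Lift the coarse homography from 1/scale resolution to full resolution.
    S = np.diag([scale, scale, 1.0])
    H_init = S @ H_coarse @ np.linalg.inv(S)
    warped = cv2.warpPerspective(template, H_init, (sw, sh))
    H_res = match_homography(warped, source, n_features=2000)
    return H_init if H_res is None else H_res @ H_init
```

In the paper itself both stages are differentiable: the coarse correspondences come from structure-aware transformer features that bridge the mask template and the grayscale image, and the refinement network predicts sub-pixel correspondences from the reference and aligned images instead of re-detecting keypoints.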
Related papers
- ConDL: Detector-Free Dense Image Matching [2.7582789611575897]
We introduce a deep-learning framework designed for estimating dense image correspondences.
Our fully convolutional model generates dense feature maps for images, where each pixel is associated with a descriptor that can be matched across multiple images.
arXiv Detail & Related papers (2024-08-05T18:34:15Z)
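As a rough, hypothetical sketch of the detector-free idea summarised in the ConDL entry above (not the ConDL model itself), the snippet below compares dense per-pixel descriptors from two feature maps by cosine similarity and keeps only mutual nearest neighbours; the function name and the mutual-nearest-neighbour filter are illustrative assumptions.

```python
# Hypothetical sketch of dense, detector-free matching between two images'
# per-pixel descriptor maps (not the ConDL model itself).
import torch
import torch.nn.functional as F

def mutual_nn_matches(feat_a, feat_b):
    """feat_a, feat_b: (C, H, W) dense descriptor maps, e.g. produced by a
    fully convolutional backbone. Returns (N, 2) pairs of flattened pixel
    indices that are each other's nearest neighbour. The dense similarity
    matrix is (H*W, H*W), so this is only practical for small feature maps."""
    c, h, w = feat_a.shape
    a = F.normalize(feat_a.reshape(c, -1), dim=0)  # (C, H*W), unit-norm columns
    b = F.normalize(feat_b.reshape(c, -1), dim=0)
    sim = a.t() @ b                    # cosine similarity, shape (H*W, H*W)
    nn_ab = sim.argmax(dim=1)          # best pixel in B for every pixel in A
    nn_ba = sim.argmax(dim=0)          # best pixel in A for every pixel in B
    idx_a = torch.arange(h * w)
    mutual = nn_ba[nn_ab] == idx_a     # keep only cycle-consistent matches
    return torch.stack([idx_a[mutual], nn_ab[mutual]], dim=1)
```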
- SHIC: Shape-Image Correspondences with no Keypoint Supervision [106.99157362200867]
Canonical surface mapping generalizes keypoint detection by assigning each pixel of an object to a corresponding point in a 3D template.
Popularised by DensePose for the analysis of humans, the concept has since been applied to more categories.
We introduce SHIC, a method to learn canonical maps without manual supervision, which achieves better results than supervised methods for most categories.
arXiv Detail & Related papers (2024-07-26T17:58:59Z)
- PMatch: Paired Masked Image Modeling for Dense Geometric Matching [18.64065915021511]
We propose a novel cross-frame global matching module (CFGM) for geometric matching.
To be robust to textureless areas, we propose a homography loss to further regularize its learning.
We achieve state-of-the-art (SoTA) performance on geometric matching.
arXiv Detail & Related papers (2023-03-30T12:53:22Z)
- Masked and Adaptive Transformer for Exemplar Based Image Translation [16.93344592811513]
Cross-domain semantic matching is challenging.
We propose a masked and adaptive transformer (MAT) for learning accurate cross-domain correspondence.
We devise a novel contrastive style learning method to acquire quality-discriminative style representations.
arXiv Detail & Related papers (2023-03-30T03:21:14Z)
- RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning [62.86400614141706]
We propose a new learning model, the Rectangling Rectification Network (RecRecNet).
Our model can flexibly warp the source structure to the target domain and achieves an end-to-end unsupervised deformation.
Experiments show the superiority of our solution over the compared methods on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2023-01-04T15:12:57Z)
- Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks.
After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) to explore object-to-object, object-to-patch, and patch-to-patch dependencies.
arXiv Detail & Related papers (2022-06-02T08:34:25Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to the current state of the art on smaller datasets while scaling up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
- ProAlignNet: Unsupervised Learning for Progressively Aligning Noisy Contours [12.791313859673187]
"ProAlignNet" accounts for large scale misalignments and complex transformations between the contour shapes.
It learns by training with a novel loss function which is derived an upperbound of a proximity-sensitive and local shape-dependent similarity metric.
In two real-world applications, the proposed models consistently perform superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-05-23T14:56:14Z)
- FDA: Fourier Domain Adaptation for Semantic Segmentation [82.4963423086097]
We describe a simple method for unsupervised domain adaptation, whereby the discrepancy between the source and target distributions is reduced by swapping the low-frequency spectrum of one with the other.
We illustrate the method in semantic segmentation, where densely annotated images are plentiful in one domain but difficult to obtain in another.
Our results indicate that even simple procedures can discount nuisance variability in the data that more sophisticated methods struggle to learn away.
arXiv Detail & Related papers (2020-04-11T22:20:48Z)
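The low-frequency spectrum swap described in the FDA entry above is simple enough to sketch directly. The hypothetical NumPy snippet below (not the authors' released code) replaces the low-frequency amplitude of a source image with that of a same-sized target image while keeping the source phase; the window-size parameter name `beta` is assumed for illustration.

```python
# Hypothetical sketch of Fourier-domain spectrum swapping (FDA-style),
# not the authors' released implementation.
import numpy as np

def swap_low_freq(src, trg, beta=0.01):
    """src, trg: float arrays of shape (H, W, C), same size, values in [0, 1].
    beta controls the half-width of the swapped low-frequency window."""
    fft_src = np.fft.fft2(src, axes=(0, 1))
    fft_trg = np.fft.fft2(trg, axes=(0, 1))
    amp_src, pha_src = np.abs(fft_src), np.angle(fft_src)
    amp_trg = np.abs(fft_trg)

    # Shift so the low frequencies sit at the centre, then overwrite a small
    # centred window of the source amplitude with the target amplitude.
    amp_src = np.fft.fftshift(amp_src, axes=(0, 1))
    amp_trg = np.fft.fftshift(amp_trg, axes=(0, 1))
    h, w = src.shape[:2]
    b = max(1, int(min(h, w) * beta))
    ch, cw = h // 2, w // 2
    amp_src[ch - b:ch + b + 1, cw - b:cw + b + 1] = \
        amp_trg[ch - b:ch + b + 1, cw - b:cw + b + 1]
    amp_src = np.fft.ifftshift(amp_src, axes=(0, 1))

    # Recombine the swapped amplitude with the original source phase.
    out = np.fft.ifft2(amp_src * np.exp(1j * pha_src), axes=(0, 1))
    return np.real(out).clip(0.0, 1.0)
```

The adapted source image then carries the target domain's low-frequency appearance while preserving its own content, which is how the discrepancy between the source and target distributions is reduced before training the segmentation model.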
- A Robust Method for Image Stitching [0.0]
We propose a novel method for large-scale image stitching that is robust against repetitive patterns and featureless regions in the imagery.
Our method augments the current methods by collecting all the plausible pairwise image registration candidates.
arXiv Detail & Related papers (2020-04-08T07:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.