Generalized Correspondence Matching via Flexible Hierarchical Refinement
and Patch Descriptor Distillation
- URL: http://arxiv.org/abs/2403.05388v1
- Date: Fri, 8 Mar 2024 15:32:18 GMT
- Authors: Yu Han, Ziwei Long, Yanting Zhang, Jin Wu, Zhijun Fang and Rui Fan
- Abstract summary: Correspondence matching plays a crucial role in numerous robotics applications.
This paper addresses the limitations of deep feature matching (DFM), a state-of-the-art (SoTA) plug-and-play correspondence matching approach.
Our proposed method achieves an overall performance in terms of mean matching accuracy of 0.68, 0.92, and 0.95 with respect to the tolerances of 1, 3, and 5 pixels, respectively.
- Score: 13.802788788420175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Correspondence matching plays a crucial role in numerous robotics
applications. In comparison to conventional hand-crafted methods and recent
data-driven approaches, there is significant interest in plug-and-play
algorithms that make full use of pre-trained backbone networks for multi-scale
feature extraction and leverage hierarchical refinement strategies to generate
matched correspondences. The primary focus of this paper is to address the
limitations of deep feature matching (DFM), a state-of-the-art (SoTA)
plug-and-play correspondence matching approach. First, we eliminate the
pre-defined threshold employed in the hierarchical refinement process of DFM by
leveraging a more flexible nearest neighbor search strategy, thereby preventing
the exclusion of repetitive yet valid matches during the early stages. Our
second technical contribution is the integration of a patch descriptor, which
extends the applicability of DFM to accommodate a wide range of backbone
networks pre-trained across diverse computer vision tasks, including image
classification, semantic segmentation, and stereo matching. Taking into account
the practical applicability of our method in real-world robotics applications,
we also propose a novel patch descriptor distillation strategy to further
reduce the computational complexity of correspondence matching. Extensive
experiments conducted on three public datasets demonstrate the superior
performance of our proposed method. Specifically, it achieves an overall
performance in terms of mean matching accuracy of 0.68, 0.92, and 0.95 with
respect to the tolerances of 1, 3, and 5 pixels, respectively, on the HPatches
dataset, outperforming all other SoTA algorithms. Our source code, demo video,
and supplement are publicly available at mias.group/GCM.
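As a rough illustration of the metric quoted in the abstract, the sketch below computes mean matching accuracy (MMA) in plain Python. It assumes a convention common in HPatches evaluations: a match counts as correct when the point from the first image, mapped through the ground-truth homography, lands within the pixel tolerance of its matched point in the second image, and the per-pair accuracies are then averaged. The function names, the data layout, the use of "less than or equal" for the tolerance test, and the toy homographies are all illustrative assumptions, not taken from the paper or its released code.

```python
from math import hypot


def project(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def matching_accuracy(matches, H, tolerance):
    """Fraction of matches whose reprojection error is within `tolerance` pixels.

    `matches` is a list of ((xa, ya), (xb, yb)) pairs; `H` maps image A to image B.
    An empty match list scores 0.0 (an assumption made for this sketch).
    """
    if not matches:
        return 0.0
    correct = 0
    for point_a, point_b in matches:
        u, v = project(H, point_a)
        if hypot(u - point_b[0], v - point_b[1]) <= tolerance:
            correct += 1
    return correct / len(matches)


def mean_matching_accuracy(image_pairs, tolerances=(1, 3, 5)):
    """Average the per-pair matching accuracy over all image pairs, per tolerance."""
    return {
        t: sum(matching_accuracy(m, H, t) for m, H in image_pairs) / len(image_pairs)
        for t in tolerances
    }


# Toy data (hypothetical): a 10-pixel horizontal shift and an identity mapping.
H_shift = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]
H_identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

matches_shift = [
    ((0, 0), (10, 0)),  # error 0 px
    ((5, 5), (17, 5)),  # error 2 px
    ((1, 1), (11, 5)),  # error 4 px
    ((2, 2), (20, 2)),  # error 8 px
]
matches_identity = [((3, 4), (3, 4)), ((7, 1), (7, 1))]  # both exact

result = mean_matching_accuracy(
    [(matches_shift, H_shift), (matches_identity, H_identity)]
)
print(result)  # {1: 0.625, 3: 0.75, 5: 0.875}
```

Under this reading, the reported figures of 0.68, 0.92, and 0.95 correspond to the values returned for tolerances of 1, 3, and 5 pixels when the average is taken over the evaluated HPatches image pairs.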
Related papers
- Efficient Network Embedding by Approximate Equitable Partitions [0.15978270011184256]
We introduce a simple and efficient embedding technique based on approximate variants of equitable partitions.
We exploit a relationship between equitable partitions and equivalence relations for Markov chains and ordinary differential equations to develop a partition refinement algorithm.
We report comparable -- when not superior -- performance for visualization, classification, and regression tasks at a cost between one and three orders of magnitude smaller.
arXiv Detail & Related papers (2024-09-16T10:51:24Z)
- Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion [3.868072865207522]
Image-based rigid 2D/3D registration is a critical technique for fluoroscopic guided surgical interventions.
We propose a novel fully differentiable correlation-driven network using a dual-branch CNN-transformer encoder.
A correlation-driven loss is proposed for low-frequency feature and high-frequency feature decomposition based on embedded information.
arXiv Detail & Related papers (2024-02-04T14:12:51Z)
- Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take source image, user guidance and previously predicted mask as the input.
We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
- Object Segmentation by Mining Cross-Modal Semantics [68.88086621181628]
We propose a novel approach by mining the Cross-Modal Semantics to guide the fusion and decoding of multimodal features.
Specifically, we propose a novel network, termed XMSNet, consisting of (1) all-round attentive fusion (AF), (2) coarse-to-fine decoder (CFD), and (3) cross-layer self-supervision.
arXiv Detail & Related papers (2023-05-17T14:30:11Z)
- AdaStereo: An Efficient Domain-Adaptive Stereo Matching Approach [50.855679274530615]
We present a novel domain-adaptive approach called AdaStereo to align multi-level representations for deep stereo matching networks.
Our models achieve state-of-the-art cross-domain performance on multiple benchmarks, including KITTI, Middlebury, ETH3D and DrivingStereo.
Our method is robust to various domain adaptation settings, and can be easily integrated into quick adaptation application scenarios and real-world deployments.
arXiv Detail & Related papers (2021-12-09T15:10:47Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning [24.13425816781179]
Local feature extraction remains an active research area due to the advances in fields such as SLAM, 3D reconstructions, or AR applications.
We propose a method that treats both extractions independently and focuses on their interaction in the learning process.
We show improvements over the state of the art in terms of image matching on HPatches and 3D reconstruction quality while keeping on par on camera localisation tasks.
arXiv Detail & Related papers (2020-05-12T13:55:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.