E3CM: Epipolar-Constrained Cascade Correspondence Matching
- URL: http://arxiv.org/abs/2308.16555v1
- Date: Thu, 31 Aug 2023 08:46:12 GMT
- Title: E3CM: Epipolar-Constrained Cascade Correspondence Matching
- Authors: Chenbo Zhou, Shuai Su, Qijun Chen, Rui Fan
- Abstract summary: We introduce Epipolar-Constrained Cascade Correspondence (E3CM), a novel approach that addresses the limitations of traditional explicit programming-based and deep learning-based methods.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural networks to match correspondences without any network training or fine-tuning.
We extensively evaluate the performance of E3CM through comprehensive experiments and demonstrate its superiority over existing methods.
- Score: 19.650006628979355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and robust correspondence matching is of utmost importance for
various 3D computer vision tasks. However, traditional explicit
programming-based methods often struggle to handle challenging scenarios, and
deep learning-based methods require large well-labeled datasets for network
training. In this article, we introduce Epipolar-Constrained Cascade
Correspondence (E3CM), a novel approach that addresses these limitations.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural
networks to match correspondences, without requiring annotated data for any
network training or fine-tuning. Our method utilizes epipolar constraints to
guide the matching process and incorporates a cascade structure for progressive
refinement of matches. We extensively evaluate the performance of E3CM through
comprehensive experiments and demonstrate its superiority over existing
methods. To promote further research and facilitate reproducibility, we make
our source code publicly available at https://mias.group/E3CM.
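A minimal sketch of the epipolar constraint at the heart of the method, assuming a known (or robustly estimated) fundamental matrix F; this illustrates the geometry only and is not the authors' implementation, in which the constraint guides a cascade of progressively refined matching stages.

    import numpy as np

    def epipolar_filter(pts1, pts2, F, threshold=1.0):
        """Keep putative matches consistent with x2^T F x1 = 0.

        pts1, pts2: (N, 2) arrays of matched pixel coordinates.
        F: (3, 3) fundamental matrix.
        threshold: maximum symmetric epipolar distance in pixels.
        """
        n = len(pts1)
        x1 = np.hstack([pts1, np.ones((n, 1))])  # homogeneous coordinates
        x2 = np.hstack([pts2, np.ones((n, 1))])
        lines2 = x1 @ F.T                        # epipolar lines in image 2
        lines1 = x2 @ F                          # epipolar lines in image 1
        algebraic = np.abs(np.sum(x2 * lines2, axis=1))  # |x2^T F x1|
        dist = (algebraic / np.linalg.norm(lines2[:, :2], axis=1)
                + algebraic / np.linalg.norm(lines1[:, :2], axis=1))
        return dist < threshold                  # boolean mask over matches

In a cascade, the surviving matches would seed the next, finer matching stage; here the mask simply discards geometrically inconsistent pairs.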
Related papers
- Let Me DeCode You: Decoder Conditioning with Tabular Data [0.15487122608774898]
We introduce DeCode, a novel approach that uses label-derived features for model conditioning, dynamically supporting the decoder during reconstruction.
DeCode improves 3D segmentation by incorporating a conditioning embedding with a learned numerical representation of 3D label-shape features.
Our results show that DeCode significantly outperforms traditional, unconditioned models in terms of generalization to unseen data, achieving higher accuracy at a reduced computational cost.
arXiv Detail & Related papers (2024-07-12T17:14:33Z)
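As a rough illustration of decoder conditioning, a label-derived embedding can modulate decoder activations FiLM-style; the module below is a hypothetical sketch, and its names and the FiLM mechanism are our assumptions rather than details from the paper.

    import torch
    import torch.nn as nn

    class ConditionedDecoderBlock(nn.Module):
        """Hypothetical 3D decoder block modulated by label-derived features."""

        def __init__(self, channels, cond_dim):
            super().__init__()
            self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
            # map the conditioning embedding to per-channel scale and shift
            self.to_scale_shift = nn.Linear(cond_dim, 2 * channels)

        def forward(self, x, cond):
            # x: (B, C, D, H, W) decoder features; cond: (B, cond_dim)
            scale, shift = self.to_scale_shift(cond).chunk(2, dim=1)
            scale = scale[:, :, None, None, None]
            shift = shift[:, :, None, None, None]
            return torch.relu(self.conv(x) * (1 + scale) + shift)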
- Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL)
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the state of the art while being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z)
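The efficiency of this style of retrieval model generally comes from keeping the two modalities separable, so that scoring reduces to a dot product between independently precomputed embeddings instead of a heavy cross-modal interaction. A generic sketch of that pattern, not GLSCL's exact architecture:

    import torch
    import torch.nn.functional as F

    def retrieval_scores(text_emb, video_emb):
        """Cosine-similarity score matrix for text-video retrieval.

        text_emb: (Nt, D) pooled text embeddings.
        video_emb: (Nv, D) pooled video embeddings (e.g. mean over frames).
        Returns (Nt, Nv) scores; rank videos per text query by each row.
        """
        t = F.normalize(text_emb, dim=-1)
        v = F.normalize(video_emb, dim=-1)
        return t @ v.t()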
- Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation [4.480310276450028]
We propose a training strategy for a 3D LiDAR semantic segmentation model that learns structural relationships between classes through abstraction.
This is achieved by implicitly modeling these relationships using a learning rule for hierarchical multi-label classification (HMC).
Our detailed analysis demonstrates that this training strategy not only improves the model's confidence calibration but also retains additional information useful for downstream tasks such as fusion, prediction, and planning.
arXiv Detail & Related papers (2024-04-09T08:49:01Z)
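One common way to realize an HMC learning rule, and our illustrative reading rather than necessarily the paper's exact formulation, is to mark every ancestor of a ground-truth class as an additional positive target:

    import torch

    def hierarchical_targets(labels, ancestors, num_nodes):
        """Expand leaf labels into multi-hot targets over a class hierarchy.

        labels: (N,) tensor of leaf class indices.
        ancestors: dict mapping each class index to its ancestor indices.
        Returns an (N, num_nodes) multi-hot tensor for use with BCE loss.
        """
        targets = torch.zeros(len(labels), num_nodes)
        for i, c in enumerate(labels.tolist()):
            targets[i, c] = 1.0
            for a in ancestors[c]:  # e.g. "car" -> "vehicle" -> "dynamic"
                targets[i, a] = 1.0
        return targets

    # usage: loss = torch.nn.BCEWithLogitsLoss()(logits, hierarchical_targets(y, anc, K))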
- Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature
Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
Experiments on both indoor and outdoor scenes demonstrate the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-12-01T15:47:04Z)
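A generic sketch of distilling frozen vision-language features into a 3D network; the cosine objective and the idea of pairing each 3D feature with a back-projected image feature are our assumptions of one plausible realization:

    import torch
    import torch.nn.functional as F

    def feature_alignment_loss(student_feats, teacher_feats):
        """Align 3D-backbone features with frozen vision-language features.

        student_feats: (N, D) projected features from the 3D network.
        teacher_feats: (N, D) features from a frozen vision-language model,
        e.g. image features back-projected onto the same N points.
        """
        s = F.normalize(student_feats, dim=-1)
        t = F.normalize(teacher_feats, dim=-1).detach()  # teacher is frozen
        return (1 - (s * t).sum(dim=-1)).mean()          # mean cosine distance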
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training
with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of prompting techniques, we introduce a new pre-training method that boosts query-based end-to-end instance segmentation (QEIS) models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- MetaGater: Fast Learning of Conditional Channel Gated Networks via
Federated Meta-Learning [46.79356071007187]
We propose a holistic approach to jointly train the backbone network and the channel gating.
We develop a federated meta-learning approach to jointly learn good meta-initializations for both backbone networks and gating modules.
arXiv Detail & Related papers (2020-11-25T04:26:23Z)
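A conditional channel gate can be sketched as a small squeeze-and-excitation-style module that predicts per-channel keep/drop decisions from the input itself. The design below is our assumption for illustration, and the federated meta-learning loop is omitted entirely:

    import torch
    import torch.nn as nn

    class ChannelGate(nn.Module):
        """Input-conditioned per-channel gate for a convolutional feature map."""

        def __init__(self, channels, reduction=4):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )

        def forward(self, x):
            # x: (B, C, H, W); decide which channels to keep for this input
            ctx = x.mean(dim=(2, 3))             # squeeze: global average pool
            probs = torch.sigmoid(self.fc(ctx))  # per-channel keep probability
            hard = (probs > 0.5).float()
            # straight-through estimator: hard gate forward, soft gradient back
            gate = hard + probs - probs.detach()
            return x * gate[:, :, None, None]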
- FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation [87.74617110803189]
Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision.
We present a recurrent architecture that learns a single step of an unrolled iterative alignment procedure for refining scene flow predictions.
arXiv Detail & Related papers (2020-11-19T23:23:48Z)
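Model unrolling here means applying one learned update step repeatedly; a schematic sketch in which step_net is a placeholder for the learned correction network, not the paper's architecture:

    import torch
    import torch.nn as nn

    class UnrolledFlowRefiner(nn.Module):
        """Apply a single learned refinement step K times (schematic)."""

        def __init__(self, step_net: nn.Module, num_steps: int = 4):
            super().__init__()
            self.step_net = step_net  # maps (warped source, target) to a flow correction
            self.num_steps = num_steps

        def forward(self, pts_src, pts_tgt):
            flow = torch.zeros_like(pts_src)  # (B, N, 3) initial flow
            for _ in range(self.num_steps):
                warped = pts_src + flow       # align source by current estimate
                flow = flow + self.step_net(warped, pts_tgt)  # residual update
            return flow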
- Gram Regularization for Multi-view 3D Shape Retrieval [3.655021726150368]
We propose a novel regularization term called Gram regularization.
By forcing the variance between weight kernels to be large, the regularizer can help to extract discriminative features.
The proposed Gram regularization is data-independent and converges stably and quickly without bells and whistles.
arXiv Detail & Related papers (2020-11-16T05:37:24Z)
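The stated goal, making the weight kernels of a layer dissimilar from one another, can be written down directly: flatten each kernel, form the Gram matrix, and penalize the off-diagonal entries. This is our reading of the idea; the paper's exact term may differ.

    import torch

    def gram_regularization(weight, coef=1e-4):
        """Encourage diverse conv kernels within one layer.

        weight: (out_channels, in_channels, kH, kW) convolution weights.
        Penalizes off-diagonal entries of the Gram matrix of flattened,
        normalized kernels, i.e. pairwise kernel similarity.
        """
        w = weight.flatten(1)                            # (C_out, rest)
        w = torch.nn.functional.normalize(w, dim=1)
        gram = w @ w.t()                                 # pairwise similarities
        off_diag = gram - torch.diag(torch.diag(gram))
        return coef * off_diag.pow(2).sum()

    # usage: loss = task_loss + sum(gram_regularization(m.weight)
    #                               for m in model.modules()
    #                               if isinstance(m, torch.nn.Conv2d))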
- MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features than conventional learning methods based on cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
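Online voxel-wise metric learning can be sketched as sampling a subset of voxel embeddings each iteration and pulling same-class pairs together while pushing different-class pairs apart; this contrastive form is an assumption for illustration, not necessarily the paper's exact loss.

    import torch
    import torch.nn.functional as F

    def voxel_metric_loss(feats, labels, num_samples=256, margin=0.5):
        """Contrastive loss over randomly sampled voxel embeddings.

        feats: (V, D) per-voxel embeddings from the multi-task network.
        labels: (V,) per-voxel class labels (e.g. prostate vs. background).
        """
        idx = torch.randperm(feats.size(0))[:num_samples]  # online sampling
        f = F.normalize(feats[idx], dim=1)
        y = labels[idx]
        dist = torch.cdist(f, f)                           # pairwise distances
        same = (y[:, None] == y[None, :]).float()
        pos = same * dist.pow(2)                           # pull same class
        neg = (1.0 - same) * F.relu(margin - dist).pow(2)  # push different
        return (pos + neg).mean()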
- Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
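Sharing convolution kernels across modalities while keeping normalization modality-specific is a common realization of this kind of parameter reuse; the per-modality BatchNorm split below is our assumption, not a detail confirmed by the abstract.

    import torch
    import torch.nn as nn

    class SharedConvDualNorm(nn.Module):
        """Conv kernels shared across CT and MRI; BatchNorm kept per modality."""

        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # shared weights
            self.norm = nn.ModuleDict({
                "ct": nn.BatchNorm2d(out_ch),
                "mri": nn.BatchNorm2d(out_ch),
            })

        def forward(self, x, modality: str):
            # same kernels for both modalities, separate feature statistics
            return torch.relu(self.norm[modality](self.conv(x)))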
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.