E3CM: Epipolar-Constrained Cascade Correspondence Matching
- URL: http://arxiv.org/abs/2308.16555v1
- Date: Thu, 31 Aug 2023 08:46:12 GMT
- Title: E3CM: Epipolar-Constrained Cascade Correspondence Matching
- Authors: Chenbo Zhou, Shuai Su, Qijun Chen, Rui Fan
- Abstract summary: We introduce Epipolar-Constrained Cascade Correspondence (E3CM) as a novel explicit programming-based method.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural networks to match correspondence.
We extensively evaluate the performance of E3CM through comprehensive experiments and demonstrate its superiority over existing methods.
- Score: 19.650006628979355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate and robust correspondence matching is of utmost importance for
various 3D computer vision tasks. However, traditional explicit
programming-based methods often struggle to handle challenging scenarios, and
deep learning-based methods require large well-labeled datasets for network
training. In this article, we introduce Epipolar-Constrained Cascade
Correspondence (E3CM), a novel approach that addresses these limitations.
Unlike traditional methods, E3CM leverages pre-trained convolutional neural
networks to match correspondence, without requiring annotated data for any
network training or fine-tuning. Our method utilizes epipolar constraints to
guide the matching process and incorporates a cascade structure for progressive
refinement of matches. We extensively evaluate the performance of E3CM through
comprehensive experiments and demonstrate its superiority over existing
methods. To promote further research and facilitate reproducibility, we make
our source code publicly available at https://mias.group/E3CM.
Related papers
- MeshConv3D: Efficient convolution and pooling operators for triangular 3D meshes [0.0]
MeshConv3D is a 3D mesh-dedicated methodology integrating specialized convolution and face collapse-based pooling operators.
The experimental results obtained on three distinct benchmark datasets show that the proposed approach makes it possible to achieve equivalent or superior classification results.
arXiv Detail & Related papers (2025-01-07T14:41:26Z) - Is Contrastive Distillation Enough for Learning Comprehensive 3D Representations? [55.99654128127689]
Cross-modal contrastive distillation has recently been explored for learning effective 3D representations.
Existing methods focus primarily on modality-shared features, neglecting the modality-specific features during the pre-training process.
We propose a new framework, namely CMCR, to address these shortcomings.
arXiv Detail & Related papers (2024-12-12T06:09:49Z) - Let Me DeCode You: Decoder Conditioning with Tabular Data [0.15487122608774898]
We introduce a novel approach, DeCode, that utilizes label-derived features for model conditioning to support the decoder in the reconstruction process dynamically.
DeCode focuses on improving 3D segmentation performance through the incorporation of conditioning embedding with learned numerical representation of 3D-label shape features.
Our results show that DeCode significantly outperforms traditional, unconditioned models in terms of generalization to unseen data, achieving higher accuracy at a reduced computational cost.
arXiv Detail & Related papers (2024-07-12T17:14:33Z) - Hierarchical Insights: Exploiting Structural Similarities for Reliable 3D Semantic Segmentation [4.480310276450028]
We propose a training strategy for a 3D LiDAR semantic segmentation model that learns structural relationships between classes through abstraction.
This is achieved by implicitly modeling these relationships using a learning rule for hierarchical multi-label classification (HMC)
Our detailed analysis demonstrates that this training strategy not only improves the model's confidence calibration but also retains additional information useful for downstream tasks such as fusion, prediction, and planning.
arXiv Detail & Related papers (2024-04-09T08:49:01Z) - Generalized Robot 3D Vision-Language Model with Fast Rendering and Pre-Training Vision-Language Alignment [55.11291053011696]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
In the limited reconstruction case, our proposed approach, termed WS3D++, ranks 1st on the large-scale ScanNet benchmark.
arXiv Detail & Related papers (2023-12-01T15:47:04Z) - Boosting Low-Data Instance Segmentation by Unsupervised Pre-training
with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z) - MetaGater: Fast Learning of Conditional Channel Gated Networks via
Federated Meta-Learning [46.79356071007187]
We propose a holistic approach to jointly train the backbone network and the channel gating.
We develop a federated meta-learning approach to jointly learn good meta-initializations for both backbone networks and gating modules.
arXiv Detail & Related papers (2020-11-25T04:26:23Z) - FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation [87.74617110803189]
Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision.
We present a recurrent architecture that learns a single step of an unrolled iterative alignment procedure for refining scene flow predictions.
arXiv Detail & Related papers (2020-11-19T23:23:48Z) - Gram Regularization for Multi-view 3D Shape Retrieval [3.655021726150368]
We propose a novel regularization term called Gram regularization.
By forcing the variance between weight kernels to be large, the regularizer can help to extract discriminative features.
The proposed Gram regularization is data independent and can converge stably and quickly without bells and whistles.
arXiv Detail & Related papers (2020-11-16T05:37:24Z) - MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z) - Unpaired Multi-modal Segmentation via Knowledge Distillation [77.39798870702174]
We propose a novel learning scheme for unpaired cross-modality image segmentation.
In our method, we heavily reuse network parameters, by sharing all convolutional kernels across CT and MRI.
We have extensively validated our approach on two multi-class segmentation problems.
arXiv Detail & Related papers (2020-01-06T20:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.