Multi-view dense image matching with similarity learning and geometry priors
- URL: http://arxiv.org/abs/2505.11264v1
- Date: Fri, 16 May 2025 13:55:40 GMT
- Title: Multi-view dense image matching with similarity learning and geometry priors
- Authors: Mohamed Ali Chebbi, Ewelina Rupnik, Paul Lopes, Marc Pierrot-Deseilligny
- Abstract summary: MV-DeepSimNets is a suite of deep neural networks designed for multi-view similarity learning. The approach incorporates an online geometry prior to characterize pixel relationships. The method's geometric preconditioning adapts epipolar-based features for enhanced multi-view reconstruction.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce MV-DeepSimNets, a comprehensive suite of deep neural networks designed for multi-view similarity learning, leveraging epipolar geometry for training. Our approach incorporates an online geometry prior to characterize pixel relationships, either along the epipolar line or through homography rectification. This enables the generation of geometry-aware features from native images, which are then projected across candidate depth hypotheses using plane sweeping. Our method's geometric preconditioning effectively adapts epipolar-based features for enhanced multi-view reconstruction, without requiring the laborious creation of a multi-view training dataset. By aggregating learned similarities, we construct and regularize the cost volume, leading to improved multi-view surface reconstruction over traditional dense matching approaches. MV-DeepSimNets demonstrates superior performance against leading similarity learning networks and end-to-end regression models, especially in terms of generalization capabilities across both aerial and satellite imagery with varied ground sampling distances. Our pipeline is integrated into MicMac software and can be readily adopted in standard multi-resolution image matching pipelines.
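The plane-sweeping and similarity-aggregation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the MicMac or MV-DeepSimNets implementation, and every function name in it is hypothetical. It assumes fronto-parallel sweep planes in the reference camera frame, the pose convention X_src = R X_ref + t, nearest-neighbour sampling, cosine similarity as a stand-in for the learned similarity, and a plain winner-takes-all depth choice in place of the cost-volume regularization the paper applies.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, depth, n=None):
    """Homography mapping reference pixels to source pixels for the plane
    n^T X = depth in the reference frame (assumes X_src = R @ X_ref + t)."""
    if n is None:
        n = np.array([0.0, 0.0, 1.0])  # fronto-parallel plane (assumption)
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def warp_features(feat_src, H, shape):
    """Sample source features (C, Hs, Ws) where H sends the reference grid.
    Nearest-neighbour sampling; returns warped features and a validity mask."""
    h, w = shape
    C, hs, ws = feat_src.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # 3 x N homogeneous
    q = H @ pts
    valid = q[2] > 1e-9  # keep points in front of the source camera
    u = np.full(h * w, -1.0)
    v = np.full(h * w, -1.0)
    u[valid] = q[0, valid] / q[2, valid]
    v[valid] = q[1, valid] / q[2, valid]
    ui = np.rint(u).astype(int)
    vi = np.rint(v).astype(int)
    valid &= (ui >= 0) & (ui < ws) & (vi >= 0) & (vi < hs)
    out = np.zeros((C, h * w))
    out[:, valid] = feat_src[:, vi[valid], ui[valid]]
    return out.reshape(C, h, w), valid.reshape(h, w)

def cosine_similarity(a, b, eps=1e-8):
    """Per-pixel cosine similarity between two (C, H, W) feature maps."""
    num = (a * b).sum(axis=0)
    den = np.linalg.norm(a, axis=0) * np.linalg.norm(b, axis=0) + eps
    return num / den

def plane_sweep_volume(feat_ref, feats_src, K_ref, Ks_src, poses, depths):
    """Similarity volume (D, H, W): mean similarity over the source views
    that see each reference pixel, for every depth hypothesis."""
    _, h, w = feat_ref.shape
    volume = np.zeros((len(depths), h, w))
    for i, d in enumerate(depths):
        sim_sum = np.zeros((h, w))
        count = np.zeros((h, w))
        for feat_src, K_src, (R, t) in zip(feats_src, Ks_src, poses):
            H = plane_homography(K_ref, K_src, R, t, d)
            warped, valid = warp_features(feat_src, H, (h, w))
            sim = cosine_similarity(feat_ref, warped)
            sim_sum += np.where(valid, sim, 0.0)
            count += valid
        volume[i] = sim_sum / np.maximum(count, 1)
    return volume

def winner_takes_all(volume, depths):
    """Pick the depth with the highest aggregated similarity per pixel
    (no regularization; a simplification of the paper's pipeline)."""
    return np.asarray(depths)[volume.argmax(axis=0)]
```

Calling `plane_sweep_volume` with one reference feature map and a list of source feature maps, intrinsics, and `(R, t)` poses yields a volume that `winner_takes_all` reduces to a depth map. In practice a regularizer would be applied to the volume before the depth is selected, since per-pixel maxima are noisy in textureless or occluded areas.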
Related papers
- Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation [62.87088388345378]
We introduce a diffusion-based framework that performs aligned novel view image and geometry generation via a warping-and-inpainting methodology.
The method leverages off-the-shelf geometry predictors to predict partial geometries viewed from reference images.
Cross-modal attention distillation is proposed to ensure accurate alignment between generated images and geometry.
arXiv Detail & Related papers (2025-06-13T16:19:00Z)
- Deep Learning Reforms Image Matching: A Survey and Outlook [38.104899835728574]
Image matching serves as a cornerstone in computer vision and underpins a wide range of applications.
Recent deep learning advances have significantly boosted both robustness and accuracy.
This survey adopts a unique perspective by comprehensively reviewing how deep learning has incrementally transformed the classical image matching pipeline.
arXiv Detail & Related papers (2025-06-05T04:25:22Z)
- Blending 3D Geometry and Machine Learning for Multi-View Stereopsis [3.259672998844162]
GC MVSNet plus plus is a novel approach to enforce multi-view, multi-scale supervised geometric consistency during learning.
This integrated GC check significantly accelerates the learning process by directly penalizing geometrically inconsistent pixels.
Our approach achieves a new state of the art on the DTU and BlendedMVS datasets and secures second place on the Tanks and Temples benchmark.
arXiv Detail & Related papers (2025-05-06T12:22:45Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- DeepSim-Nets: Deep Similarity Networks for Stereo Image Matching [0.0]
We present three multi-scale similarity learning architectures, or DeepSim networks.
These models learn pixel-level matching with a contrastive loss and are agnostic to the geometry of the considered scene.
We establish a middle ground between hybrid and end-to-end approaches by learning to densely allocate all corresponding pixels of an epipolar pair at once.
arXiv Detail & Related papers (2023-04-17T08:15:47Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolution neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- Geo-Neus: Geometry-Consistent Neural Implicit Surfaces Learning for Multi-view Reconstruction [41.43563122590449]
We propose geometry-consistent neural implicit surfaces learning for multi-view reconstruction.
Our proposed method achieves high-quality surface reconstruction in both complex thin structures and large smooth regions.
arXiv Detail & Related papers (2022-05-31T14:52:07Z)
- Weak Multi-View Supervision for Surface Mapping Estimation [0.9367260794056769]
We propose a weakly-supervised multi-view learning approach to learn category-specific surface mapping without dense annotations.
We learn the underlying surface geometry of common categories, such as human faces, cars, and airplanes, given instances from those categories.
arXiv Detail & Related papers (2021-05-04T09:46:26Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Continual Adaptation for Deep Stereo [52.181067640300014]
We propose a continual adaptation paradigm for deep stereo networks designed to deal with challenging and ever-changing environments.
In our paradigm, the learning signals needed to continuously adapt models online can be sourced from self-supervision via right-to-left image warping or from traditional stereo algorithms.
Our network architecture and adaptation algorithms realize the first real-time self-adaptive deep stereo system.
arXiv Detail & Related papers (2020-07-10T08:15:58Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.