DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo
- URL: http://arxiv.org/abs/2506.13215v1
- Date: Mon, 16 Jun 2025 08:15:22 GMT
- Title: DVP-MVS++: Synergize Depth-Normal-Edge and Harmonized Visibility Prior for Multi-View Stereo
- Authors: Zhenlong Yuan, Dapeng Zhang, Zehao Li, Chengxuan Qian, Jianing Chen, Yinda Chen, Kehua Chen, Tianlu Mao, Zhaoxin Li, Hao Jiang, Zhaoqi Wang,
- Abstract summary: We propose DVP-MVS++, an innovative approach that synergizes both depth-normal-edge aligned and harmonized cross-view priors for robust and visibility-aware patch deformation.<n> Evaluation results on ETH3D, Tanks & Temples and Strecha datasets exhibit the state-of-the-art performance and robust generalization capability of our proposed method.
- Score: 7.544716770845737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, patch deformation-based methods have demonstrated significant effectiveness in multi-view stereo due to their incorporation of deformable and expandable perception for reconstructing textureless areas. However, these methods generally focus on identifying reliable pixel correlations to mitigate matching ambiguity of patch deformation, while neglecting the deformation instability caused by edge-skipping and visibility occlusions, which may cause potential estimation deviations. To address these issues, we propose DVP-MVS++, an innovative approach that synergizes both depth-normal-edge aligned and harmonized cross-view priors for robust and visibility-aware patch deformation. Specifically, to avoid edge-skipping, we first apply DepthPro, Metric3Dv2 and Roberts operator to generate coarse depth maps, normal maps and edge maps, respectively. These maps are then aligned via an erosion-dilation strategy to produce fine-grained homogeneous boundaries for facilitating robust patch deformation. Moreover, we reformulate view selection weights as visibility maps, and then implement both an enhanced cross-view depth reprojection and an area-maximization strategy to help reliably restore visible areas and effectively balance deformed patch, thus acquiring harmonized cross-view priors for visibility-aware patch deformation. Additionally, we obtain geometry consistency by adopting both aggregated normals via view selection and projection depth differences via epipolar lines, and then employ SHIQ for highlight correction to enable geometry consistency with highlight-aware perception, thus improving reconstruction quality during propagation and refinement stage. Evaluation results on ETH3D, Tanks & Temples and Strecha datasets exhibit the state-of-the-art performance and robust generalization capability of our proposed method.
Related papers
- SED-MVS: Segmentation-Driven and Edge-Aligned Deformation Multi-View Stereo with Depth Restoration and Occlusion Constraint [11.165686149180054]
We propose SED-MVS, which adopts panoptic segmentation and multi-trajectory diffusion strategy for segmentation-driven and edge-aligned patch deformation.<n>Specifically, to prevent unanticipated edge-skipping, we first employ SAM2 for panoptic segmentation as depth-edge guidance to guide patch deformation, followed by multi-trajectory diffusion strategy to ensure patches are comprehensively aligned with depth edges.
arXiv Detail & Related papers (2025-03-17T21:07:44Z) - DVP-MVS: Synergize Depth-Edge and Visibility Prior for Multi-View Stereo [8.303396507129266]
We propose DVP-MVS, which synergizes depth-edge aligned and cross-view prior for robust and visibility-aware patch deformation.<n>Our method can achieve state-of-the-art performance with excellent robustness and generalization.
arXiv Detail & Related papers (2024-12-16T09:09:10Z) - MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo [8.303396507129266]
MSP-MVS is a method introducing multi-granularity segmentation prior to edge-confined patch deformation.<n>We implement equidistribution and disassemble-clustering of correlative reliable pixels.<n>We also introduce disparity-sampling synergistic 3D optimization to help identify global-minimum matching costs.
arXiv Detail & Related papers (2024-07-27T19:00:44Z) - GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.<n>Our approach achieves State-Of-The-Art performance on the Occ3D-nuScenes dataset with the least image resolution needed and the most weightless image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z) - SD-MVS: Segmentation-Driven Deformation Multi-View Stereo with Spherical
Refinement and EM optimization [6.886220026399106]
We introduce Multi-View Stereo (SD-MVS) to tackle challenges in 3D reconstruction of textureless areas.
We are the first to adopt the Segment Anything Model (SAM) to distinguish semantic instances in scenes.
We propose a unique refinement strategy that combines spherical coordinates and gradient descent on normals and pixelwise search interval on depths.
arXiv Detail & Related papers (2024-01-12T05:25:57Z) - 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z) - View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z) - Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain.
GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors.
We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z) - Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z) - On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z) - PatchMVSNet: Patch-wise Unsupervised Multi-View Stereo for
Weakly-Textured Surface Reconstruction [2.9896482273918434]
This paper proposes robust loss functions leveraging constraints beneath multi-view images to alleviate matching ambiguity.
Our strategy can be implemented with arbitrary depth estimation frameworks and can be trained with arbitrary large-scale MVS datasets.
Our method reaches the performance of the state-of-the-art methods on popular benchmarks, like DTU, Tanks and Temples and ETH3D.
arXiv Detail & Related papers (2022-03-04T07:05:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.