Detail-aware multi-view stereo network for depth estimation
- URL: http://arxiv.org/abs/2503.23684v1
- Date: Mon, 31 Mar 2025 03:23:39 GMT
- Title: Detail-aware multi-view stereo network for depth estimation
- Authors: Haitao Tian, Junyang Li, Chenxing Wang, Helong Jiang
- Abstract summary: We propose a detail-aware multi-view stereo network (DA-MVSNet) with a coarse-to-fine framework. The geometric depth clues hidden in the coarse stage are utilized to maintain the geometric structural relationships. Experiments on the DTU and Tanks & Temples datasets demonstrate that our method achieves competitive results.
- Score: 4.8203572077041335
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-view stereo methods have achieved great success for depth estimation based on coarse-to-fine depth learning frameworks; however, existing methods perform poorly in recovering the depth of object boundaries and detail regions. To address these issues, we propose a detail-aware multi-view stereo network (DA-MVSNet) with a coarse-to-fine framework. The geometric depth clues hidden in the coarse stage are utilized to maintain the geometric structural relationships between object surfaces and enhance the expressive capability of image features. In addition, an image synthesis loss is employed to constrain the gradient flow for detailed regions and further strengthen the supervision of object boundaries and texture-rich areas. Finally, we propose an adaptive depth interval adjustment strategy to improve the accuracy of object reconstruction. Extensive experiments on the DTU and Tanks & Temples datasets demonstrate that our method achieves competitive results. The code is available at https://github.com/wsmtht520-/DAMVSNet.
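The abstract does not say how the adaptive depth interval adjustment is computed. As a rough illustration of the general idea behind coarse-to-fine depth sampling, the sketch below narrows the depth search range for one pixel around the coarse-stage estimate and scales the range by the spread of the coarse probability distribution. The function name `refine_depth_hypotheses`, the `scale` parameter, and the use of the standard deviation as the uncertainty measure are all assumptions made for this example; they are not taken from DA-MVSNet.

```python
import math


def refine_depth_hypotheses(coarse_hyps, probs, num_fine=8, scale=1.5):
    """Return fine-stage depth candidates for a single pixel.

    coarse_hyps: depth candidates tested in the coarse stage
    probs: probability assigned to each candidate (assumed to sum to 1)
    num_fine, scale: hypothetical knobs chosen for this illustration
    """
    # Expected depth of the coarse stage (soft-argmin style regression).
    mean = sum(p * d for p, d in zip(probs, coarse_hyps))
    # Spread of the coarse distribution, used here as an uncertainty proxy.
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, coarse_hyps))
    half_range = scale * math.sqrt(var)
    # Evenly spaced candidates centred on the coarse estimate.
    step = 2 * half_range / (num_fine - 1)
    return [mean - half_range + i * step for i in range(num_fine)]


if __name__ == "__main__":
    hyps = [1.0, 2.0, 3.0]
    # An uncertain pixel gets a wide search range, a confident one a narrow range.
    print(refine_depth_hypotheses(hyps, [0.25, 0.5, 0.25], num_fine=5))
    print(refine_depth_hypotheses(hyps, [0.05, 0.9, 0.05], num_fine=5))
```

With this choice, pixels whose coarse prediction is sharply peaked are sampled at a finer depth interval than pixels with a flat distribution, which is one plausible way to spend a fixed number of hypotheses where they matter most.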
Related papers
- Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution [55.9977636042469]
We propose a novel framework, termed geometry-decoupled network (GDNet), for compressed depth map super-resolution. It decouples the high-quality depth map reconstruction process by handling global and detailed geometric features separately. Our solution significantly outperforms current methods in terms of geometric consistency and detail recovery.
arXiv Detail & Related papers (2024-11-05T16:37:30Z)
- Depth-guided Texture Diffusion for Image Semantic Segmentation [47.46257473475867]
We introduce a Depth-guided Texture Diffusion approach that effectively tackles the outlined challenge.
Our method extracts low-level features from edges and textures to create a texture image.
By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image.
arXiv Detail & Related papers (2024-08-17T04:55:03Z)
- Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios.
We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture.
Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
- V-FUSE: Volumetric Depth Map Fusion with Long-Range Constraints [6.7197802356130465]
We introduce a learning-based depth map fusion framework that accepts a set of depth and confidence maps generated by a Multi-View Stereo (MVS) algorithm as input and improves them.
We also introduce a depth search window estimation sub-network trained jointly with the larger fusion sub-network to reduce the depth hypothesis search space along each ray.
Our method learns to model depth consensus and violations of visibility constraints directly from the data.
arXiv Detail & Related papers (2023-08-17T00:39:56Z)
- Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells [23.345139129458122]
We show that different depth geometries have significant performance gaps, even using the same depth prediction error.
We introduce an ideal depth geometry composed of Saddle-Shaped Cells, whose predicted depth map oscillates upward and downward around the ground-truth surface.
Our method also points to a new research direction for considering depth geometry in MVS.
arXiv Detail & Related papers (2023-07-18T11:37:53Z)
- Visual Attention-based Self-supervised Absolute Depth Estimation using Geometric Priors in Autonomous Driving [8.045833295463094]
We introduce a fully Visual Attention-based Depth (VADepth) network, where spatial attention and channel attention are applied to all stages.
By continuously extracting the dependencies of features along the spatial and channel dimensions over a long distance, VADepth network can effectively preserve important details.
Experimental results on the KITTI dataset show that this architecture achieves state-of-the-art performance.
arXiv Detail & Related papers (2022-05-18T08:01:38Z)
- Struct-MDC: Mesh-Refined Unsupervised Depth Completion Leveraging Structural Regularities from Visual SLAM [1.8899300124593648]
Feature-based visual simultaneous localization and mapping (SLAM) methods only estimate the depth of extracted features.
Depth completion tasks that estimate a dense depth from a sparse depth have gained significant importance in robotic applications like exploration.
We propose a mesh depth refinement (MDR) module to address this problem.
The Struct-MDC outperforms other state-of-the-art algorithms on public and our custom datasets.
arXiv Detail & Related papers (2022-04-29T04:29:17Z)
- Improving Monocular Visual Odometry Using Learned Depth [84.05081552443693]
We propose a framework to exploit monocular depth estimation for improving visual odometry (VO).
The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.
Compared with current learning-based VO methods, our method demonstrates a stronger generalization ability to diverse scenes.
arXiv Detail & Related papers (2022-04-04T06:26:46Z)
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement [68.68537312256144]
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
arXiv Detail & Related papers (2021-12-01T00:52:42Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.