Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images
- URL: http://arxiv.org/abs/2411.01749v1
- Date: Mon, 04 Nov 2024 02:20:22 GMT
- Title: Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images
- Authors: Kun Huang, Fang-Lue Zhang, Fangfang Zhang, Yu-Kun Lai, Paul Rosin, Neil A. Dodgson
- Abstract summary: We introduce a novel multi-task learning (MTL) network that simultaneously estimates depth and surface normals from 360° images.
Experimental results demonstrate that our MTL architecture significantly outperforms state-of-the-art methods in both depth and surface normal estimation.
Our model's effectiveness and generalizability, particularly in handling intricate surface textures, establish it as a new benchmark in 360° image geometric estimation.
- Abstract: Geometric estimation is required for scene understanding and analysis in panoramic 360° images. Current methods usually predict a single feature, such as depth or surface normal. These methods can lack robustness, especially when dealing with intricate textures or complex object surfaces. We introduce a novel multi-task learning (MTL) network that simultaneously estimates depth and surface normals from 360° images. Our first innovation is our MTL architecture, which enhances predictions for both tasks by integrating geometric information from depth and surface normal estimation, enabling a deeper understanding of 3D scene structure. Another innovation is our fusion module, which bridges the two tasks, allowing the network to learn shared representations that improve accuracy and robustness. Experimental results demonstrate that our MTL architecture significantly outperforms state-of-the-art methods in both depth and surface normal estimation, showing superior performance in complex and diverse scenes. Our model's effectiveness and generalizability, particularly in handling intricate surface textures, establish it as a new benchmark in 360° image geometric estimation. The code and model are available at https://github.com/huangkun101230/360MTLGeometricEstimation.
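The released code implements the full model; what follows is only a minimal, hypothetical PyTorch sketch of the multi-task pattern the abstract describes: a shared encoder feeds depth and normal branches whose features are exchanged through a fusion module before the task heads. The class names, channel sizes, and fusion design here are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch of a multi-task depth + surface-normal network with a
# cross-task fusion module. NOT the authors' released architecture; names,
# channel sizes, and the fusion design are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionModule(nn.Module):
    """Bridges the two tasks: mixes depth and normal features so each
    branch can exploit geometric cues learned by the other."""
    def __init__(self, channels: int):
        super().__init__()
        self.mix = nn.Conv2d(2 * channels, 2 * channels, kernel_size=1)

    def forward(self, f_depth, f_normal):
        fused = self.mix(torch.cat([f_depth, f_normal], dim=1))
        fd, fn = fused.chunk(2, dim=1)
        # Residual exchange: each branch keeps its own features plus a
        # fused correction term from the other task.
        return f_depth + fd, f_normal + fn

class MTLGeometryNet(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(                  # shared representation
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.depth_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.normal_branch = nn.Conv2d(channels, channels, 3, padding=1)
        self.fusion = FusionModule(channels)
        self.depth_head = nn.Conv2d(channels, 1, 1)    # per-pixel depth
        self.normal_head = nn.Conv2d(channels, 3, 1)   # per-pixel unit normal

    def forward(self, img):
        shared = self.encoder(img)
        fd = self.depth_branch(shared)
        fn = self.normal_branch(shared)
        fd, fn = self.fusion(fd, fn)                   # cross-task exchange
        depth = self.depth_head(fd)
        normal = F.normalize(self.normal_head(fn), dim=1)
        return depth, normal

# Usage on an equirectangular 360° image (batch, 3, H, 2H):
# depth, normal = MTLGeometryNet()(torch.randn(1, 3, 256, 512))
```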
Related papers
- SuperPrimitive: Scene Reconstruction at a Primitive Level [23.934492494774116]
Joint camera pose and dense geometry estimation from a set of images or a monocular video remains a challenging problem.
Most dense incremental reconstruction systems operate directly on image pixels and solve for their 3D positions using multi-view geometry cues.
We address this issue with a new image representation which we call a SuperPrimitive.
arXiv Detail & Related papers (2023-12-10T13:44:03Z)
- Surface Geometry Processing: An Efficient Normal-based Detail Representation [66.69000350849328]
We introduce an efficient surface detail processing framework in 2D normal domain.
We show that the proposed normal-based representation has three important properties, including detail separability, detail transferability and detail idempotence.
Three new schemes are further designed for geometric surface detail processing applications, including geometric texture synthesis, geometry detail transfer, and 3D surface super-resolution.
arXiv Detail & Related papers (2023-07-16T04:46:32Z)
- Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking [21.74515335906769]
We develop a multi-modality tracker that fuses information from visual appearance and geometry to estimate object poses.
The algorithm extends our previous method ICG, which uses geometry, to additionally consider surface appearance.
arXiv Detail & Related papers (2023-02-22T15:53:00Z)
- Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo [103.08512487830669]
We present a modern solution to the multi-view photometric stereo problem (MVPS).
We procure the surface orientation using a photometric stereo (PS) image formation model and blend it with a multi-view neural radiance field representation to recover the object's surface geometry.
Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.
arXiv Detail & Related papers (2021-10-11T20:20:03Z)
- Self-supervised Depth Estimation Leveraging Global Perception and Geometric Smoothness Using On-board Videos [0.5276232626689566]
We present DLNet for pixel-wise depth estimation, which simultaneously extracts global and local features.
A three-dimensional geometry smoothness loss is proposed to predict a geometrically natural depth map.
In experiments on the KITTI and Make3D benchmarks, the proposed DLNet achieves performance competitive with state-of-the-art methods.
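DLNet's exact loss is not spelled out in this summary, so the sketch below is only a generic stand-in for a three-dimensional geometry smoothness term: back-project depth through (assumed) pinhole intrinsics `K`, estimate per-pixel normals from neighbouring 3D points, and penalise disagreement between adjacent normals. The function name and intrinsics handling are hypothetical.

```python
# Generic sketch of a 3D geometric smoothness loss (not the paper's exact
# formulation): back-project depth to 3D, build normals from neighbouring
# points, and encourage adjacent normals to agree.
import torch
import torch.nn.functional as F

def geometric_smoothness_loss(depth, K):
    """depth: (B, 1, H, W); K: (3, 3) pinhole intrinsics (assumed)."""
    B, _, H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).float()   # (3, H, W)
    rays = torch.linalg.inv(K) @ pix.reshape(3, -1)                # (3, H*W)
    pts = (depth.reshape(B, 1, -1) * rays).reshape(B, 3, H, W)     # 3D points
    # Tangent vectors along image axes -> surface normal via cross product.
    dx = pts[:, :, :, 1:] - pts[:, :, :, :-1]                      # (B,3,H,W-1)
    dy = pts[:, :, 1:, :] - pts[:, :, :-1, :]                      # (B,3,H-1,W)
    n = torch.cross(dx[:, :, :-1, :], dy[:, :, :, :-1], dim=1)
    n = F.normalize(n, dim=1)                                      # (B,3,H-1,W-1)
    # Smoothness: neighbouring normals should agree (cosine similarity).
    loss = (1 - (n[..., :, 1:] * n[..., :, :-1]).sum(1)).mean() \
         + (1 - (n[..., 1:, :] * n[..., :-1, :]).sum(1)).mean()
    return loss
```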
arXiv Detail & Related papers (2021-06-07T10:53:27Z)
- GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation [204.13451624763735]
We propose a geometric neural network with edge-aware refinement (GeoNet++) to jointly predict both depth and surface normal maps from a single image.
GeoNet++ effectively predicts depth and surface normals with strong 3D consistency and sharp boundaries.
In contrast to current metrics that focus on evaluating pixel-wise error/accuracy, the proposed 3D geometric metric (3DGM) measures whether the predicted depth can reconstruct high-quality 3D surface normals.
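As a hedged illustration of that idea, the snippet below recovers normals from a predicted depth map (any depth-to-normal routine works, e.g. the cross-product construction sketched earlier) and scores their angular agreement with ground-truth normals; the metric actually defined in GeoNet++ may differ in its details.

```python
# Illustrative depth-quality check in the spirit of 3DGM: evaluate predicted
# depth by the normals it implies, not by per-pixel depth error.
import torch

def mean_angular_error(pred_depth, gt_normals, depth_to_normals, K):
    """pred_depth: (B,1,H,W); gt_normals: (B,3,H,W), unit length (assumed)."""
    n_pred = depth_to_normals(pred_depth, K)             # (B,3,H',W'), unit length
    n_gt = gt_normals[..., : n_pred.shape[-2], : n_pred.shape[-1]]
    cos = (n_pred * n_gt).sum(dim=1).clamp(-1.0, 1.0)    # per-pixel cosine
    return torch.rad2deg(torch.acos(cos)).mean()         # mean error in degrees
```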
arXiv Detail & Related papers (2020-12-13T06:48:01Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
Inspired by recent Mixture-of-Experts models, we design a compact hybrid network to reduce the computational cost.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images [64.53227129573293]
We investigate the problem of learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views.
We design neural networks capable of generating high-quality parametric 3D surfaces which are consistent between views.
Our method is supervised and trained on a public dataset of shapes from common object categories.
arXiv Detail & Related papers (2020-08-18T06:33:40Z)