360MonoDepth: High-Resolution 360° Monocular Depth Estimation
- URL: http://arxiv.org/abs/2111.15669v1
- Date: Tue, 30 Nov 2021 18:57:29 GMT
- Title: 360MonoDepth: High-Resolution 360° Monocular Depth Estimation
- Authors: Manuel Rey-Area and Mingze Yuan and Christian Richardt
- Abstract summary: Monocular depth estimation remains a challenge for 360° data.
Current CNN-based methods do not support such high resolutions due to limited GPU memory.
We propose a flexible framework for monocular depth estimation from high-resolution 360° images using tangent images.
- Score: 15.65828728205071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 360° cameras can capture complete environments in a single shot, which
makes 360° imagery alluring in many computer vision tasks. However,
monocular depth estimation remains a challenge for 360° data, particularly
at high resolutions like 2K (2048×1024) that are important for
novel-view synthesis and virtual-reality applications. Current CNN-based
methods do not support such high resolutions due to limited GPU memory. In this
work, we propose a flexible framework for monocular depth estimation from
high-resolution 360° images using tangent images. We project the 360°
input image onto a set of tangent planes to produce perspective views, which
are suitable for the latest, most accurate state-of-the-art perspective
monocular depth estimators. We recombine the individual depth estimates using
deformable multi-scale alignment followed by gradient-domain blending to
improve the consistency of disparity estimates. The result is a dense,
high-resolution 360° depth map with a high level of detail, including for
outdoor scenes, which are not supported by existing methods.
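The tangent-image step the abstract describes is the classical gnomonic projection: each perspective view samples the sphere around a chosen tangent point. Below is a minimal NumPy sketch of that resampling under our own assumptions (the function name, default parameters, and nearest-neighbour lookup are illustrative simplifications, not the authors' implementation):

```python
import numpy as np

def tangent_view(erp, lon0, lat0, fov_deg=80.0, size=256):
    """Resample one perspective (tangent-plane) view from an
    equirectangular panorama via the inverse gnomonic projection.
    erp: (H, W, C) array; lon0, lat0: tangent point in radians."""
    H, W = erp.shape[:2]
    f = 0.5 * size / np.tan(0.5 * np.radians(fov_deg))  # focal length in pixels
    px = np.arange(size) - size / 2 + 0.5
    uu, vv = np.meshgrid(px / f, -px / f)               # vv points up (row 0 = top)
    rho = np.sqrt(uu ** 2 + vv ** 2)
    c = np.arctan(rho)                                  # angle from tangent point
    sin_c, cos_c = np.sin(c), np.cos(c)
    # inverse gnomonic projection: tangent-plane coords -> latitude/longitude
    lat = np.arcsin(cos_c * np.sin(lat0)
                    + vv * sin_c * np.cos(lat0) / np.maximum(rho, 1e-12))
    lon = lon0 + np.arctan2(uu * sin_c,
                            rho * np.cos(lat0) * cos_c - vv * np.sin(lat0) * sin_c)
    # nearest-neighbour lookup into the equirectangular grid
    x = ((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    y = np.clip(((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return erp[y, x]
```

In the full pipeline, such views would be generated for a set of tangent points covering the sphere, each view fed to a perspective monocular depth estimator, and the resulting per-view disparities aligned and blended back into a single equirectangular depth map.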
Related papers
- AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting [15.177483700681377]
Three-dimensional scene inpainting is crucial for applications from virtual reality to architectural visualization.
We present AuraFusion360, a novel reference-based method that enables high-quality object removal and hole filling in 3D scenes represented by Gaussian Splatting.
We also introduce 360-USID, the first comprehensive dataset for 360° unbounded scene inpainting with ground truth.
arXiv Detail & Related papers (2025-02-07T18:59:55Z)
- Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images [52.48351378615057]
Splatter-360 is a novel end-to-end generalizable 3DGS framework to handle wide-baseline panoramic images.
We introduce a 3D-aware bi-projection encoder to mitigate the distortions inherent in panoramic images.
This enables robust 3D-aware feature representations and real-time rendering capabilities.
arXiv Detail & Related papers (2024-12-09T06:58:31Z)
- Align3R: Aligned Monocular Depth Estimation for Dynamic Videos [50.28715151619659]
We propose a novel video-depth estimation method called Align3R that estimates temporally consistent depth maps for dynamic videos.
Our key idea is to utilize the recent DUSt3R model to align estimated monocular depth maps of different timesteps.
Experiments demonstrate that Align3R estimates consistent video depth and camera poses for a monocular video, outperforming baseline methods.
arXiv Detail & Related papers (2024-12-04T07:09:59Z)
- Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation [83.841877607646]
We introduce Helvipad, a real-world dataset for omnidirectional stereo depth estimation.
The dataset includes accurate depth and disparity labels by projecting 3D point clouds onto equirectangular images.
We benchmark leading stereo depth estimation models for both standard and omnidirectional images.
arXiv Detail & Related papers (2024-11-27T13:34:41Z)
- Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation [6.832852988957967]
We propose a new depth estimation framework that utilizes unlabeled 360-degree data effectively.
Our approach uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels.
We tested our approach on benchmark datasets such as Matterport3D and Stanford2D3D, showing significant improvements in depth estimation accuracy.
arXiv Detail & Related papers (2024-06-18T17:59:31Z)
- Sp2360: Sparse-view 360 Scene Reconstruction using Cascaded 2D Diffusion Priors [51.36238367193988]
We tackle sparse-view reconstruction of a 360° 3D scene using priors from latent diffusion models (LDMs).
We present SparseSplat360, a method that employs a cascade of in-painting and artifact removal models to fill in missing details and clean novel views.
Our method generates entire 360 scenes from as few as 9 input views, with a high degree of foreground and background detail.
arXiv Detail & Related papers (2024-05-26T11:01:39Z)
- High-Resolution Depth Estimation for 360-degree Panoramas through Perspective and Panoramic Depth Images Registration [3.4583104874165804]
We propose a novel approach to compute high-resolution (2048x1024 and higher) depths for panoramas.
Our method generates qualitatively better results than existing panorama-based methods, and further outperforms them quantitatively on datasets unseen by these methods.
arXiv Detail & Related papers (2022-10-19T09:25:12Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- 360° Depth Estimation from Multiple Fisheye Images with Origami Crown Representation of Icosahedron [5.384800591054856]
We propose a new icosahedron-based representation and ConvNets for omnidirectional images.
CrownConv can be applied to both fisheye images and equirectangular images to extract features.
As our proposed method is computationally efficient, the depth is estimated from four fisheye images in less than a second using a laptop with a GPU.
arXiv Detail & Related papers (2020-07-14T08:02:53Z)
- A Fixation-based 360° Benchmark Dataset for Salient Object Detection [21.314578493964333]
Fixation prediction (FP) in panoramic contents has been widely investigated along with the booming trend of virtual reality (VR) applications.
Salient object detection (SOD) has been seldom explored in 360° images due to the lack of datasets representative of real scenes.
arXiv Detail & Related papers (2020-01-22T11:16:39Z)
- Visual Question Answering on 360° Images [96.00046925811515]
VQA 360 is a novel task of visual question answering on 360 images.
We collect the first VQA 360 dataset, containing around 17,000 real-world image-question-answer triplets for a variety of question types.
arXiv Detail & Related papers (2020-01-10T08:18:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.