RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization
- URL: http://arxiv.org/abs/2509.23991v1
- Date: Sun, 28 Sep 2025 17:33:12 GMT
- Title: RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization
- Authors: Dongki Jung, Jaehoon Choi, Yonghan Lee, Dinesh Manocha,
- Abstract summary: RPG360 is a training-free robust 360 monocular depth estimation method.<n>We introduce a novel depth scale alignment technique using graph-based optimization.<n>Our method achieves superior performance across diverse datasets, including Matterport3D, Stanford2D3D, and 360Loc.
- Score: 48.99932182976206
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing use of 360 images across various domains has emphasized the need for robust depth estimation techniques tailored for omnidirectional images. However, obtaining large-scale labeled datasets for 360 depth estimation remains a significant challenge. In this paper, we propose RPG360, a training-free robust 360 monocular depth estimation method that leverages perspective foundation models and graph optimization. Our approach converts 360 images into six-face cubemap representations, where a perspective foundation model is employed to estimate depth and surface normals. To address depth scale inconsistencies across different faces of the cubemap, we introduce a novel depth scale alignment technique using graph-based optimization, which parameterizes the predicted depth and normal maps while incorporating an additional per-face scale parameter. This optimization ensures depth scale consistency across the six-face cubemap while preserving 3D structural integrity. Furthermore, as foundation models exhibit inherent robustness in zero-shot settings, our method achieves superior performance across diverse datasets, including Matterport3D, Stanford2D3D, and 360Loc. We also demonstrate the versatility of our depth estimation approach by validating its benefits in downstream tasks such as feature matching 3.2 ~ 5.4% and Structure from Motion 0.2 ~ 9.7% in AUC@5.
Related papers
- Depth Anything in $360^\circ$: Towards Scale Invariance in the Wild [12.6239596554452]
We present Depth Anything in $360circ$ (DA360), a panoramic-adapted version of Depth Anything V2.<n>Our key innovation involves learning a shift parameter from the ViT backbone, transforming the model's scale- and shift-invariant output into a scale-invariant estimate.
arXiv Detail & Related papers (2025-12-28T07:12:58Z) - An End-to-End Room Geometry Constrained Depth Estimation Framework for Indoor Panorama Images [50.84536164535991]
Existing methods focus on pixel-level accuracy, causing oversmoothed room corners and noise sensitivity.<n>We propose a depth estimation framework based on room geometry constraints.<n>Our framework incorporates two strategies: a room geometry-based background depth resolving strategy and a background-segmentation-guided fusion mechanism.
arXiv Detail & Related papers (2025-10-09T05:52:48Z) - Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation [6.832852988957967]
We propose a new depth estimation framework that utilizes unlabeled 360-degree data effectively.
Our approach uses state-of-the-art perspective depth estimation models as teacher models to generate pseudo labels.
We tested our approach on benchmark datasets such as Matterport3D and Stanford2D3D, showing significant improvements in depth estimation accuracy.
arXiv Detail & Related papers (2024-06-18T17:59:31Z) - GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.<n>Our approach achieves State-Of-The-Art performance on the Occ3D-nuScenes dataset with the least image resolution needed and the most weightless image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular
Video Depth [90.33296913575818]
In some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame prediction may cause the depth inconsistency.
We propose a locally weighted linear regression method to recover the scale and shift with very sparse anchor points.
Our method can boost the performance of existing state-of-the-art approaches by 50% at most over several zero-shot benchmarks.
arXiv Detail & Related papers (2022-02-03T08:52:54Z) - Dense Depth Estimation from Multiple 360-degree Images Using Virtual
Depth [4.984601297028257]
The proposed pipeline leverages a spherical camera model that compensates for radial distortion in 360degree: images.
We propose an effective dense depth estimation method by setting virtual depth and minimizing photonic reprojection error.
The experimental results verify that the proposed pipeline improves estimation accuracy compared to the current state-of-art dense depth estimation methods.
arXiv Detail & Related papers (2021-12-30T05:27:28Z) - 360MonoDepth: High-Resolution 360{\deg} Monocular Depth Estimation [15.65828728205071]
monocular depth estimation remains a challenge for 360deg data.
Current CNN-based methods do not support such high resolutions due to limited GPU memory.
We propose a flexible framework for monocular depth estimation from high-resolution 360deg images using tangent images.
arXiv Detail & Related papers (2021-11-30T18:57:29Z) - LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth
Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.