Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective
- URL: http://arxiv.org/abs/2408.16227v2
- Date: Fri, 30 Aug 2024 13:48:14 GMT
- Title: Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective
- Authors: Zhijie Shen, Chunyu Lin, Lang Nie, Kang Liao
- Abstract summary: We propose an oriented distortion-aware Gabor Fusion framework (PGFuse) to address the above challenges.
To address the reintroduced distortions, we design a linear latitude-aware distortion representation method to generate customized, distortion-aware Gabor filters.
Considering the orientation sensitivity of the Gabor transform, we introduce a spherical gradient constraint to stabilize this sensitivity.
- Score: 33.85582959047852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth estimation from a monocular 360 image is important to the perception of the entire 3D environment. However, the inherent distortion and large field of view (FoV) in 360 images pose great challenges for this task. To this end, existing mainstream solutions typically introduce additional perspective-based 360 representations (e.g., Cubemap) to achieve effective feature extraction. Nevertheless, regardless of the introduced representations, they eventually need to be unified into the equirectangular projection (ERP) format for the subsequent depth estimation, which inevitably reintroduces the troublesome distortions. In this work, we propose an oriented distortion-aware Gabor Fusion framework (PGFuse) to address the above challenges. First, we introduce Gabor filters that analyze texture in the frequency domain, thereby extending the receptive fields and enhancing depth cues. To address the reintroduced distortions, we design a linear latitude-aware distortion representation method to generate customized, distortion-aware Gabor filters (PanoGabor filters). Furthermore, we design a channel-wise and spatial-wise unidirectional fusion module (CS-UFM) that integrates the proposed PanoGabor filters to unify other representations into the ERP format, delivering effective and distortion-free features. Considering the orientation sensitivity of the Gabor transform, we introduce a spherical gradient constraint to stabilize this sensitivity. Experimental results on three popular indoor 360 benchmarks demonstrate the superiority of the proposed PGFuse over existing state-of-the-art solutions. Code will be made available upon acceptance.
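The abstract's core idea — Gabor filters whose parameters adapt to ERP latitude — can be illustrated with a minimal sketch. The function names and the simple 1/cos(latitude) wavelength stretch below are illustrative assumptions, not the paper's actual PanoGabor construction:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a 2D Gabor filter: a Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr) ** 2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

def latitude_aware_gabors(height, size=9, base_wavelength=4.0, sigma=2.0):
    """One kernel per ERP row: stretch the wavelength by 1/cos(latitude) to
    counter the horizontal stretching of equirectangular projection."""
    kernels = []
    for row in range(height):
        lat = (row + 0.5) / height * np.pi - np.pi / 2   # latitude in (-pi/2, pi/2)
        stretch = 1.0 / max(np.cos(lat), 1e-3)           # ERP stretch at this row
        kernels.append(gabor_kernel(size, base_wavelength * stretch, 0.0, sigma))
    return kernels
```

Convolving each ERP row with its own kernel is the simplest way to make the filter bank distortion-aware; the actual PGFuse design learns and fuses these responses rather than applying them directly.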
Related papers
- PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting [54.7468067660037]
PF3plat sets a new state-of-the-art across all benchmarks, supported by comprehensive ablation studies validating our design choices.
Our framework capitalizes on fast speed, scalability, and high-quality 3D reconstruction and view synthesis capabilities of 3DGS.
arXiv Detail & Related papers (2024-10-29T15:28:15Z) - GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves State-Of-The-Art performance on the Occ3D-nuScenes dataset with the lowest required image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z) - SGFormer: Spherical Geometry Transformer for 360 Depth Estimation [54.13459226728249]
Panoramic distortion poses a significant challenge in 360 depth estimation.
We propose a spherical geometry transformer, named SGFormer, to address the above issues.
We also present a query-based global conditional position embedding to compensate for spatial structure at varying resolutions.
arXiv Detail & Related papers (2024-04-23T12:36:24Z) - Distortion-aware Transformer in 360° Salient Object Detection [44.74647420381127]
We propose a Transformer-based model called DATFormer to address the distortion problem.
To exploit the unique characteristics of 360° data, we present a learnable relation matrix.
Our model outperforms existing 2D SOD (salient object detection) and 360 SOD methods.
arXiv Detail & Related papers (2023-08-07T07:28:24Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z) - OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion [12.058261716065381]
We propose a 360 monocular depth estimation pipeline, OmniFusion, to tackle the spherical distortion issue.
Our pipeline transforms a 360 image into less-distorted perspective patches (i.e., tangent images) to obtain patch-wise predictions via a CNN, and then merges the patch-wise results into the final output.
Experiments show that our method greatly mitigates the distortion issue, and achieves state-of-the-art performances on several 360 monocular depth estimation benchmark datasets.
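The tangent-image idea above can be sketched with the standard gnomonic (rectilinear) projection, which maps spherical coordinates onto a plane tangent to the sphere. This is a generic sketch of the projection itself, not OmniFusion's implementation:

```python
import numpy as np

def gnomonic_project(lat, lon, lat0, lon0):
    """Project a point (lat, lon) on the unit sphere onto the plane tangent
    at (lat0, lon0). Angles in radians; valid on the visible hemisphere."""
    cos_c = (np.sin(lat0) * np.sin(lat)
             + np.cos(lat0) * np.cos(lat) * np.cos(lon - lon0))
    x = np.cos(lat) * np.sin(lon - lon0) / cos_c
    y = (np.cos(lat0) * np.sin(lat)
         - np.sin(lat0) * np.cos(lat) * np.cos(lon - lon0)) / cos_c
    return x, y
```

Sampling an ERP image at the spherical coordinates corresponding to a regular grid on each tangent plane yields the low-distortion patches; the inverse mapping carries patch-wise predictions back to ERP for merging.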
arXiv Detail & Related papers (2022-03-02T03:19:49Z) - Distortion-Aware Loop Filtering of Intra 360° Video Coding with Equirectangular Projection [81.63407194858854]
We propose a distortion-aware loop filtering model to improve the performance of intra coding for 360° videos projected via the equirectangular projection (ERP) format.
Our proposed module analyzes content characteristics based on a coding unit (CU) partition mask and processes them through partial convolution to activate the specified area.
arXiv Detail & Related papers (2022-02-20T12:00:18Z) - SphereSR: 360° Image Super-Resolution with Arbitrary Projection via Continuous Spherical Image Representation [27.10716804733828]
We propose a novel framework to generate a continuous spherical image representation from a low-resolution (LR) 360° image.
Specifically, we first propose a feature extraction module that represents the spherical data based on an icosahedron.
We then propose a spherical local implicit image function (SLIIF) to predict RGB values at the spherical coordinates.
arXiv Detail & Related papers (2021-12-13T10:16:51Z) - Distortion-aware Monocular Depth Estimation for Omnidirectional Images [26.027353545874522]
We propose a Distortion-Aware Monocular Omnidirectional (DAMO) dense depth estimation network to address this challenge on indoor panoramas.
First, we introduce a distortion-aware module to extract calibrated semantic features from omnidirectional images.
Second, we introduce a plug-and-play spherical-aware weight matrix for our objective function to handle the uneven distribution of areas projected from a sphere.
arXiv Detail & Related papers (2020-10-18T08:47:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.