CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs
- URL: http://arxiv.org/abs/2405.11564v1
- Date: Sun, 19 May 2024 14:29:06 GMT
- Title: CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs
- Authors: Zidong Cao, Lin Wang
- Abstract summary: Monocular 360 depth estimation is challenging due to the inherent distortion of the equirectangular projection (ERP) plane.
In this paper, we propose spherical fully-connected CRFs (SF-CRFs).
SF-CRFs enjoy two key components. Firstly, to involve sufficient spherical neighbors, we propose a Spherical Window Transform (SWT) module.
This module aims to replicate the equator window's spherical relationships to all other windows, leveraging the rotational invariance of the sphere.
Remarkably, the transformation process is highly efficient, completing the transformation of all windows in a 512×1024 ERP in 0.038 seconds on CPU.
- Score: 5.854176164327896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular 360 depth estimation is challenging due to the inherent distortion of the equirectangular projection (ERP). This distortion causes a problem: spherically adjacent points are separated after being projected to the ERP plane, particularly in the polar regions. To tackle this problem, recent methods calculate the spherical neighbors in the tangent domain. However, as the tangent patch and sphere only have one common point, these methods construct neighboring spherical relationships only around the common point. In this paper, we propose spherical fully-connected CRFs (SF-CRFs). We begin by evenly partitioning an ERP image with regular windows, where windows at the equator involve broader spherical neighbors than those at the poles. To improve the spherical relationships, our SF-CRFs enjoy two key components. Firstly, to involve sufficient spherical neighbors, we propose a Spherical Window Transform (SWT) module. This module aims to replicate the equator window's spherical relationships to all other windows, leveraging the rotational invariance of the sphere. Remarkably, the transformation process is highly efficient, completing the transformation of all windows in a 512×1024 ERP in 0.038 seconds on CPU. Secondly, we propose a Planar-Spherical Interaction (PSI) module to facilitate the relationships between regular and transformed windows, which not only preserves the local details but also captures global structures. By building a decoder based on the SF-CRF blocks, we propose CRF360D, a novel 360 depth estimation framework that achieves state-of-the-art performance across diverse datasets. Our CRF360D is compatible with different perspective image-trained backbones (e.g., EfficientNet), serving as the encoder.
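The ERP distortion and the rotation argument in the abstract can be illustrated with a small sketch. This is not the authors' SWT implementation: the function names, the pixel-centre convention, and the choice of a rotation about the y-axis are assumptions made here for illustration only. It shows that horizontally adjacent pixels lie much closer together on the sphere near a pole than at the equator, that top-row pixels half an image width apart are close neighbors on the sphere, and that a rotation of the sphere preserves angular distances between points.

```python
import math

def erp_to_xyz(u, v, width, height):
    """Map an ERP pixel centre (u, v) to a unit vector on the sphere.

    Assumed convention: longitude spans [-pi, pi) left to right,
    latitude spans [pi/2, -pi/2] top to bottom.
    """
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def angular_distance(p, q):
    """Great-circle angle between unit vectors (atan2 form, stable for tiny angles)."""
    cx = p[1] * q[2] - p[2] * q[1]
    cy = p[2] * q[0] - p[0] * q[2]
    cz = p[0] * q[1] - p[1] * q[0]
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = p[0] * q[0] + p[1] * q[1] + p[2] * q[2]
    return math.atan2(cross_norm, dot)

def rotate_about_y(p, angle):
    """Rotate a 3D point about the y-axis; this changes the latitude of equator points."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

W, H = 1024, 512  # ERP resolution quoted in the abstract (512 rows, 1024 columns)

# Horizontally adjacent pixels: equator row versus top (polar) row.
equator_gap = angular_distance(erp_to_xyz(0, H // 2, W, H), erp_to_xyz(1, H // 2, W, H))
polar_gap = angular_distance(erp_to_xyz(0, 0, W, H), erp_to_xyz(1, 0, W, H))

# Top-row pixels half an image width apart are close neighbors across the pole.
across_pole = angular_distance(erp_to_xyz(0, 0, W, H), erp_to_xyz(W // 2, 0, W, H))

# A rotation of the sphere preserves the angle between two points of an equator window.
a = erp_to_xyz(500, H // 2, W, H)
b = erp_to_xyz(508, H // 2 + 8, W, H)
original_gap = angular_distance(a, b)
tilt = math.radians(60.0)
rotated_gap = angular_distance(rotate_about_y(a, tilt), rotate_about_y(b, tilt))

print(f"equator neighbor gap: {equator_gap:.6f} rad")
print(f"polar neighbor gap:   {polar_gap:.6f} rad")
print(f"across-pole gap:      {across_pole:.6f} rad")
print(f"gap before/after rotation: {original_gap:.6f} / {rotated_gap:.6f} rad")
```

Under these assumptions, a regular window near a pole covers a far smaller patch of the sphere than an equally sized window at the equator. This is the imbalance the abstract describes when it says equator windows involve broader spherical neighbors.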
Related papers
- SGFormer: Spherical Geometry Transformer for 360 Depth Estimation [54.13459226728249]
Panoramic distortion poses a significant challenge in 360 depth estimation.
We propose a spherical geometry transformer, named SGFormer, to address the above issues.
We also present a query-based global conditional position embedding to compensate for spatial structure at varying resolutions.
arXiv Detail & Related papers (2024-04-23T12:36:24Z)
- Spherical Feature Pyramid Networks For Semantic Segmentation [0.0]
We develop graph-based models for representing the signal on a spherical mesh.
Our models achieve state-of-the-art performance with an mIoU of 48.75, an improvement of 3.75 IoU points over the previous best spherical CNN.
arXiv Detail & Related papers (2023-07-05T21:19:13Z)
- Sphere2Vec: A General-Purpose Location Representation Learning over a Spherical Surface for Large-Scale Geospatial Predictions [73.60788465154572]
Current 2D and 3D location encoders are designed to model point distances in Euclidean space.
We propose a multi-scale location encoder called Sphere2Vec which can preserve spherical distances when encoding point coordinates on a spherical surface.
arXiv Detail & Related papers (2023-06-30T12:55:02Z)
- NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation [42.062788492398674]
Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed.
We take the path of CRFs optimization and leverage the potential of fully-connected CRFs.
Our method significantly improves the performance across all metrics on both the KITTI and NYUv2 datasets.
arXiv Detail & Related papers (2022-03-03T03:27:20Z)
- Deep Weighted Consensus: Dense correspondence confidence maps for 3D shape registration [8.325327265120283]
We present a new paradigm for rigid alignment between point clouds based on learnable weighted consensus.
We claim that we can align point clouds out of sampled matched points according to confidence level derived from a dense, soft alignment map.
The pipeline is differentiable, and converges under large rotations in the full spectrum of SO(3), even with high noise levels.
arXiv Detail & Related papers (2021-05-06T14:27:59Z)
- Concentric Spherical GNN for 3D Representation Learning [53.45704095146161]
We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps.
Our hierarchical architecture is based on alternately learning to incorporate both intra-sphere and inter-sphere information.
We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
arXiv Detail & Related papers (2021-03-18T19:05:04Z)
- Spin-Weighted Spherical CNNs [58.013031812072356]
We present a new type of spherical CNN that allows anisotropic filters in an efficient way, without ever leaving the sphere domain.
The key idea is to consider spin-weighted spherical functions, which were introduced in physics in the study of gravitational waves.
Our method outperforms previous methods on tasks like classification of spherical images, classification of 3D shapes and semantic segmentation of spherical panoramas.
arXiv Detail & Related papers (2020-06-18T17:57:21Z)
- Region Adaptive Graph Fourier Transform for 3D Point Clouds [51.193111325231165]
We introduce the Region Adaptive Graph Fourier Transform (RA-GFT) for compression of 3D point cloud attributes.
The RA-GFT achieves better complexity-performance trade-offs than previous approaches.
arXiv Detail & Related papers (2020-03-04T02:47:44Z)
- Quaternion Equivariant Capsule Networks for 3D Point Clouds [58.566467950463306]
We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations.
We connect dynamic routing between capsules to the well-known Weiszfeld algorithm.
Based on our operator, we build a capsule network that disentangles geometry from pose.
arXiv Detail & Related papers (2019-12-27T13:51:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.