OSLO: On-the-Sphere Learning for Omnidirectional images and its
application to 360-degree image compression
- URL: http://arxiv.org/abs/2107.09179v1
- Date: Mon, 19 Jul 2021 22:14:30 GMT
- Title: OSLO: On-the-Sphere Learning for Omnidirectional images and its
application to 360-degree image compression
- Authors: Navid Mahmoudian Bidgoli, Roberto G. de A. Azevedo, Thomas Maugey,
Aline Roumy, Pascal Frossard
- Abstract summary: We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images.
Our proposed on-the-sphere solution leads to a better compression gain that can save 13.7% of the bit rate compared to similar learned models applied to equirectangular images.
- Score: 59.58879331876508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art 2D image compression schemes rely on the power of
convolutional neural networks (CNNs). Although CNNs offer promising
perspectives for 2D image compression, extending such models to omnidirectional
images is not straightforward. First, omnidirectional images have specific
spatial and statistical properties that cannot be fully captured by current
CNN models. Second, basic mathematical operations composing a CNN architecture,
e.g., translation and sampling, are not well-defined on the sphere. In this
paper, we study the learning of representation models for omnidirectional
images and propose to use the properties of HEALPix uniform sampling of the
sphere to redefine the mathematical tools used in deep learning models for
omnidirectional images. In particular, we: i) propose the definition of a new
convolution operation on the sphere that keeps the high expressiveness and the
low complexity of a classical 2D convolution; ii) adapt standard CNN techniques
such as stride, iterative aggregation, and pixel shuffling to the spherical
domain; and then iii) apply our new framework to the task of omnidirectional
image compression. Our experiments show that our proposed on-the-sphere
solution leads to a better compression gain that can save 13.7% of the bit rate
compared to similar learned models applied to equirectangular images. Also,
compared to learning models based on graph convolutional networks, our solution
supports more expressive filters that can preserve high frequencies and provide
a better perceptual quality of the compressed images. Such results demonstrate
the efficiency of the proposed framework, which opens new research avenues for
other omnidirectional vision tasks to be effectively implemented on the sphere
manifold.
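The abstract's point about adapting stride and pixel shuffling to the sphere can be illustrated with a small sketch. It is not the authors' implementation. It relies only on two standard HEALPix properties: a grid of resolution `nside` has `12 * nside**2` equal-area pixels, and in NESTED ordering the four children of pixel `p` are pixels `4p` to `4p + 3` at the next finer resolution. The function names are hypothetical, and plain averaging stands in for a learned strided convolution.

```python
import numpy as np


def healpix_npix(nside: int) -> int:
    """Number of equal-area pixels in a HEALPix grid of resolution nside."""
    return 12 * nside * nside


def nested_avg_pool(x: np.ndarray) -> np.ndarray:
    """Stride-like downsampling on a NESTED-ordered HEALPix map.

    x has shape (npix, channels). The 4 children of each parent pixel are
    consecutive in NESTED order, so they can be grouped with a reshape and
    averaged. Returns an array of shape (npix // 4, channels).
    """
    npix, channels = x.shape
    return x.reshape(npix // 4, 4, channels).mean(axis=1)


def nested_pixel_shuffle(x: np.ndarray) -> np.ndarray:
    """Pixel-shuffle-like upsampling on a NESTED-ordered HEALPix map.

    x has shape (npix, 4 * channels). Each block of `channels` features is
    moved to one of the 4 child pixels of its parent.
    Returns an array of shape (4 * npix, channels).
    """
    npix, channels_times_4 = x.shape
    return x.reshape(npix * 4, channels_times_4 // 4)
```

For example, a 3-channel map at `nside = 4` has `healpix_npix(4)` pixels, `nested_avg_pool` reduces it to the pixel count of `nside = 2`, and `nested_pixel_shuffle` applied to a 12-channel map at `nside = 2` returns a 3-channel map at `nside = 4`. Both operations are plain reshapes because the sampling is hierarchical, which is one reason a HEALPix grid is convenient for CNN-style layers.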
Related papers
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space to improve image classification performance.
We introduce a newly designed framework that simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network equipped with unsupervised learning of geometric shape representations.
Date: 2022-10-25T01:55:17Z
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolutional neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used.
Date: 2022-10-19T17:56:03Z
- Interpolated SelectionConv for Spherical Images and Surfaces [0.0]
We present a new and general framework for convolutional neural network operations on spherical images.
Our approach represents the surface as a graph of connected points that doesn't rely on a particular sampling strategy.
Date: 2022-10-18T19:49:07Z
- Spherical Image Inpainting with Frame Transformation and Data-driven Prior Deep Networks [13.406134708071345]
In this work, we focus on the challenging task of spherical image inpainting with a deep learning-based regularizer.
We employ a fast directional spherical Haar framelet transform and develop a novel optimization framework based on a sparsity assumption of the framelet transform.
We show that the proposed algorithms can effectively recover damaged spherical images and outperform approaches that rely purely on a deep learning denoiser or a plug-and-play model.
Date: 2022-09-29T07:51:27Z
- SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations [85.38562724999898]
We propose a 2D Image and 3D Point cloud Unsupervised pre-training strategy, called SimIPU.
Specifically, we develop a multi-modal contrastive learning framework that consists of an intra-modal spatial perception module and an inter-modal feature interaction module.
To the best of our knowledge, this is the first study to explore contrastive learning pre-training strategies for outdoor multi-modal datasets.
Date: 2021-12-09T03:27:00Z
- Neural Knitworks: Patched Neural Implicit Representation Networks [1.0470286407954037]
We propose Knitwork, an architecture for neural implicit representation learning of natural images that achieves image synthesis.
To the best of our knowledge, this is the first implementation of a coordinate-based patch tailored for synthesis tasks such as image inpainting, super-resolution, and denoising.
The results show that modeling natural images using patches, rather than pixels, produces results of higher fidelity.
Date: 2021-09-29T13:10:46Z
- Concentric Spherical GNN for 3D Representation Learning [53.45704095146161]
We propose a novel multi-resolution convolutional architecture for learning over concentric spherical feature maps.
Our hierarchical architecture is based on alternately learning to incorporate both intra-sphere and inter-sphere information.
We demonstrate the effectiveness of our approach in improving state-of-the-art performance on 3D classification tasks with rotated data.
Date: 2021-03-18T19:05:04Z
- Image Restoration by Deep Projected GSURE [115.57142046076164]
Ill-posed inverse problems appear in many image processing applications, such as deblurring and super-resolution.
We propose a new image restoration framework based on minimizing a loss function that includes a "projected version" of the Generalized Stein Unbiased Risk Estimator (GSURE) and a parameterization of the latent image by a CNN.
Date: 2021-02-04T08:52:46Z
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
Date: 2020-04-01T12:56:13Z