Pseudocylindrical Convolutions for Learned Omnidirectional Image
Compression
- URL: http://arxiv.org/abs/2112.13227v1
- Date: Sat, 25 Dec 2021 12:18:32 GMT
- Title: Pseudocylindrical Convolutions for Learned Omnidirectional Image
Compression
- Authors: Mu Li, Kede Ma, Jinxing Li, and David Zhang
- Abstract summary: We make one of the first attempts to learn deep neural networks for omnidirectional image compression.
Under reasonable constraints on the parametric representation, the pseudocylindrical convolution can be efficiently implemented by standard convolution.
Experimental results show that our method consistently achieves better rate-distortion performance than competing methods.
- Score: 42.15877732557837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although equirectangular projection (ERP) is a convenient form to store
omnidirectional images (also known as 360-degree images), it is neither
equal-area nor conformal, thus not friendly to subsequent visual communication.
In the context of image compression, ERP will over-sample and deform things and
stuff near the poles, making it difficult for perceptually optimal bit
allocation. In conventional 360-degree image compression, techniques such as
region-wise packing and tiled representation are introduced to alleviate the
over-sampling problem, achieving limited success. In this paper, we make one of
the first attempts to learn deep neural networks for omnidirectional image
compression. We first describe parametric pseudocylindrical representation as a
generalization of common pseudocylindrical map projections. A computationally
tractable greedy method is presented to determine the (sub)-optimal
configuration of the pseudocylindrical representation in terms of a novel proxy
objective for rate-distortion performance. We then propose pseudocylindrical
convolutions for 360-degree image compression. Under reasonable constraints on
the parametric representation, the pseudocylindrical convolution can be
efficiently implemented by standard convolution with the so-called
pseudocylindrical padding. To demonstrate the feasibility of our idea, we
implement an end-to-end 360-degree image compression system, consisting of the
learned pseudocylindrical representation, an analysis transform, a non-uniform
quantizer, a synthesis transform, and an entropy model. Experimental results on
19,790 omnidirectional images show that our method achieves consistently
better rate-distortion performance than the competing methods. Moreover, the
visual quality by our method is significantly improved for all images at all
bitrates.
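As a rough illustration of the over-sampling problem and of the padding idea described in the abstract, the sketch below uses the sinusoidal projection, a common pseudocylindrical map projection, in place of the paper's learned representation. All function names are hypothetical, and the padding shown here (resample the neighboring rows to the current width, then wrap around in longitude) is only an assumption about how such padding could work, not the authors' implementation.

```python
import math

def erp_oversampling(latitude_deg):
    """Horizontal over-sampling factor of ERP at a latitude: 1 / cos(latitude)."""
    return 1.0 / math.cos(math.radians(latitude_deg))

def sinusoidal_row_widths(height, width):
    """Per-row sample counts when each row is scaled by cos(latitude).

    Assumption: a fixed sinusoidal layout, not the learned configuration.
    """
    widths = []
    for i in range(height):
        # Latitude of the row centre, in (-pi/2, pi/2).
        phi = math.pi * ((i + 0.5) / height - 0.5)
        widths.append(max(1, round(width * math.cos(phi))))
    return widths

def resample_row_circular(row, new_width):
    """Linearly resample a row to new_width samples, wrapping in longitude."""
    n = len(row)
    out = []
    for j in range(new_width):
        # Map the centre of output sample j into the input row's index space.
        x = (j + 0.5) * n / new_width - 0.5
        i0 = math.floor(x)
        t = x - i0
        out.append((1 - t) * row[i0 % n] + t * row[(i0 + 1) % n])
    return out

def circular_pad(row, pad):
    """Pad a row on both sides with samples from the opposite end."""
    return row[-pad:] + row + row[:pad]

def padded_neighbourhood(above, current, below, pad=1):
    """Build a 3-row patch on which a standard 3x3 convolution could run.

    Neighboring rows may have different widths, so they are first resampled
    to the width of the current row; every row is then padded circularly.
    """
    w = len(current)
    rows = [
        resample_row_circular(above, w),
        list(current),
        resample_row_circular(below, w),
    ]
    return [circular_pad(r, pad) for r in rows]
```

Under these assumptions, the over-sampling factor is about 2 at 60 degrees latitude, and a sinusoidal layout stores roughly 2/π (about 64%) of the samples of an ERP image with the same height and equatorial width.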
Related papers
- Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model [11.959608742884408]
BiSIC is a symmetric stereo image compression architecture.
We propose a 3D convolution based backbone to capture local features and incorporate bidirectional attention blocks to exploit global features.
Our proposed BiSIC outperforms conventional image/video compression standards.
arXiv Detail & Related papers (2024-07-15T11:36:22Z)
- Explicit Correspondence Matching for Generalizable Neural Radiance
Fields [49.49773108695526]
We present a new NeRF method that is able to generalize to new unseen scenarios and perform novel view synthesis with as few as two source views.
The explicit correspondence matching is quantified with the cosine similarity between image features sampled at the 2D projections of a 3D point on different views.
Our method achieves state-of-the-art results on different evaluation settings, with the experiments showing a strong correlation between our learned cosine feature similarity and volume density.
arXiv Detail & Related papers (2023-04-24T17:46:01Z)
- Complementary Bi-directional Feature Compression for Indoor 360°
Semantic Segmentation with Self-distillation [37.82642960470551]
We propose a novel 360° semantic segmentation solution from a complementary perspective.
Our approach outperforms the state-of-the-art solutions with at least 10% improvement on quantitative evaluations.
arXiv Detail & Related papers (2022-07-06T05:05:54Z)
- Hybrid Model-based / Data-driven Graph Transform for Image Coding [54.31406300524195]
We present a hybrid model-based / data-driven approach to encode an intra-prediction residual block.
The first $K$ eigenvectors of a transform matrix are derived from a statistical model, e.g., the asymmetric discrete sine transform (ADST) for stability.
Using WebP as a baseline image codec, experimental results show that our hybrid graph transform achieved better energy compaction than the default discrete cosine transform (DCT) and better stability than KLT.
arXiv Detail & Related papers (2022-03-02T15:36:44Z)
- Rectifying homographies for stereo vision: analytical solution for
minimal distortion [0.0]
Rectification is used to simplify the subsequent stereo correspondence problem.
This work proposes a closed-form solution for the rectifying homographies that minimise perspective distortion.
arXiv Detail & Related papers (2022-02-28T22:35:47Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- OSLO: On-the-Sphere Learning for Omnidirectional images and its
application to 360-degree image compression [59.58879331876508]
We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images.
Our proposed on-the-sphere solution leads to a better compression gain that can save 13.7% of the bit rate compared to similar learned models applied to equirectangular images.
arXiv Detail & Related papers (2021-07-19T22:14:30Z)
- Substitutional Neural Image Compression [48.20906717052056]
Substitutional Neural Image Compression (SNIC) is a general approach for enhancing any neural image compression model.
It boosts compression performance toward a flexible distortion metric and enables bit-rate control using a single model instance.
arXiv Detail & Related papers (2021-05-16T20:53:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.