Related papers: Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression

URL: http://arxiv.org/abs/2112.13227v1
Date: Sat, 25 Dec 2021 12:18:32 GMT
Title: Pseudocylindrical Convolutions for Learned Omnidirectional Image Compression
Authors: Mu Li, Kede Ma, Jinxing Li, and David Zhang
Abstract summary: We make one of the first attempts to learn deep neural networks for omnidirectional image compression. Under reasonable constraints on the parametric representation, the pseudocylindrical convolution can be efficiently implemented by standard convolution. Experimental results show that our method consistently achieves better rate-distortion performance than competing methods.
Score: 42.15877732557837
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although equirectangular projection (ERP) is a convenient form to store omnidirectional images (also known as 360-degree images), it is neither equal-area nor conformal, thus not friendly to subsequent visual communication. In the context of image compression, ERP will over-sample and deform things and stuff near the poles, making it difficult for perceptually optimal bit allocation. In conventional 360-degree image compression, techniques such as region-wise packing and tiled representation are introduced to alleviate the over-sampling problem, achieving limited success. In this paper, we make one of the first attempts to learn deep neural networks for omnidirectional image compression. We first describe parametric pseudocylindrical representation as a generalization of common pseudocylindrical map projections. A computationally tractable greedy method is presented to determine the (sub)-optimal configuration of the pseudocylindrical representation in terms of a novel proxy objective for rate-distortion performance. We then propose pseudocylindrical convolutions for 360-degree image compression. Under reasonable constraints on the parametric representation, the pseudocylindrical convolution can be efficiently implemented by standard convolution with the so-called pseudocylindrical padding. To demonstrate the feasibility of our idea, we implement an end-to-end 360-degree image compression system, consisting of the learned pseudocylindrical representation, an analysis transform, a non-uniform quantizer, a synthesis transform, and an entropy model. Experimental results on 19,790 omnidirectional images show that our method achieves consistently better rate-distortion performance than the competing methods. Moreover, the visual quality by our method is significantly improved for all images at all bitrates.
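The over-sampling that the abstract attributes to ERP can be illustrated with a small sketch. It assumes the sinusoidal projection, a common pseudocylindrical map projection in which each row's width shrinks with the cosine of its latitude. This is not the paper's learned parametric representation or its greedy configuration search; the function name `sinusoidal_row_widths` and the 8x16 grid are hypothetical, chosen only for illustration.

```python
import math

def sinusoidal_row_widths(height, width):
    """Per-row sample counts for a sinusoidal (pseudocylindrical) layout.

    Illustrative assumption: an ERP image of size height x width is
    resampled so each row keeps a number of samples proportional to the
    circumference of its circle of latitude, i.e. to cos(latitude).
    """
    widths = []
    for i in range(height):
        # Latitude of the row centre, from near +pi/2 (top) to near -pi/2 (bottom).
        lat = math.pi / 2 - (i + 0.5) * math.pi / height
        # Keep at least one sample per row, even next to the poles.
        widths.append(max(1, round(width * math.cos(lat))))
    return widths

height, width = 8, 16
widths = sinusoidal_row_widths(height, width)
erp_samples = height * width          # ERP stores a full row at every latitude
sinusoidal_samples = sum(widths)      # rows near the poles are much shorter

print(widths)
print(erp_samples, sinusoidal_samples)
```

In this sketch the rows near the poles keep only a small fraction of the samples that ERP stores for them, which is the redundancy a pseudocylindrical representation aims to reduce.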

Related papers

CylinderPlane: Nested Cylinder Representation for 3D-aware Image Generation [30.123631491028352]
CylinderPlane is a novel implicit representation based on Cylindrical Coordinate System. Our representation is agnostic to implicit rendering methods and can be easily integrated into any neural rendering pipeline.
arXiv Detail & Related papers (2025-07-21T13:28:59Z)
SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception [61.7243424157871]
We introduce a transformer-based architecture that, by incorporating a novel "Spherical Local Self-Attention" and other spherically-oriented modules, successfully operates in the spherical domain and outperforms the state-of-the-art in 360-degree perception benchmarks for depth estimation and semantic segmentation.
arXiv Detail & Related papers (2024-12-09T20:23:10Z)
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model [11.959608742884408]
BiSIC is a symmetric stereo image compression architecture. We propose a 3D convolution based backbone to capture local features and incorporate bidirectional attention blocks to exploit global features. Our proposed BiSIC outperforms conventional image/video compression standards.
arXiv Detail & Related papers (2024-07-15T11:36:22Z)
PSC: Posterior Sampling-Based Compression [34.50287066865267]
Posterior Sampling-based Compression (PSC) is a zero-shot compression method that leverages a pre-trained diffusion model as its sole neural network component. PSC constructs a transform that is adaptive to the image. We demonstrate that PSC's performance is comparable to established training-based methods in terms of rate, distortion, and perceptual quality.
arXiv Detail & Related papers (2024-07-13T14:24:22Z)
Explicit Correspondence Matching for Generalizable Neural Radiance Fields [49.49773108695526]
We present a new NeRF method that is able to generalize to new unseen scenarios and perform novel view synthesis with as few as two source views. The explicit correspondence matching is quantified with the cosine similarity between image features sampled at the 2D projections of a 3D point on different views. Our method achieves state-of-the-art results on different evaluation settings, with the experiments showing a strong correlation between our learned cosine feature similarity and volume density.
arXiv Detail & Related papers (2023-04-24T17:46:01Z)
Complementary Bi-directional Feature Compression for Indoor 360° Semantic Segmentation with Self-distillation [37.82642960470551]
We propose a novel 360° semantic segmentation solution from a complementary perspective. Our approach outperforms the state-of-the-art solutions with at least 10% improvement on quantitative evaluations.
arXiv Detail & Related papers (2022-07-06T05:05:54Z)
Hybrid Model-based / Data-driven Graph Transform for Image Coding [54.31406300524195]
We present a hybrid model-based / data-driven approach to encode an intra-prediction residual block. The first $K$ eigenvectors of a transform matrix are derived from a statistical model, e.g., the asymmetric discrete sine transform (ADST), for stability. Using WebP as a baseline image codec, experimental results show that our hybrid graph transform achieved better energy compaction than the default discrete cosine transform (DCT) and better stability than KLT.
arXiv Detail & Related papers (2022-03-02T15:36:44Z)
Rectifying homographies for stereo vision: analytical solution for minimal distortion [0.0]
Rectification is used to simplify the subsequent stereo correspondence problem. This work proposes a closed-form solution for the rectifying homographies that minimise perspective distortion.
arXiv Detail & Related papers (2022-02-28T22:35:47Z)
CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning. The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery. The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
OSLO: On-the-Sphere Learning for Omnidirectional images and its application to 360-degree image compression [59.58879331876508]
We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images. Our proposed on-the-sphere solution leads to a better compression gain that can save 13.7% of the bit rate compared to similar learned models applied to equirectangular images.
arXiv Detail & Related papers (2021-07-19T22:14:30Z)
Substitutional Neural Image Compression [48.20906717052056]
Substitutional Neural Image Compression (SNIC) is a general approach for enhancing any neural image compression model. It boosts compression performance toward a flexible distortion metric and enables bit-rate control using a single model instance.
arXiv Detail & Related papers (2021-05-16T20:53:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.

This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.