Foveation-based Deep Video Compression without Motion Search
- URL: http://arxiv.org/abs/2203.16490v1
- Date: Wed, 30 Mar 2022 17:30:17 GMT
- Title: Foveation-based Deep Video Compression without Motion Search
- Authors: Meixu Chen, Richard Webb, Alan C. Bovik
- Abstract summary: Foveation protocols are desirable since only a small portion of a video viewed in VR may be visible as a user gazes in any given direction.
We implement foveation by introducing a Foveation Generator Unit (FGU) that generates foveation masks which direct the allocation of bits.
Our new compression model, which we call the Foveated MOtionless VIdeo Codec (Foveated MOVI-Codec), is able to efficiently compress videos without computing motion.
- Score: 43.70396515286677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The requirements of much larger file sizes, different storage formats, and
immersive viewing conditions of VR pose significant challenges to the goals of
acquiring, transmitting, compressing, and displaying high-quality VR content.
At the same time, the great potential of deep learning to advance progress on
the video compression problem has driven a significant research effort. Because
of the high bandwidth requirements of VR, there has also been significant
interest in the use of space-variant, foveated compression protocols. We have
integrated these techniques to create an end-to-end deep learning video
compression framework. A feature of our new compression model is that it
dispenses with the need for expensive search-based motion prediction
computations. This is accomplished by exploiting statistical regularities
inherent in video motion expressed by displaced frame differences. Foveation
protocols are desirable since only a small portion of a video viewed in VR may
be visible as a user gazes in any given direction. Moreover, even within a
current field of view (FOV), the resolution of retinal neurons rapidly
decreases with distance (eccentricity) from the projected point of gaze. In our
learning based approach, we implement foveation by introducing a Foveation
Generator Unit (FGU) that generates foveation masks which direct the allocation
of bits, significantly increasing compression efficiency while making it
possible to retain an impression of little to no additional visual loss given
an appropriate viewing geometry. Our experimental results reveal that our new
compression model, which we call the Foveated MOtionless VIdeo Codec (Foveated
MOVI-Codec), is able to efficiently compress videos without computing motion,
while outperforming foveated versions of both H.264 and H.265 on the widely used
UVG dataset and on the HEVC Standard Class B Test Sequences.
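The abstract's notion of a foveation mask directing bit allocation can be illustrated with a toy sketch. This is not the paper's learned FGU; the function name and the hyperbolic falloff (loosely mimicking the decline of retinal resolution with eccentricity) are assumptions for illustration only:

```python
import numpy as np

def foveation_mask(height, width, gaze_yx, half_res_ecc=40.0):
    """Toy radial foveation mask: weight 1.0 at the gaze point,
    decaying with pixel eccentricity. half_res_ecc is the (assumed)
    eccentricity, in pixels, at which the weight halves."""
    ys, xs = np.mgrid[0:height, 0:width]
    ecc = np.hypot(ys - gaze_yx[0], xs - gaze_yx[1])
    # Hyperbolic falloff: weight = 1 / (1 + eccentricity / half_res_ecc).
    return 1.0 / (1.0 + ecc / half_res_ecc)

mask = foveation_mask(8, 8, gaze_yx=(4, 4))
# The mask peaks at the gaze point and decays toward the borders;
# a codec could scale per-region bit allocation by these weights.
```

Under an appropriate viewing geometry, regions with low mask weight fall outside the high-acuity fovea, so spending fewer bits there costs little perceived quality.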
Related papers
- End-to-End Learnable Multi-Scale Feature Compression for VCM [8.037759667748768]
We propose a novel multi-scale feature compression method that enables the end-to-end optimization on the extracted features and the design of lightweight encoders.
Our model outperforms previous approaches by at least 52% BD-rate reduction and requires $\times$5 to $\times$27 less encoding time for object detection.
arXiv Detail & Related papers (2023-06-29T04:05:13Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework with a heterogeneous deformable compensation network (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-adaptive heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance compared with recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We condition our model on quantization data that is readily available in the bitstream.
We show that this improves restoration accuracy compared to prior compression correction methods.
arXiv Detail & Related papers (2022-01-31T18:56:04Z) - Perceptual Learned Video Compression with Recurrent Conditional GAN [158.0726042755]
We propose a Perceptual Learned Video Compression (PLVC) approach with a recurrent conditional generative adversarial network.
PLVC learns to compress video towards good perceptual quality at low bit-rate.
The user study further validates the outstanding perceptual performance of PLVC in comparison with the latest learned video compression approaches.
arXiv Detail & Related papers (2021-09-07T13:36:57Z) - Evaluating Foveated Video Quality Using Entropic Differencing [1.5877673959068452]
We propose a full reference (FR) foveated image quality assessment algorithm, which employs the natural scene statistics of bandpass responses.
We evaluate the proposed algorithm by measuring the correlations of the predictions that FED makes against human judgements.
The proposed algorithm achieves state-of-the-art performance compared with other existing full reference algorithms.
arXiv Detail & Related papers (2021-06-12T16:29:13Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
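The idea above of coding a frame's latent under an entropy model conditioned on the previous frame can be sketched with a toy Gaussian conditional model. The distribution family, the scale parameter, and the function name are assumptions for illustration, not the paper's learned entropy model:

```python
import numpy as np

def conditional_bits(curr_latent, prev_latent, scale=1.0):
    """Toy estimate of the bits needed to code curr_latent when the
    entropy model predicts each symbol as a Gaussian centered on the
    co-located symbol of prev_latent (assumed model, for illustration)."""
    resid = curr_latent - prev_latent
    # Per-symbol negative log-likelihood under N(prev, scale^2), in nats.
    nll_nats = 0.5 * np.log(2 * np.pi * scale**2) + resid**2 / (2 * scale**2)
    return np.sum(nll_nats) / np.log(2)  # convert nats to bits

prev = np.zeros(16)
still = conditional_bits(prev, prev)         # identical frames: cheap
moving = conditional_bits(prev + 3.0, prev)  # large residual: expensive
# The smaller the frame-to-frame residual, the fewer bits the
# conditional model spends, with no explicit motion search.
```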
arXiv Detail & Related papers (2020-08-20T20:01:59Z) - Feedback Recurrent Autoencoder for Video Compression [14.072596106425072]
We propose a new network architecture for learned video compression operating in low latency mode.
Our method yields state-of-the-art MS-SSIM/rate performance on the high-resolution UVG dataset.
arXiv Detail & Related papers (2020-04-09T02:58:07Z) - Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics [127.65410486227007]
Video coding, which aims to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale.
Recent trends in video compression, e.g. deep learning based coding tools, end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, have each driven fast, sustained development in their own directions.
In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG
arXiv Detail & Related papers (2020-01-10T17:24:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.