Coarse-to-fine Deep Video Coding with Hyperprior-guided Mode Prediction
- URL: http://arxiv.org/abs/2206.07460v1
- Date: Wed, 15 Jun 2022 11:38:53 GMT
- Title: Coarse-to-fine Deep Video Coding with Hyperprior-guided Mode Prediction
- Authors: Zhihao Hu, Guo Lu, Jinyang Guo, Shan Liu, Wei Jiang and Dong Xu
- Abstract summary: We propose a coarse-to-fine (C2F) deep video compression framework for better motion compensation.
Our C2F framework can achieve better motion compensation results without significantly increasing bit costs.
- Score: 50.361427832256524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous deep video compression approaches use only a single-scale
motion compensation strategy and rarely adopt the mode prediction techniques
from traditional standards such as H.264/H.265 for motion and residual
compression. In this work, we first propose a coarse-to-fine (C2F) deep video
compression framework for better motion compensation, in which we perform
motion estimation, compression and compensation twice in a coarse to fine
manner. Our C2F framework can achieve better motion compensation results
without significantly increasing bit costs. Observing that the hyperprior
information (i.e., the mean and variance values) from the hyperprior networks
contains discriminative statistical information about different patches, we
also propose two efficient hyperprior-guided mode prediction methods. Specifically, using
hyperprior information as the input, we propose two mode prediction networks to
respectively predict the optimal block resolutions for better motion coding and
decide whether to skip residual information from each block for better residual
coding without introducing additional bit cost while bringing negligible extra
computation cost. Comprehensive experimental results demonstrate that our
proposed C2F video compression framework, equipped with the new
hyperprior-guided mode prediction methods, achieves state-of-the-art
performance on the HEVC, UVG and MCL-JCV datasets.
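As a rough illustration of the coarse-to-fine idea (not the paper's actual learned networks), the sketch below performs motion compensation twice: first with an upsampled coarse flow, then with a fine residual flow that refines the coarse prediction. The function names, the nearest-neighbor warp, and the 2x flow upsampling are all simplifying assumptions for illustration only.

```python
import numpy as np

def warp(img, flow):
    """Backward-warp img with a per-pixel flow field (nearest neighbor)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return img[src_y, src_x]

def coarse_to_fine_compensation(ref, coarse_flow, refine_flow):
    """Stage 1: warp the reference with the upsampled coarse flow.
    Stage 2: warp the coarse prediction again with a fine residual flow."""
    h, w = ref.shape
    # Upsample the half-resolution coarse flow and double its magnitude.
    up = np.repeat(np.repeat(coarse_flow, 2, axis=0), 2, axis=1) * 2.0
    coarse_pred = warp(ref, up[:h, :w])
    fine_pred = warp(coarse_pred, refine_flow)
    return coarse_pred, fine_pred
```

In the real framework both flows are estimated, compressed, and decoded by neural networks; here they are simply given as arrays to show the two-stage structure.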
Related papers
- IBVC: Interpolation-driven B-frame Video Compression [68.18440522300536]
B-frame video compression adopts bi-directional motion estimation and motion compensation (MEMC) coding for middle-frame reconstruction.
Previous learned approaches often directly extend neural P-frame codecs to B-frame relying on bi-directional optical-flow estimation.
We propose a simple yet effective structure called Interpolation-B-frame Video Compression (IBVC) to address these issues.
arXiv Detail & Related papers (2023-09-25T02:45:51Z) - MMVC: Learned Multi-Mode Video Compression with Block-based Prediction
Mode Selection and Density-Adaptive Entropy Coding [21.147001610347832]
We propose a multi-mode video compression framework that selects the optimal mode for feature domain prediction adapting to different motion patterns.
For entropy coding, we consider both dense and sparse post-quantization residual blocks, and apply optional run-length coding to sparse residuals to improve the compression rate.
Compared with state-of-the-art video compression schemes and standard codecs, our method yields better or competitive results measured with PSNR and MS-SSIM.
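The optional run-length coding of sparse post-quantization residuals mentioned above can be illustrated with a minimal (value, run-length) scheme. MMVC's actual symbol format is not specified here, so this is only a hypothetical sketch.

```python
import numpy as np

def rle_encode(block):
    """Run-length encode a flattened residual block as (value, run) pairs.
    Effective when most post-quantization residuals are zero.
    Assumes a non-empty block."""
    flat = np.asarray(block).ravel()
    pairs = []
    run_val, run_len = flat[0], 1
    for v in flat[1:]:
        if v == run_val:
            run_len += 1
        else:
            pairs.append((run_val, run_len))
            run_val, run_len = v, 1
    pairs.append((run_val, run_len))
    return pairs

def rle_decode(pairs, shape):
    """Invert rle_encode back to the original block shape."""
    flat = np.concatenate([np.full(n, v) for v, n in pairs])
    return flat.reshape(shape)
```

For a mostly-zero block the pair list is much shorter than the raw block, which is the rate saving the summary refers to; dense blocks would instead bypass this mode.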
arXiv Detail & Related papers (2023-04-05T07:37:48Z) - Scene Matters: Model-based Deep Video Compression [13.329074811293292]
We propose a model-based video compression (MVC) framework that regards scenes as the fundamental units for video sequences.
Our proposed MVC directly models the intensity variation of the entire video sequence within one scene, seeking non-redundant representations instead of reducing redundancy.
Our method achieves up to a 20% reduction compared to the latest video standard H.266 and is more efficient in decoding than existing video coding strategies.
arXiv Detail & Related papers (2023-03-08T13:15:19Z) - Learned Video Compression via Heterogeneous Deformable Compensation
Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms the recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - Learning Cross-Scale Prediction for Efficient Neural Video Compression [30.051859347293856]
We present the first neural video codec that can compete with the latest coding standard H.266/VVC in terms of sRGB PSNR on the UVG dataset for the low-latency mode.
We propose a novel cross-scale prediction module that achieves more effective motion compensation.
arXiv Detail & Related papers (2021-12-26T03:12:17Z) - Versatile Learned Video Compression [26.976302025254043]
We propose a versatile learned video compression (VLVC) framework that uses one model to support all possible prediction modes.
Specifically, to realize versatile compression, we first build a motion compensation module that applies multiple 3D motion vector fields.
We show that the flow prediction module can largely reduce the transmission cost of voxel flows.
arXiv Detail & Related papers (2021-11-05T10:50:37Z) - Self-Supervised Learning of Perceptually Optimized Block Motion
Estimates for Video Compression [50.48504867843605]
We propose a search-free block motion estimation framework using a multi-stage convolutional neural network.
We deploy the multi-scale structural similarity (MS-SSIM) loss function to optimize the perceptual quality of the motion compensated predicted frames.
arXiv Detail & Related papers (2021-10-05T03:38:43Z) - End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video
Compression [10.404162481860634]
Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules.
In this paper, we propose, for the first time, end-to-end optimization of a hierarchical, bi-directional motion-compensated codec by accumulating the cost function over fixed-size groups of pictures.
arXiv Detail & Related papers (2020-08-11T22:50:06Z) - M-LVC: Multiple Frames Prediction for Learned Video Compression [111.50760486258993]
We propose an end-to-end learned video compression scheme for low-latency scenarios.
In our scheme, the motion vector (MV) field is calculated between the current frame and the previous one.
Experimental results show that the proposed method outperforms the existing learned video compression methods for low-latency mode.
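To make the MV-field computation concrete, here is a sketch of classical exhaustive block matching rather than M-LVC's learned estimation; block size, search range, and the SAD criterion are illustrative choices, not the paper's.

```python
import numpy as np

def block_motion_vectors(cur, prev, block=4, search=2):
    """Exhaustive block matching: for each block of `cur`, find the offset
    (dy, dx) into `prev` within +/-`search` pixels minimizing the sum of
    absolute differences (SAD)."""
    h, w = cur.shape
    mvs = np.zeros((h // block, w // block, 2), dtype=int)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            target = cur[y0:y0 + block, x0:x0 + block]
            best = (np.inf, 0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    sy, sx = y0 + dy, x0 + dx
                    # Skip candidates that fall outside the reference frame.
                    if sy < 0 or sx < 0 or sy + block > h or sx + block > w:
                        continue
                    cand = prev[sy:sy + block, sx:sx + block]
                    sad = np.abs(target - cand).sum()
                    if sad < best[0]:
                        best = (sad, dy, dx)
            mvs[by, bx] = (best[1], best[2])
    return mvs
```

Learned codecs such as M-LVC replace this search with an optical-flow network, but the output plays the same role: one motion vector (or flow vector) per block or pixel, which is then compressed and used for compensation.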
arXiv Detail & Related papers (2020-04-21T20:42:02Z) - Accelerating Deep Reinforcement Learning With the Aid of Partial Model:
Energy-Efficient Predictive Video Streaming [97.75330397207742]
Predictive power allocation is conceived for energy-efficient video streaming over mobile networks using deep reinforcement learning.
To handle the continuous state and action spaces, we resort to deep deterministic policy gradient (DDPG) algorithm.
Our simulation results show that the proposed policies converge to the optimal policy that is derived based on perfect large-scale channel prediction.
arXiv Detail & Related papers (2020-03-21T17:36:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.