End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional
Video Compression
- URL: http://arxiv.org/abs/2112.09529v1
- Date: Fri, 17 Dec 2021 14:30:22 GMT
- Title: End-to-End Rate-Distortion Optimized Learned Hierarchical Bi-Directional
Video Compression
- Authors: M. Akın Yılmaz, A. Murat Tekalp
- Abstract summary: Learned VC allows end-to-end rate-distortion (R-D) optimized training of nonlinear transform, motion and entropy model simultaneously.
This paper proposes a learned hierarchical bi-directional video codec (LHBDC) that combines the benefits of hierarchical motion-compensated prediction and end-to-end optimization.
- Score: 10.885590093103344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conventional video compression (VC) methods are based on motion compensated
transform coding, and the steps of motion estimation, mode and quantization
parameter selection, and entropy coding are optimized individually due to the
combinatorial nature of the end-to-end optimization problem. Learned VC allows
end-to-end rate-distortion (R-D) optimized training of nonlinear transform,
motion and entropy model simultaneously. Most works on learned VC consider
end-to-end optimization of a sequential video codec based on R-D loss averaged
over pairs of successive frames. It is well-known in conventional VC that
hierarchical, bi-directional coding outperforms sequential compression because
of its ability to use both past and future reference frames. This paper
proposes a learned hierarchical bi-directional video codec (LHBDC) that
combines the benefits of hierarchical motion-compensated prediction and
end-to-end optimization. Experimental results show that we achieve the best R-D
results that are reported for learned VC schemes to date in both PSNR and
MS-SSIM. Compared to conventional video codecs, the R-D performance of our
end-to-end optimized codec outperforms those of both x265 and SVT-HEVC encoders
("veryslow" preset) in PSNR and MS-SSIM as well as HM 16.23 reference software
in MS-SSIM. We present ablation studies showing performance gains due to
proposed novel tools such as learned masking, flow-field subsampling, and
temporal flow vector prediction. The models and instructions to reproduce our
results can be found in https://github.com/makinyilmaz/LHBDC/
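
The hierarchical, bi-directional coding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository above for that). The function names `hierarchical_order` and `gop_rd_loss`, the group-of-pictures length of 8, and the cost form D + lambda * R are assumptions chosen for illustration only. The sketch shows the order in which frames between two already-coded key frames would be coded, each from one past and one future reference, and how an R-D cost could be averaged over the whole group rather than over pairs of successive frames.

```python
from typing import List, Sequence, Tuple


def hierarchical_order(first: int, last: int) -> List[Tuple[int, int, int]]:
    """Return (frame, past_ref, future_ref) triples in coding order.

    Assumes frames `first` and `last` are already coded key frames.
    Each temporal level codes the midpoint of every open interval,
    using both interval ends as references, then splits the interval.
    """
    order: List[Tuple[int, int, int]] = []
    level = [(first, last)]
    while level:
        next_level = []
        for lo, hi in level:
            if hi - lo < 2:  # no frame left between the two references
                continue
            mid = (lo + hi) // 2
            order.append((mid, lo, hi))
            next_level += [(lo, mid), (mid, hi)]
        level = next_level
    return order


def gop_rd_loss(rates: Sequence[float],
                distortions: Sequence[float],
                lam: float) -> float:
    """Average the per-frame cost D + lam * R over a group of pictures.

    The cost form and the weight `lam` are illustrative assumptions.
    """
    if len(rates) != len(distortions) or not rates:
        raise ValueError("need equally sized, non-empty rate/distortion lists")
    total = sum(d + lam * r for r, d in zip(rates, distortions))
    return total / len(rates)


if __name__ == "__main__":
    # Frames 0 and 8 are key frames; frames 1..7 are coded bi-directionally.
    for frame, past, future in hierarchical_order(0, 8):
        print(f"code frame {frame} from references {past} and {future}")
```

With key frames 0 and 8, the sketch codes frame 4 first, then frames 2 and 6, then the odd frames, so every predicted frame has a reference on each side.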
Related papers
- High-Efficiency Neural Video Compression via Hierarchical Predictive Learning [27.41398149573729]
Enhanced Deep Hierarchical Video Compression (DHVC 2.0) delivers superior compression performance and impressive complexity efficiency.
Uses hierarchical predictive coding to transform each video frame into multiscale representations.
Supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss.
arXiv Detail & Related papers (2024-10-03T15:40:58Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves performance superior to that of recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z)
- A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs).
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z)
- End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches.
Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module.
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z)
- Decomposition, Compression, and Synthesis (DCS)-based Video Coding: A Neural Exploration via Resolution-Adaptive Learning [30.54722074562783]
We decompose the input video into spatial texture frames (STFs) at its native spatial resolution and temporal motion frames (TMFs).
Then, we compress them together using any popular video coder.
Finally, we synthesize the decoded STFs and TMFs for high-quality video reconstruction at the same resolution as the native input.
arXiv Detail & Related papers (2020-12-01T17:23:53Z)
- End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression [10.404162481860634]
Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules.
In this paper, we propose for the first time end-to-end optimization of a hierarchical, bi-directional motion compensated learned codec by accumulating the cost function over fixed-size groups of pictures.
arXiv Detail & Related papers (2020-08-11T22:50:06Z)
- Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model [45.46660511313426]
We propose an end-to-end deep neural video coding framework (NVC).
It uses variational autoencoders (VAEs) with joint spatial and temporal prior aggregation (PA) to exploit the correlations in intra-frame pixels, inter-frame motions and inter-frame compensation residuals.
NVC is evaluated for the low-delay causal settings and compared with H.265/HEVC, H.264/AVC and the other learnt video compression methods.
arXiv Detail & Related papers (2020-07-09T06:15:17Z)
- M-LVC: Multiple Frames Prediction for Learned Video Compression [111.50760486258993]
We propose an end-to-end learned video compression scheme for low-latency scenarios.
In our scheme, the motion vector (MV) field is calculated between the current frame and the previous one.
Experimental results show that the proposed method outperforms the existing learned video compression methods for low-latency mode.
arXiv Detail & Related papers (2020-04-21T20:42:02Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.