Deep Learning-Based Intra Mode Derivation for Versatile Video Coding
- URL: http://arxiv.org/abs/2204.04059v1
- Date: Fri, 8 Apr 2022 13:23:59 GMT
- Title: Deep Learning-Based Intra Mode Derivation for Versatile Video Coding
- Authors: Linwei Zhu, Yun Zhang, Na Li, Gangyi Jiang, and Sam Kwong
- Abstract summary: An intelligent intra mode derivation method, termed Deep Learning-based Intra Mode Derivation (DLIMD), is proposed in this paper.
The architecture of DLIMD is developed to adapt to different quantization parameter settings and variable coding blocks including non-square ones.
The proposed method can achieve 2.28%, 1.74%, and 2.18% bit rate reduction on average for Y, U, and V components on the platform of Versatile Video Coding (VVC) test model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In intra coding, Rate Distortion Optimization (RDO) is performed to achieve
the optimal intra mode from a pre-defined candidate list. The optimal intra
mode must also be encoded and transmitted to the decoder side, in addition to
the residual signal, which consumes a substantial number of coding bits. To further improve
the performance of intra coding in Versatile Video Coding (VVC), an intelligent
intra mode derivation method, termed Deep Learning-based Intra Mode Derivation
(DLIMD), is proposed in this paper. Specifically, the process of intra mode
derivation is formulated as a multi-class classification task, which aims to
skip the module of intra mode signaling for coding bits reduction. The
architecture of DLIMD is developed to adapt to different quantization parameter
settings and variable coding blocks including non-square ones, which are
handled by one single trained model. Unlike existing deep learning-based
classification problems, hand-crafted features are fed into the intra mode
derivation network alongside the learned features from the feature-learning
network. To let DLIMD compete with the traditional method, one additional
binary flag in the video codec indicates the scheme selected by RDO.
Extensive experimental results reveal that the proposed method can achieve
2.28%, 1.74%, and 2.18% bit rate reduction on average for Y, U, and V
components on the platform of VVC test model, which outperforms the
state-of-the-art works.
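The inference flow the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the layer sizes, the choice of hand-crafted features (block width/height and QP), and the toy feature-learning network are all assumptions made for the example; only the overall structure (learned features fused with hand-crafted features, then a softmax over the VVC intra mode candidates) follows the paper's description.

```python
import numpy as np

NUM_MODES = 67          # VVC defines 67 intra prediction modes
rng = np.random.default_rng(0)

def feature_learning_net(ref_samples, w1, w2):
    """Toy two-layer net standing in for the feature-learning network."""
    h = np.maximum(ref_samples @ w1, 0.0)        # ReLU
    return np.maximum(h @ w2, 0.0)

def derive_intra_mode(ref_samples, handcrafted, params):
    """Derive the intra mode by classification instead of signalling it."""
    learned = feature_learning_net(ref_samples, params["w1"], params["w2"])
    fused = np.concatenate([learned, handcrafted])  # learned + hand-crafted
    logits = fused @ params["w_out"]
    probs = np.exp(logits - logits.max())           # stable softmax
    probs /= probs.sum()
    return int(np.argmax(probs))                    # derived mode index

# Hypothetical block: flattened reference samples plus hand-crafted
# features [log2(width), log2(height), QP], handled by one single model.
params = {
    "w1": rng.normal(size=(33, 64)),
    "w2": rng.normal(size=(64, 32)),
    "w_out": rng.normal(size=(32 + 3, NUM_MODES)),
}
mode = derive_intra_mode(rng.normal(size=33), np.array([3.0, 3.0, 32.0]), params)
assert 0 <= mode < NUM_MODES
```

In the codec itself, the derived mode would be compared against the conventionally signalled mode via RDO, and the one-bit flag would tell the decoder which scheme to use.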
Related papers
- Denoising Diffusion Error Correction Codes [92.10654749898927]
Recently, neural decoders have demonstrated their advantage over classical decoding techniques.
Recent state-of-the-art neural decoders suffer from high complexity and lack the important iterative scheme characteristic of many legacy decoders.
We propose to employ denoising diffusion models for the soft decoding of linear codes at arbitrary block lengths.
arXiv Detail & Related papers (2022-09-16T11:00:50Z) - Graph Neural Networks for Channel Decoding [71.15576353630667]
We showcase competitive decoding performance for various coding schemes, such as low-density parity-check (LDPC) and BCH codes.
The idea is to let a neural network (NN) learn a generalized message passing algorithm over a given graph.
We benchmark our proposed decoder against state-of-the-art in conventional channel decoding as well as against recent deep learning-based results.
arXiv Detail & Related papers (2022-07-29T15:29:18Z) - Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z) - Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z) - BLINC: Lightweight Bimodal Learning for Low-Complexity VVC Intra Coding [5.629161809575015]
Versatile Video Coding (VVC) achieves almost twice the coding efficiency of its predecessor, High Efficiency Video Coding (HEVC).
This paper proposes a novel machine learning approach that jointly and separately employs two modalities of features, to simplify the intra coding decision.
arXiv Detail & Related papers (2022-01-19T19:12:41Z) - End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches.
Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module.
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z) - Multitask Learning for VVC Quality Enhancement and Super-Resolution [11.446576112498596]
We propose a learning-based solution as a post-processing step to enhance the decoded VVC video quality.
Our method relies on multitask learning to perform both quality enhancement and super-resolution using a single shared network optimized for multiple levels.
arXiv Detail & Related papers (2021-04-16T19:05:26Z) - Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z) - Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model [45.46660511313426]
We propose an end-to-end deep neural video coding framework (NVC).
It uses variational autoencoders (VAEs) with joint spatial and temporal prior aggregation (PA) to exploit the correlations in intra-frame pixels, inter-frame motions and inter-frame compensation residuals.
NVC is evaluated for the low-delay causal settings and compared with H.265/HEVC, H.264/AVC and the other learnt video compression methods.
arXiv Detail & Related papers (2020-07-09T06:15:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.