Neural Network based Inter bi-prediction Blending
- URL: http://arxiv.org/abs/2202.03149v1
- Date: Wed, 26 Jan 2022 13:57:48 GMT
- Title: Neural Network based Inter bi-prediction Blending
- Authors: Franck Galpin, Philippe Bordes, Thierry Dumas, Pavel Nikitin, Fabrice
Le Leannec
- Abstract summary: This paper presents a learning-based method to improve bi-prediction in video coding.
In this context, we introduce a simple neural network that further improves the blending operation.
Tests are performed and show a BD-rate improvement of -1.4% in random access configuration for a network size of fewer than 10k parameters.
- Score: 8.815673539598816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a learning-based method to improve bi-prediction in video
coding. In conventional video coding solutions, the motion compensation of
blocks from already decoded reference pictures stands out as the principal tool
used to predict the current frame. Especially, the bi-prediction, in which a
block is obtained by averaging two different motion-compensated prediction
blocks, significantly improves the final temporal prediction accuracy. In this
context, we introduce a simple neural network that further improves the
blending operation. A complexity balance, both in terms of network size and
encoder mode selection, is carried out. Extensive tests on top of the recently
standardized VVC codec are performed and show a BD-rate improvement of -1.4% in
random access configuration for a network size of fewer than 10k parameters. We
also propose a simple CPU-based implementation and direct network quantization
to assess the complexity/gains tradeoff in a conventional codec framework.
Related papers
- ReBotNet: Fast Real-time Video Enhancement [59.08038313427057]
Most restoration networks are slow, have high computational bottleneck, and can't be used for real-time video enhancement.
In this work, we design an efficient and fast framework to perform real-time enhancement for practical use-cases like live video calls and video streams.
To evaluate our method, we emulate two new datasets that real-world video call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computations, reduced memory requirements, and faster inference time.
arXiv Detail & Related papers (2023-03-23T17:58:05Z) - Anti-aliasing Predictive Coding Network for Future Video Frame
Prediction [1.4610038284393165]
We introduce here a predictive coding based model that aims to generate accurate and sharp future frames.
We propose and improve several artifacts to ensure that the neural networks generate clear and natural frames.
arXiv Detail & Related papers (2023-01-13T07:38:50Z) - A Scalable Graph Neural Network Decoder for Short Block Codes [49.25571364253986]
We propose a novel decoding algorithm for short block codes based on an edge-weighted graph neural network (EW-GNN)
The EW-GNN decoder operates on the Tanner graph with an iterative message-passing structure.
We show that the EW-GNN decoder outperforms the BP and deep-learning-based BP methods in terms of the decoding error rate.
arXiv Detail & Related papers (2022-11-13T17:13:12Z) - Graph Neural Networks for Channel Decoding [71.15576353630667]
We showcase competitive decoding performance for various coding schemes, such as low-density parity-check (LDPC) and BCH codes.
The idea is to let a neural network (NN) learn a generalized message passing algorithm over a given graph.
We benchmark our proposed decoder against state-of-the-art in conventional channel decoding as well as against recent deep learning-based results.
arXiv Detail & Related papers (2022-07-29T15:29:18Z) - A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs)
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z) - Learning Cross-Scale Prediction for Efficient Neural Video Compression [30.051859347293856]
We present the first neural video that can compete with the latest coding standard H.266/VVC in terms of sRGB PSNR on UVG dataset for the low-latency mode.
We propose a novel cross-scale prediction module that achieves more effective motion compensation.
arXiv Detail & Related papers (2021-12-26T03:12:17Z) - Self-Supervised Learning of Perceptually Optimized Block Motion
Estimates for Video Compression [50.48504867843605]
We propose a search-free block motion estimation framework using a multi-stage convolutional neural network.
We deploy the multi-scale structural similarity (MS-SSIM) loss function to optimize the perceptual quality of the motion compensated predicted frames.
arXiv Detail & Related papers (2021-10-05T03:38:43Z) - Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z) - End-to-end Neural Video Coding Using a Compound Spatiotemporal
Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches.
Specifically, we generate a compoundtemporal representation (STR) through a recurrent information aggregation (RIA) module.
We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z) - Improved CNN-based Learning of Interpolation Filters for Low-Complexity
Inter Prediction in Video Coding [5.46121027847413]
This paper introduces a novel explainable neural network-based inter-prediction scheme.
A novel training framework enables each network branch to resemble a specific fractional shift.
When implemented in the context of the Versatile Video Coding (VVC) test model, 0.77%, 1.27% and 2.25% BD-rate savings can be achieved.
arXiv Detail & Related papers (2021-06-16T16:48:01Z) - Learned Multi-Resolution Variable-Rate Image Compression with
Octave-based Residual Blocks [15.308823742699039]
We propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv)
To enable a single model to operate with different bit rates and to learn multi-rate image features, a new objective function is introduced.
Experimental results show that the proposed framework trained with variable-rate objective function outperforms the standard codecs such as H.265/HEVC-based BPG and state-of-the-art learning-based variable-rate methods.
arXiv Detail & Related papers (2020-12-31T06:26:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.