Related papers: Neural Network based Inter bi-prediction Blending

Neural Network based Inter bi-prediction Blending

URL: http://arxiv.org/abs/2202.03149v1
Date: Wed, 26 Jan 2022 13:57:48 GMT
Title: Neural Network based Inter bi-prediction Blending
Authors: Franck Galpin, Philippe Bordes, Thierry Dumas, Pavel Nikitin, Fabrice Le Leannec
Abstract summary: This paper presents a learning-based method to improve bi-prediction in video coding. In this context, we introduce a simple neural network that further improves the blending operation. Tests are performed and show a BD-rate improvement of -1.4% in random access configuration for a network size of fewer than 10k parameters.
Score: 8.815673539598816
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper presents a learning-based method to improve bi-prediction in video coding. In conventional video coding solutions, the motion compensation of blocks from already decoded reference pictures stands out as the principal tool used to predict the current frame. Especially, the bi-prediction, in which a block is obtained by averaging two different motion-compensated prediction blocks, significantly improves the final temporal prediction accuracy. In this context, we introduce a simple neural network that further improves the blending operation. A complexity balance, both in terms of network size and encoder mode selection, is carried out. Extensive tests on top of the recently standardized VVC codec are performed and show a BD-rate improvement of -1.4% in random access configuration for a network size of fewer than 10k parameters. We also propose a simple CPU-based implementation and direct network quantization to assess the complexity/gains tradeoff in a conventional codec framework.

Related papers

Motion Free B-frame Coding for Neural Video Compression [0.0]
In this paper, we propose a novel approach that handles the drawbacks of the two typical above-mentioned architectures. The advantages of the motion-free approach are twofold: it improves the coding efficiency of the network and significantly reduces computational complexity. Experimental results show the proposed framework outperforms the SOTA deep neural video compression networks on the HEVC-class B dataset.
arXiv Detail & Related papers (2024-11-26T07:03:11Z)
ReBotNet: Fast Real-time Video Enhancement [59.08038313427057]
Most restoration networks are slow, have high computational bottleneck, and can't be used for real-time video enhancement. In this work, we design an efficient and fast framework to perform real-time enhancement for practical use-cases like live video calls and video streams. To evaluate our method, we emulate two new datasets that real-world video call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computations, reduced memory requirements, and faster inference time.
arXiv Detail & Related papers (2023-03-23T17:58:05Z)
Anti-aliasing Predictive Coding Network for Future Video Frame Prediction [1.4610038284393165]
We introduce here a predictive coding based model that aims to generate accurate and sharp future frames. We propose and improve several artifacts to ensure that the neural networks generate clear and natural frames.
arXiv Detail & Related papers (2023-01-13T07:38:50Z)
A Scalable Graph Neural Network Decoder for Short Block Codes [49.25571364253986]
We propose a novel decoding algorithm for short block codes based on an edge-weighted graph neural network (EW-GNN) The EW-GNN decoder operates on the Tanner graph with an iterative message-passing structure. We show that the EW-GNN decoder outperforms the BP and deep-learning-based BP methods in terms of the decoding error rate.
arXiv Detail & Related papers (2022-11-13T17:13:12Z)
Graph Neural Networks for Channel Decoding [71.15576353630667]
We showcase competitive decoding performance for various coding schemes, such as low-density parity-check (LDPC) and BCH codes. The idea is to let a neural network (NN) learn a generalized message passing algorithm over a given graph. We benchmark our proposed decoder against state-of-the-art in conventional channel decoding as well as against recent deep learning-based results.
arXiv Detail & Related papers (2022-07-29T15:29:18Z)
A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs) The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved. We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z)
Learning Cross-Scale Prediction for Efficient Neural Video Compression [30.051859347293856]
We present the first neural video that can compete with the latest coding standard H.266/VVC in terms of sRGB PSNR on UVG dataset for the low-latency mode. We propose a novel cross-scale prediction module that achieves more effective motion compensation.
arXiv Detail & Related papers (2021-12-26T03:12:17Z)
Self-Supervised Learning of Perceptually Optimized Block Motion Estimates for Video Compression [50.48504867843605]
We propose a search-free block motion estimation framework using a multi-stage convolutional neural network. We deploy the multi-scale structural similarity (MS-SSIM) loss function to optimize the perceptual quality of the motion compensated predicted frames.
arXiv Detail & Related papers (2021-10-05T03:38:43Z)
Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues. We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation [33.54844063875569]
We propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by two approaches. Specifically, we generate a compoundtemporal representation (STR) through a recurrent information aggregation (RIA) module. We further design a one-to-many decoder pipeline to generate multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements.
arXiv Detail & Related papers (2021-08-05T19:43:32Z)
Improved CNN-based Learning of Interpolation Filters for Low-Complexity Inter Prediction in Video Coding [5.46121027847413]
This paper introduces a novel explainable neural network-based inter-prediction scheme. A novel training framework enables each network branch to resemble a specific fractional shift. When implemented in the context of the Versatile Video Coding (VVC) test model, 0.77%, 1.27% and 2.25% BD-rate savings can be achieved.
arXiv Detail & Related papers (2021-06-16T16:48:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.