Related papers: CNN-based Prediction of Partition Path for VVC Fast Inter Partitioning Using Motion Fields

URL: http://arxiv.org/abs/2310.13838v1
Date: Fri, 20 Oct 2023 22:26:49 GMT
Title: CNN-based Prediction of Partition Path for VVC Fast Inter Partitioning Using Motion Fields
Authors: Yiqun Liu, Marc Riviere, Thomas Guionnet, Aline Roumy, Christine Guillemot
Abstract summary: The Versatile Video Coding (VVC) standard has been recently finalized by the Joint Video Exploration Team (JVET). VVC offers about 50% compression efficiency gain, at the cost of a 10-fold increase in encoding complexity. We propose a method based on Convolutional Neural Network (CNN) to speed up the inter partitioning process in VVC.
Score: 28.294065058301932
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: The Versatile Video Coding (VVC) standard has been recently finalized by the Joint Video Exploration Team (JVET). Compared to the High Efficiency Video Coding (HEVC) standard, VVC offers about 50% compression efficiency gain, in terms of Bjontegaard Delta-Rate (BD-rate), at the cost of a 10-fold increase in encoding complexity. In this paper, we propose a method based on Convolutional Neural Network (CNN) to speed up the inter partitioning process in VVC. Firstly, a novel representation for the quadtree with nested multi-type tree (QTMT) partition is introduced, derived from the partition path. Secondly, we develop a U-Net-based CNN taking a multi-scale motion vector field as input at the Coding Tree Unit (CTU) level. The purpose of CNN inference is to predict the optimal partition path during the Rate-Distortion Optimization (RDO) process. To achieve this, we divide CTU into grids and predict the Quaternary Tree (QT) depth and Multi-type Tree (MT) split decisions for each cell of the grid. Thirdly, an efficient partition pruning algorithm is introduced to employ the CNN predictions at each partitioning level to skip RDO evaluations of unnecessary partition paths. Finally, an adaptive threshold selection scheme is designed, making the trade-off between complexity and efficiency scalable. Experiments show that the proposed method can achieve acceleration ranging from 16.5% to 60.2% under the RandomAccess Group Of Picture 32 (RAGOP32) configuration with a reasonable efficiency drop ranging from 0.44% to 4.59% in terms of BD-rate, which surpasses other state-of-the-art solutions. Additionally, our method stands out as one of the lightest approaches in the field, which ensures its applicability to other encoders.
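
Illustrative sketch (not the authors' implementation): the abstract describes dividing the CTU into a grid, predicting a QT depth for each cell, and using those predictions with a threshold to skip RDO evaluations. The Python below shows one simple way such a rule could look. The 16x16 cell size, the ratio-based decision rule, and the names cells_covered and qt_candidates are all assumptions made for illustration; the paper's actual pruning algorithm and threshold scheme may differ.

```python
CTU_SIZE = 128   # CTU side in luma samples (the maximum VVC CTU size)
CELL_SIZE = 16   # ASSUMED grid cell size, chosen only for this sketch


def cells_covered(x, y, size):
    """Return (row, col) indices of the grid cells overlapped by a square CU."""
    first_col, first_row = x // CELL_SIZE, y // CELL_SIZE
    last_col = (x + size - 1) // CELL_SIZE
    last_row = (y + size - 1) // CELL_SIZE
    return [(r, c)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


def qt_candidates(depth_map, x, y, size, depth, threshold):
    """Decide which QT options the encoder should still evaluate with RDO.

    depth_map : 2-D list of predicted QT depths, one entry per grid cell
    depth     : QT depth of the current CU
    threshold : value in (0.5, 1]; higher values prune less aggressively
    (HYPOTHETICAL rule: compare the share of cells predicted to be deeper.)
    """
    cells = cells_covered(x, y, size)
    deeper = sum(1 for r, c in cells if depth_map[r][c] > depth)
    ratio = deeper / len(cells)
    if ratio >= threshold:
        return {"qt_split"}        # most cells want more depth: skip "no split"
    if ratio <= 1 - threshold:
        return {"no_qt_split"}     # most cells are done: skip the QT split
    return {"qt_split", "no_qt_split"}  # uncertain: let RDO test both


# Example: only the top-left 64x64 quadrant is predicted to reach QT depth 2.
cells_per_side = CTU_SIZE // CELL_SIZE
depth_map = [[2 if r < 4 and c < 4 else 0 for c in range(cells_per_side)]
             for r in range(cells_per_side)]

whole_ctu = qt_candidates(depth_map, 0, 0, 128, depth=0, threshold=0.8)
top_left = qt_candidates(depth_map, 0, 0, 64, depth=1, threshold=0.8)
bottom_right = qt_candidates(depth_map, 64, 64, 64, depth=1, threshold=0.8)
```

In this sketch, raising the threshold toward 1 keeps more candidates for RDO (less speed-up, smaller coding loss), which mirrors the scalable complexity/efficiency trade-off the abstract mentions.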

Related papers

Partition Map-Based Fast Block Partitioning for VVC Inter Coding [37.60581844783291]
We propose a partition map-based algorithm to pursue fast block partitioning in inter coding. Based on our previous work on partition map-based methods for intra coding, we analyze the characteristics of VVC inter coding. We present a dual-threshold decision scheme to achieve a fine-grained trade-off between complexity reduction and rate-distortion (RD) performance loss.
arXiv Detail & Related papers (2025-04-25T14:53:03Z)
Advanced Learning-Based Inter Prediction for Future Video Coding [46.4999280984859]
The paper proposes a low-complexity learning-based inter prediction (LLIP) method to replace the traditional INTERPF. LLIP enhances the filtering process by leveraging a lightweight neural network model, where parameters can be exported for efficient inference. Ultimately, we replace the traditional handcrafted filtering parameters in INTERPF with the learned optimal filtering parameters.
arXiv Detail & Related papers (2024-11-24T08:47:00Z)
Object Segmentation-Assisted Inter Prediction for Versatile Video Coding [53.91821712591901]
We propose an object segmentation-assisted inter prediction method (SAIP), where objects in the reference frames are segmented by some advanced technologies. With a proper indication, the object segmentation mask is translated from the reference frame to the current frame as the arbitrary-shaped partition of different regions. We show that the proposed method achieves up to 1.98%, 1.14%, 0.79%, and on average 0.82%, 0.49%, 0.37% BD-rate reduction for common test sequences.
arXiv Detail & Related papers (2024-03-18T11:48:20Z)
Light-weight CNN-based VVC Inter Partitioning Acceleration [28.62405283825515]
The Versatile Video Coding (VVC) standard was finalized by the Joint Video Exploration Team (JVET) in 2020. VVC offers about 50% compression efficiency gain, in terms of Bjontegaard Delta-Rate (BD-rate). We propose a Convolutional Neural Network (CNN)-based method to speed up inter partitioning in VVC.
arXiv Detail & Related papers (2023-12-17T00:20:02Z)
Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation. In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales. We devise Lightweight and Progressively-Scalable Networks (LPS-Net) that novelly expands the network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network. We propose a novel collaborative edge computing framework using fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation. Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimension parameter models and large-scale mathematical calculation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations. Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices a part of network parameters for inputs with diverse difficulty levels. We present dynamic slimmable network (DS-Net) and dynamic slice-able network (DS-Net++) by input-dependently adjusting filter numbers of CNNs and multiple dimensions in both CNNs and transformers.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have become deeper and wider in recent years. Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable. We propose a novel automatic channel pruning method (ACP). ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
I/O Lower Bounds for Auto-tuning of Convolutions in CNNs [2.571796445061562]
We develop a general I/O lower bound theory for a composite algorithm which consists of several different sub-computations. We design near I/O-optimal dataflow strategies for the two main convolution algorithms by fully exploiting data reuse. Experimental results show that our dataflow strategies with the auto-tuning approach can achieve about 3.32x performance speedup on average over cuDNN.
arXiv Detail & Related papers (2020-12-31T15:46:01Z)
