Light-weight CNN-based VVC Inter Partitioning Acceleration
- URL: http://arxiv.org/abs/2312.10567v1
- Date: Sun, 17 Dec 2023 00:20:02 GMT
- Title: Light-weight CNN-based VVC Inter Partitioning Acceleration
- Authors: Yiqun Liu, Mohsen Abdoli, Thomas Guionnet, Christine Guillemot, Aline Roumy
- Abstract summary: The Versatile Video Coding (VVC) standard has been finalized by the Joint Video Exploration Team (JVET) in 2020.
VVC offers about 50% compression efficiency gain, in terms of Bjontegaard Delta-Rate (BD-rate).
We propose a Convolutional Neural Network (CNN)-based method to speed up inter partitioning in VVC.
- Score: 28.62405283825515
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Versatile Video Coding (VVC) standard has been finalized by Joint Video
Exploration Team (JVET) in 2020. Compared to the High Efficiency Video Coding
(HEVC) standard, VVC offers about 50% compression efficiency gain, in terms of
Bjontegaard Delta-Rate (BD-rate), at the cost of about 10x more encoder
complexity. In this paper, we propose a Convolutional Neural Network
(CNN)-based method to speed up inter partitioning in VVC. Our method operates
at the Coding Tree Unit (CTU) level, by splitting each CTU into a fixed grid of
8x8 blocks. Then each cell in this grid is associated with information about
the partitioning depth within that area. A lightweight network for predicting
this grid is employed during the rate-distortion optimization to limit the
Quaternary Tree (QT)-split search and avoid partitions that are unlikely to be
selected. Experiments show that the proposed method can achieve acceleration
ranging from 17% to 30% in the RandomAccess Group Of Picture 32 (RAGOP32) mode
of VVC Test Model (VTM)10 with a reasonable efficiency drop ranging from 0.37%
to 1.18% in terms of BD-rate increase.
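The grid-based pruning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 128x128 CTU size, the function names `make_depth_grid` and `allow_qt_split`, and the rule "test the QT split only if some cell inside the block predicts a deeper QT depth" are assumptions made for illustration. In the paper the grid is predicted by a CNN, whereas here it is filled in by hand.

```python
import numpy as np

CTU_SIZE = 128  # assumption: CTU size is not stated in the abstract
CELL = 8        # grid cell size given in the abstract (8x8 blocks)


def make_depth_grid(ctu_size=CTU_SIZE, cell=CELL):
    """Return an all-zero grid with one QT-depth entry per 8x8 cell."""
    n = ctu_size // cell
    return np.zeros((n, n), dtype=np.int8)


def allow_qt_split(depth_grid, x, y, size, qt_depth, cell=CELL):
    """Decide whether the encoder should still test a QT split.

    (x, y) is the top-left corner of the block inside the CTU, in pixels,
    size is its width/height, and qt_depth is its current QT depth.
    Hypothetical rule: keep the split only if at least one covered cell
    predicts a QT depth larger than the current one.
    """
    r0, c0 = y // cell, x // cell
    n = max(size // cell, 1)
    region = depth_grid[r0:r0 + n, c0:c0 + n]
    return bool(region.max() > qt_depth)


# Toy grid: the top-left 64x64 quadrant is predicted to reach QT depth 2,
# the rest of the CTU is predicted to stay unsplit (depth 0).
grid = make_depth_grid()
grid[:8, :8] = 2

print(allow_qt_split(grid, 0, 0, 128, 0))  # whole CTU at depth 0
print(allow_qt_split(grid, 0, 0, 32, 2))   # 32x32 block already at depth 2
print(allow_qt_split(grid, 64, 0, 64, 1))  # top-right 64x64 quadrant
```

Skipping the rate-distortion check whenever `allow_qt_split` returns `False` is what would save encoding time in such a scheme; how aggressively splits are pruned determines the trade-off between acceleration and BD-rate loss reported above.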
Related papers
- Partition Map-Based Fast Block Partitioning for VVC Inter Coding [37.60581844783291]
We propose a partition map-based algorithm to pursue fast block partitioning in inter coding.
Based on our previous work on partition map-based methods for intra coding, we analyze the characteristics of VVC inter coding.
We present a dual-threshold decision scheme to achieve a fine-grained trade-off between complexity reduction and rate-distortion (RD) performance loss.
arXiv Detail & Related papers (2025-04-25T14:53:03Z)
- Towards Practical Real-Time Neural Video Compression [60.390180067626396]
We introduce a practical real-time neural video codec (NVC) designed to deliver high compression ratio, low latency and broad versatility.
Experiments show our proposed DCVC-RT achieves an impressive average encoding/decoding speed of 125.2/112.8 fps (frames per second) for 1080p video, while saving an average of 21% in bitrate compared to H.266/VTM.
arXiv Detail & Related papers (2025-02-28T06:32:23Z)
- Channel-Aware Throughput Maximization for Cooperative Data Fusion in CAV [17.703608985129026]
Connected and autonomous vehicles (CAVs) have garnered significant attention due to their extended perception range and enhanced sensing coverage.
To address challenges such as blind spots and obstructions, CAVs employ vehicle-to-vehicle communications to aggregate data from surrounding vehicles.
We propose a channel-aware throughput maximization approach to facilitate CAV data fusion, leveraging a self-supervised autoencoder for adaptive data compression.
arXiv Detail & Related papers (2024-10-06T00:43:46Z)
- Object Segmentation-Assisted Inter Prediction for Versatile Video Coding [53.91821712591901]
We propose an object segmentation-assisted inter prediction method (SAIP), where objects in the reference frames are segmented by some advanced technologies.
With a proper indication, the object segmentation mask is translated from the reference frame to the current frame as the arbitrary-shaped partition of different regions.
We show that the proposed method achieves up to 1.98%, 1.14%, 0.79%, and on average 0.82%, 0.49%, 0.37% BD-rate reduction for common test sequences.
arXiv Detail & Related papers (2024-03-18T11:48:20Z)
- CNN-based Prediction of Partition Path for VVC Fast Inter Partitioning Using Motion Fields [28.294065058301932]
The Versatile Video Coding (VVC) standard has been recently finalized by the Joint Video Exploration Team (JVET).
VVC offers about 50% compression efficiency gain, at the cost of a 10-fold increase in encoding complexity.
We propose a method based on Convolutional Neural Network (CNN) to speed up the inter partitioning process in VVC.
arXiv Detail & Related papers (2023-10-20T22:26:49Z)
- Leveraging progressive model and overfitting for efficient learned image compression [14.937446839215868]
We introduce a powerful and flexible LIC framework with multi-scale progressive (MSP) probability model and latent representation overfitting (LOF) technique.
With different predefined profiles, the proposed framework can achieve various balance points between compression efficiency and computational complexity.
Experiments show that the proposed framework achieves 2.5%, 1.0%, and 1.3% Bjontegaard delta bit rate (BD-rate) reduction over the VVC/H.266 standard.
arXiv Detail & Related papers (2022-10-08T21:54:58Z)
- Lightweight and Progressively-Scalable Networks for Semantic Segmentation [100.63114424262234]
Multi-scale learning frameworks have been regarded as a capable class of models to boost semantic segmentation.
In this paper, we thoroughly analyze the design of convolutional blocks and the ways of interactions across multiple scales.
We devise Lightweight and Progressively-Scalable Networks (LPS-Net) that novelly expands the network complexity in a greedy manner.
arXiv Detail & Related papers (2022-07-27T16:00:28Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- Distortion-Aware Loop Filtering of Intra 360° Video Coding with Equirectangular Projection [81.63407194858854]
We propose a distortion-aware loop filtering model to improve the performance of intra coding for 360° videos projected via equirectangular projection (ERP) format.
Our proposed module analyzes content characteristics based on a coding unit (CU) partition mask and processes them through partial convolution to activate the specified area.
arXiv Detail & Related papers (2022-02-20T12:00:18Z)
- Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of Autonomous Driving [2.6543018470131283]
We propose an efficient keypoints-based deep feature fusion framework, called FPV-RCNN, for collective perception.
Compared to a bird's-eye view (BEV) keypoints feature fusion, FPV-RCNN achieves improved detection accuracy by about 14%.
Our method also significantly decreases the CPM size to less than 0.3KB, which is about 50 times smaller than the BEV feature map sharing used in previous works.
arXiv Detail & Related papers (2021-09-23T19:41:02Z)
- CT-Net: Channel Tensorization Network for Video Classification [48.4482794950675]
3D convolution is powerful for video classification but often computationally expensive.
Most approaches fail to achieve a preferable balance between convolutional efficiency and feature-interaction sufficiency.
We propose a concise and novel Channel Tensorization Network (CT-Net).
Our CT-Net outperforms a number of recent SOTA approaches, in terms of accuracy and/or efficiency.
arXiv Detail & Related papers (2021-06-03T05:35:43Z)
- DeepCompress: Efficient Point Cloud Geometry Compression [1.808877001896346]
We propose a more efficient deep learning-based encoder architecture for point cloud compression.
We show that incorporating the learned activation function from Computational Efficient Neural Image Compression (CENIC) yields dramatic gains in efficiency and performance.
Our proposed modifications outperform the baseline approaches by a small margin in terms of Bjontegaard delta rate and PSNR values.
arXiv Detail & Related papers (2021-06-02T23:18:11Z)
- ELF-VC: Efficient Learned Flexible-Rate Video Coding [61.10102916737163]
We propose several novel ideas for learned video compression which allow for improved performance for the low-latency mode.
We benchmark our method, which we call ELF-VC, on popular video test sets UVG and MCL-JCV.
Our approach runs at least 5x faster and has fewer parameters than all ML codecs which report these figures.
arXiv Detail & Related papers (2021-04-29T17:50:35Z)