Optimal Transcoding Resolution Prediction for Efficient Per-Title
Bitrate Ladder Estimation
- URL: http://arxiv.org/abs/2401.04405v1
- Date: Tue, 9 Jan 2024 08:01:47 GMT
- Title: Optimal Transcoding Resolution Prediction for Efficient Per-Title
Bitrate Ladder Estimation
- Authors: Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang
- Abstract summary: We demonstrate that content-optimized bitrate ladders can be efficiently determined without any pre-encoding.
Our method closely approximates the ground-truth bitrate-resolution pairs with a slight Bjøntegaard Delta rate loss of 1.21%.
- Score: 9.332104035349932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adaptive video streaming requires efficient bitrate ladder construction to
meet heterogeneous network conditions and end-user demands. Per-title optimized
encoding typically traverses numerous encoding parameters to search the
Pareto-optimal operating points for each video. Recently, researchers have
attempted to predict the content-optimized bitrate ladder for pre-encoding
overhead reduction. However, existing methods commonly estimate the encoding
parameters on the Pareto front and still require subsequent pre-encodings. In
this paper, we propose to directly predict the optimal transcoding resolution
at each preset bitrate for efficient bitrate ladder construction. We adopt a
Temporal Attentive Gated Recurrent Network to capture spatial-temporal features
and predict transcoding resolutions as a multi-task classification problem. We
demonstrate that content-optimized bitrate ladders can thus be efficiently
determined without any pre-encoding. Our method closely approximates the
ground-truth bitrate-resolution pairs with a slight Bjøntegaard Delta rate
loss of 1.21% and significantly outperforms the state-of-the-art fixed ladder.
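The abstract's framing (one classification task per preset bitrate, where the class is the transcoding resolution) can be illustrated with a minimal sketch. This is not the authors' implementation: the resolution set, the preset bitrates, the quality scores, and the function names `ground_truth_labels` and `ladder_from_scores` are all hypothetical, and the quality metric is assumed to be any "higher is better" score obtained from exhaustive trial encodes.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate resolutions (frame heights) and preset bitrates (kbps).
RESOLUTIONS: List[int] = [360, 540, 720, 1080]
PRESET_BITRATES: List[int] = [500, 1500, 4000]

# Made-up quality scores for one video, keyed by (bitrate_kbps, height).
# In a brute-force per-title pipeline these would come from trial encodes.
TOY_QUALITY: Dict[Tuple[int, int], float] = {
    (500, 360): 62.0, (500, 540): 58.0, (500, 720): 50.0, (500, 1080): 35.0,
    (1500, 360): 70.0, (1500, 540): 78.0, (1500, 720): 76.0, (1500, 1080): 65.0,
    (4000, 360): 72.0, (4000, 540): 84.0, (4000, 720): 90.0, (4000, 1080): 93.0,
}


def ground_truth_labels(quality: Dict[Tuple[int, int], float]) -> Dict[int, int]:
    """For each preset bitrate, return the index of the best-quality resolution.

    Each bitrate is one classification task; the label is a class index
    into RESOLUTIONS. Ties resolve to the lowest resolution index.
    """
    labels: Dict[int, int] = {}
    for bitrate in PRESET_BITRATES:
        labels[bitrate] = max(
            range(len(RESOLUTIONS)),
            key=lambda i: quality[(bitrate, RESOLUTIONS[i])],
        )
    return labels


def ladder_from_scores(scores: Dict[int, List[float]]) -> List[Tuple[int, int]]:
    """Turn per-bitrate class scores (one output head per bitrate) into a ladder.

    `scores[bitrate]` holds one score per entry of RESOLUTIONS, e.g. the
    logits of a classifier head; the arg-max class is selected.
    """
    ladder: List[Tuple[int, int]] = []
    for bitrate in PRESET_BITRATES:
        head = scores[bitrate]
        best = max(range(len(head)), key=head.__getitem__)
        ladder.append((bitrate, RESOLUTIONS[best]))
    return ladder
```

With a trained predictor supplying the per-bitrate scores, `ladder_from_scores` would yield a bitrate-resolution ladder without running any trial encode, while `ground_truth_labels` shows how the training targets could be derived from an exhaustive search.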
Related papers
- Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference.
We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z)
- A Parametric Rate-Distortion Model for Video Transcoding [7.1741986121107235]
We introduce a parametric rate-distortion (R-D) transcoder model.
Our model excels at predicting distortion at various rates without the need for encoding the video.
It can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing.
arXiv Detail & Related papers (2024-04-13T15:37:57Z)
- Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features [12.631821085716853]
We propose an efficient bitrate ladder prediction method using transfer learning and spatio-temporal features.
Tested on 102 video scenes, the method demonstrates a 94.1% reduction in complexity versus brute-force search at a 1.71% BD-Rate expense.
arXiv Detail & Related papers (2024-01-06T11:37:20Z)
- Towards Real-Time Neural Video Codec for Cross-Platform Application
Using Calibration Information [17.141950680993617]
Cross-platform computational errors resulting from floating point operations can lead to inaccurate decoding of the bitstream.
The high computational complexity of the encoding and decoding process poses a challenge in achieving real-time performance.
A real-time cross-platform neural video codec is capable of efficiently decoding 720P video bitstreams from other encoding platforms on a consumer-grade GPU.
arXiv Detail & Related papers (2023-09-20T13:01:15Z)
- Dynamic Low-Rank Instance Adaptation for Universal Neural Image
Compression [33.92792778925365]
We propose a low-rank adaptation approach to address the rate-distortion drop observed in out-of-domain datasets.
Our proposed method exhibits universality across diverse image datasets.
arXiv Detail & Related papers (2023-08-15T12:17:46Z)
- Efficient Per-Shot Convex Hull Prediction By Recurrent Learning [50.94452824380868]
We propose a deep learning based method of content aware convex hull prediction.
We employ a recurrent convolutional network (RCN) to implicitly analyze the complexity of video shots in order to predict their convex hulls.
Our experimental results reveal that our proposed model yields better approximations of the optimal convex hulls and offers competitive time savings compared to existing approaches.
arXiv Detail & Related papers (2022-06-10T05:11:02Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and
Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally
Consistent Video Semantic Segmentation [81.87943324048756]
In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy.
Existing methods rely on optical flow regularization or fine-tuning with test data to attain temporal consistency.
This paper presents an efficient, intuitive, and unsupervised online adaptation method, AuxAdapt, for improving the temporal consistency of most neural network models.
arXiv Detail & Related papers (2021-10-24T07:07:41Z)
- Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action
Localization [96.73647162960842]
Temporal action localization (TAL) is a fundamental yet challenging task in video understanding.
Existing TAL methods rely on pre-training a video encoder through action classification supervision.
We introduce a novel low-fidelity end-to-end (LoFi) video encoder pre-training method.
arXiv Detail & Related papers (2021-03-28T22:18:14Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)