Related papers: Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation

URL: http://arxiv.org/abs/2401.04405v1
Date: Tue, 9 Jan 2024 08:01:47 GMT
Title: Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation
Authors: Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang
Abstract summary: We demonstrate that content-optimized bitrate ladders can be efficiently determined without any pre-encoding. Our method well approximates the ground-truth bitrate-resolution pairs with a slight Bjøntegaard Delta rate loss of 1.21%.
Score: 9.332104035349932
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adaptive video streaming requires efficient bitrate ladder construction to meet heterogeneous network conditions and end-user demands. Per-title optimized encoding typically traverses numerous encoding parameters to search the Pareto-optimal operating points for each video. Recently, researchers have attempted to predict the content-optimized bitrate ladder for pre-encoding overhead reduction. However, existing methods commonly estimate the encoding parameters on the Pareto front and still require subsequent pre-encodings. In this paper, we propose to directly predict the optimal transcoding resolution at each preset bitrate for efficient bitrate ladder construction. We adopt a Temporal Attentive Gated Recurrent Network to capture spatial-temporal features and predict transcoding resolutions as a multi-task classification problem. We demonstrate that content-optimized bitrate ladders can thus be efficiently determined without any pre-encoding. Our method well approximates the ground-truth bitrate-resolution pairs with a slight Bjøntegaard Delta rate loss of 1.21% and significantly outperforms the state-of-the-art fixed ladder.
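
Illustrative sketch (not from the paper): the abstract frames ladder construction as a multi-task classification problem, with one resolution choice per preset bitrate. Assuming a model has already produced one score per candidate resolution at each preset bitrate, the ladder could be read off with a per-bitrate argmax, roughly as below. The bitrate list, the resolution list, the example scores, and the helper names are all hypothetical placeholders, and the network itself is not modeled here.

```python
# Hypothetical preset bitrates (kbps) and candidate resolutions.
# These values are assumptions for illustration, not taken from the paper.
PRESET_BITRATES_KBPS = [500, 1000, 2000, 4000]
CANDIDATE_RESOLUTIONS = ["540p", "720p", "1080p"]


def argmax(scores):
    """Return the index of the largest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def build_ladder(per_bitrate_scores):
    """Turn per-bitrate classification scores into a bitrate ladder.

    per_bitrate_scores[i][j] is the (assumed) classifier score for
    CANDIDATE_RESOLUTIONS[j] at PRESET_BITRATES_KBPS[i]; each row plays
    the role of one classification task.
    """
    if len(per_bitrate_scores) != len(PRESET_BITRATES_KBPS):
        raise ValueError("expected one score row per preset bitrate")
    ladder = []
    for bitrate, scores in zip(PRESET_BITRATES_KBPS, per_bitrate_scores):
        if len(scores) != len(CANDIDATE_RESOLUTIONS):
            raise ValueError("expected one score per candidate resolution")
        ladder.append((bitrate, CANDIDATE_RESOLUTIONS[argmax(scores)]))
    return ladder


# Made-up scores standing in for a model's outputs on one video.
example_scores = [
    [2.1, 0.3, -1.0],
    [1.0, 1.4, 0.2],
    [0.1, 0.9, 1.7],
    [-0.5, 0.4, 2.2],
]
ladder = build_ladder(example_scores)
for bitrate, resolution in ladder:
    print(f"{bitrate} kbps -> {resolution}")
```

Because every rung is chosen directly from the scores, no trial encodes are needed at inference time, which is the property the abstract emphasizes.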

Related papers

Leveraging Compression to Construct Transferable Bitrate Ladders [25.158228645127036]
We present a new machine learning-based ladder construction technique that accurately predicts the VMAF scores of compressed videos. We evaluate the performance of our proposed framework against leading prior methods on a large corpus of videos.
arXiv Detail & Related papers (2025-12-15T03:38:26Z)
Adaptive High-Frequency Preprocessing for Video Coding [9.492217153689428]
High-frequency components are crucial for maintaining video clarity and realism, but they also significantly impact coding, resulting in increased bandwidth and storage costs. This paper presents an end-to-end learning-based framework for adaptive high-frequency preprocessing to enhance subjective quality and save bitrate in video coding.
arXiv Detail & Related papers (2025-08-12T11:16:02Z)
RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression [68.31184784672227]
In modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems performing tasks. It is therefore useful to optimize the encoder for a downstream task instead of for image quality. Here, we address this challenge by controlling the Quantization Parameters (QPs) at the macro-block level to optimize the downstream task.
arXiv Detail & Related papers (2025-01-21T15:36:08Z)
Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference. We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z)
A Parametric Rate-Distortion Model for Video Transcoding [7.1741986121107235]
We introduce a parametric rate-distortion (R-D) transcoder model. Our model excels at predicting distortion at various rates without the need for encoding the video. It can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing.
arXiv Detail & Related papers (2024-04-13T15:37:57Z)
AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution [53.23803932357899]
We introduce the first on-the-fly adaptive quantization framework that accelerates the processing time from hours to seconds. We achieve competitive performance with the previous adaptive quantization methods, while the processing time is accelerated by x2000.
arXiv Detail & Related papers (2024-04-04T08:37:27Z)
Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features [12.631821085716853]
We propose an efficient ladder prediction method using transfer learning and spatio-temporal features. The method, tested on 102 video scenes, demonstrates a 94.1% reduction in complexity versus brute-force at a 1.71% BD-Rate expense.
arXiv Detail & Related papers (2024-01-06T11:37:20Z)
Towards Real-Time Neural Video Codec for Cross-Platform Application Using Calibration Information [17.141950680993617]
Cross-platform computational errors resulting from floating point operations can lead to inaccurate decoding of the bitstream. The high computational complexity of the encoding and decoding process poses a challenge in achieving real-time performance. A real-time cross-platform neural video codec is capable of efficiently decoding 720P video bitstreams from other encoding platforms on a consumer-grade GPU.
arXiv Detail & Related papers (2023-09-20T13:01:15Z)
Dynamic Low-Rank Instance Adaptation for Universal Neural Image Compression [33.92792778925365]
We propose a low-rank adaptation approach to address the rate-distortion drop observed in out-of-domain datasets. Our proposed method exhibits universality across diverse image datasets.
arXiv Detail & Related papers (2023-08-15T12:17:46Z)
Convex Hull Prediction for Adaptive Video Streaming by Recurrent Learning [38.574550778712236]
We propose a deep learning based method of content aware convex hull prediction. We employ a recurrent convolutional network (RCN) to implicitly analyze the complexity of video shots in order to predict their convex hulls. Our proposed model yields better approximations of the optimal convex hulls, and offers competitive time savings as compared to existing approaches.
arXiv Detail & Related papers (2022-06-10T05:11:02Z)
Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize Versatile Video Coding (VVC) complexity at intra-frame prediction, with a two-stage framework of deep feature fusion and probability estimation. Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation [81.87943324048756]
In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy. Existing methods rely on optical flow regularization or fine-tuning with test data to attain temporal consistency. This paper presents an efficient, intuitive, and unsupervised online adaptation method, AuxAdapt, for improving the temporal consistency of most neural network models.
arXiv Detail & Related papers (2021-10-24T07:07:41Z)
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization [96.73647162960842]
Temporal action localization (TAL) is a fundamental yet challenging task in video understanding. Existing TAL methods rely on pre-training a video encoder through action classification supervision. We introduce a novel low-fidelity end-to-end (LoFi) video encoder pre-training method.
arXiv Detail & Related papers (2021-03-28T22:18:14Z)
Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system. Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.