Related papers: Leveraging Compression to Construct Transferable Bitrate Ladders

URL: http://arxiv.org/abs/2512.12952v1
Date: Mon, 15 Dec 2025 03:38:26 GMT
Title: Leveraging Compression to Construct Transferable Bitrate Ladders
Authors: Krishna Srikar Durbha, Hassene Tmar, Ping-Hao Wu, Ioannis Katsavounidis, Alan C. Bovik
Abstract summary: We present a new machine learning-based ladder construction technique that accurately predicts the VMAF scores of compressed videos. We evaluate the performance of our proposed framework against leading prior methods on a large corpus of videos.
Score: 25.158228645127036
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Over the past few years, per-title and per-shot video encoding techniques have demonstrated significant gains as compared to conventional techniques such as constant CRF encoding and the fixed bitrate ladder. These techniques have demonstrated that constructing content-gnostic per-shot bitrate ladders can provide significant bitrate gains and improved Quality of Experience (QoE) for viewers under various network conditions. However, constructing a convex hull for every video incurs a significant computational overhead. Recently, machine learning-based bitrate ladder construction techniques have emerged as a substitute for convex hull construction. These methods operate by extracting features from source videos to train machine learning (ML) models to construct content-adaptive bitrate ladders. Here, we present a new ML-based bitrate ladder construction technique that accurately predicts the VMAF scores of compressed videos, by analyzing the compression procedure and by making perceptually relevant measurements on the source videos prior to compression. We evaluate the performance of our proposed framework against leading prior methods on a large corpus of videos. Since training ML models on every encoder setting is time-consuming, we also investigate how per-shot bitrate ladders perform under different encoding settings. We evaluate the performance of all models against the fixed bitrate ladder and the best possible convex hull constructed using exhaustive encoding with Bjontegaard-delta metrics.
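
The exhaustive convex hull that the abstract uses as its reference can be illustrated with a minimal Python sketch. This is not the paper's method or code: it assumes each trial encode of a shot has already been reduced to a hypothetical (bitrate in kbps, VMAF) pair, and the function name and sample numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # hypothetical (bitrate_kbps, vmaf) of one trial encode


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rate_quality_hull(encodes: List[Point]) -> List[Point]:
    """Return the upper convex hull of rate-quality points, cut at peak quality."""
    # Keep only the best quality seen at each distinct bitrate.
    best: Dict[float, float] = {}
    for rate, quality in encodes:
        if rate not in best or quality > best[rate]:
            best[rate] = quality
    points = sorted(best.items())

    # Monotone-chain scan, left to right, keeping only clockwise turns.
    hull: List[Point] = []
    for p in points:
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    if not hull:
        return []

    # Points past the quality peak cost more bits for lower quality; drop them.
    peak = max(range(len(hull)), key=lambda i: hull[i][1])
    return hull[: peak + 1]


# Illustrative numbers only, not measurements from the paper.
trial_encodes = [(500, 60), (1000, 80), (1200, 70), (1500, 82),
                 (2000, 90), (3000, 95), (4000, 94)]
ladder_candidates = rate_quality_hull(trial_encodes)
```

In this sketch the hull points are the candidates from which ladder rungs would be chosen. Producing them requires every trial encode to be run first, which is the computational overhead that ML-based prediction aims to avoid.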

Related papers

ProGIC: Progressive and Lightweight Generative Image Compression with Residual Vector Quantization [59.481950697968706]
We propose Progressive Generative Image Compression (ProGIC), a compact codec built on residual vector quantization (RVQ). In RVQ, a sequence of vector quantizers encodes the residuals stage by stage, each with its own codebook. We pair this with a lightweight backbone based on depthwise-separable convolutions and small attention blocks, enabling practical deployment on both GPU and CPU-only devices.
arXiv Detail & Related papers (2026-03-03T11:47:05Z)
SCALED: Surrogate-gradient for Codec-Aware Learning of Downsampling in ABR Streaming [9.436544348188598]
Over-the-Top (OTT) delivery now predominantly relies on Adaptive Bitrate (ABR) streaming. Deep learning has spurred interest in jointly optimizing the ABR pipeline using learned resampling methods. We introduce a novel framework that enables end-to-end training with real, non-differentiable codecs.
arXiv Detail & Related papers (2026-01-30T10:38:35Z)
Plug-and-Play Versatile Compressed Video Enhancement [57.62582951699999]
Video compression effectively reduces the size of files, making it possible for real-time cloud computing. However, it comes at the cost of visual quality and challenges the robustness of downstream vision models. We present a versatile-aware enhancement framework that adaptively enhances videos under different compression settings.
arXiv Detail & Related papers (2025-04-21T18:39:31Z)
Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis. We propose a Compression Distortion Representation Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models. Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in terms of execution time and number of parameters.
arXiv Detail & Related papers (2025-03-27T13:01:53Z)
High-Efficiency Neural Video Compression via Hierarchical Predictive Learning [27.41398149573729]
Enhanced Deep Hierarchical Video Compression (DHVC 2.0) introduces superior compression performance and impressive complexity efficiency. It uses hierarchical predictive coding to transform each video frame into multiscale representations. It supports transmission-friendly progressive decoding, making it particularly advantageous for networked video applications in the presence of packet loss.
arXiv Detail & Related papers (2024-10-03T15:40:58Z)
Optimal Transcoding Resolution Prediction for Efficient Per-Title Bitrate Ladder Estimation [9.332104035349932]
We demonstrate that content-optimized features and ladders can be efficiently determined without any pre-encoding. Our method closely approximates the ground-truth bitrate-resolution pairs with a slight Bjontegaard Delta rate loss of 1.21%.
arXiv Detail & Related papers (2024-01-09T08:01:47Z)
Exploring Long- and Short-Range Temporal Information for Learned Video Compression [54.91301930491466]
We focus on exploiting the unique characteristics of video content and exploring temporal information to enhance compression performance. For long-range temporal information exploitation, we propose a temporal prior that can update continuously within the group of pictures (GOP) during inference. In that case, the temporal prior contains valuable temporal information of all decoded images within the current GOP. In detail, we design a hierarchical structure to achieve multi-scale compensation.
arXiv Detail & Related papers (2022-08-07T15:57:18Z)
Convex Hull Prediction for Adaptive Video Streaming by Recurrent Learning [38.574550778712236]
We propose a deep learning-based method of content-aware convex hull prediction. We employ a recurrent convolutional network (RCN) to implicitly analyze the complexity of video shots in order to predict their convex hulls. Our proposed model yields better approximations of the optimal convex hulls, and offers competitive time savings as compared to existing approaches.
arXiv Detail & Related papers (2022-06-10T05:11:02Z)
A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs). The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved. We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z)
Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system. Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame. Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.