Deep Learning-Based Real-Time Quality Control of Standard Video Compression for Live Streaming
- URL: http://arxiv.org/abs/2311.12918v1
- Date: Tue, 21 Nov 2023 18:28:35 GMT
- Title: Deep Learning-Based Real-Time Quality Control of Standard Video Compression for Live Streaming
- Authors: Matin Mortaheb, Mohammad A. Amir Khojastepour, Srimat T. Chakradhar,
Sennur Ulukus
- Abstract summary: A real-time deep learning-based H.264 controller is proposed.
It estimates optimal encoder parameters based on the content of a video chunk with minimal delay.
It achieves improvements of up to 2.5 times in average bandwidth usage.
- Score: 31.285983939625098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensuring high-quality video content for wireless users has become
increasingly vital. Nevertheless, maintaining a consistent level of video
quality faces challenges due to the fluctuating encoded bitrate, primarily
caused by dynamic video content, especially in live streaming scenarios. Video
compression is typically employed to eliminate unnecessary redundancies within
and between video frames, thereby reducing the required bandwidth for video
transmission. The encoded bitrate and the quality of the compressed video
depend on encoder parameters, specifically, the quantization parameter (QP).
Poor choices of encoder parameters can result in reduced bandwidth efficiency
and high likelihood of non-conformance. Non-conformance refers to the violation
of the peak signal-to-noise ratio (PSNR) constraint for an encoded video
segment. To address these issues, a real-time deep learning-based H.264
controller is proposed. This controller dynamically estimates the optimal
encoder parameters based on the content of a video chunk with minimal delay.
The objective is to maintain video quality in terms of PSNR above a specified
threshold while minimizing the average bitrate of the compressed video.
Experimental results, conducted on both the QCIF dataset and a diverse range of
random videos from public datasets, validate the effectiveness of this
approach. Notably, it achieves improvements of up to 2.5 times in average
bandwidth usage compared to the state-of-the-art adaptive bitrate video
streaming, with a negligible non-conformance probability below $10^{-2}$.
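The control objective described in the abstract (minimize bitrate subject to a PSNR constraint by choosing the quantization parameter) can be illustrated with a minimal sketch. The linear PSNR-vs-QP model below is a hypothetical stand-in for the paper's learned content-aware predictor, not its actual architecture; only the binary-search selection logic and the standard PSNR definition are general.

```python
import math

def psnr(ref, dist):
    """Peak signal-to-noise ratio between two equal-length 8-bit pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(255 ** 2 / mse)

def select_qp(predict_psnr, psnr_threshold, qp_range=(0, 51)):
    """Binary-search the largest QP (hence lowest bitrate) whose predicted
    PSNR still meets the threshold.  Assumes predict_psnr is non-increasing
    in QP, as is typical for H.264; predict_psnr stands in for the paper's
    content-aware deep model."""
    lo, hi = qp_range
    best = lo
    while lo <= hi:
        mid = (lo + hi) // 2
        if predict_psnr(mid) >= psnr_threshold:
            best = mid       # feasible: try a higher QP for a lower bitrate
            lo = mid + 1
        else:
            hi = mid - 1
    return best

# Hypothetical monotone model: PSNR falls roughly linearly with QP.
model = lambda qp: 50.0 - 0.45 * qp
print(select_qp(model, 38.0))  # prints 26: the largest QP with model(qp) >= 38
```

In a real controller the per-chunk model would be replaced by the network's prediction, re-run for each incoming video chunk so the QP tracks the content dynamics.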
Related papers
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- A Parametric Rate-Distortion Model for Video Transcoding [7.1741986121107235]
We introduce a parametric rate-distortion (R-D) transcoder model.
Our model excels at predicting distortion at various rates without the need for encoding the video.
It can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing.
arXiv Detail & Related papers (2024-04-13T15:37:57Z)
- NU-Class Net: A Novel Approach for Video Quality Enhancement [1.7763979745248648]
This paper introduces NU-Class Net, an innovative deep-learning model designed to mitigate compression artifacts stemming from lossy compression codecs.
By employing the NU-Class Net, the video encoder within the video-capturing node can reduce output quality, thereby generating low-bit-rate videos.
Experimental results affirm the efficacy of the proposed model in enhancing the perceptible quality of videos, especially those streamed at low bit rates.
arXiv Detail & Related papers (2024-01-02T11:46:42Z)
- Deep Learning-Based Real-Time Rate Control for Live Streaming on Wireless Networks [31.285983939625098]
Suboptimal selection of encoder parameters can lead to video quality loss due to bandwidth or introduction of artifacts due to packet loss.
A real-time deep learning based H.264 controller is proposed to dynamically estimate optimal encoder parameters with a negligible delay in real-time.
Remarkably, improvements of 10-20 dB in PSNR with respect to the state-of-the-art adaptive video streaming are achieved, with an average packet drop rate as low as 0.002.
arXiv Detail & Related papers (2023-09-27T17:53:35Z)
- Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
- Perceptual Quality Assessment of Face Video Compression: A Benchmark and An Effective Method [69.868145936998]
Generative coding approaches have been identified as promising alternatives with reasonable perceptual rate-distortion trade-offs.
The great diversity of distortion types in the spatial and temporal domains, ranging from traditional hybrid coding frameworks to generative models, presents grand challenges in compressed face video quality assessment (VQA).
We introduce the large-scale Compressed Face Video Quality Assessment (CFVQA) database, which is the first attempt to systematically understand the perceptual quality and diversified compression distortions in face videos.
arXiv Detail & Related papers (2023-04-14T11:26:09Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- Ultra-low bitrate video conferencing using deep image animation [7.263312285502382]
We propose a novel deep learning approach for ultra-low video compression for video conferencing applications.
We employ deep neural networks to encode motion information as keypoint displacement and reconstruct the video signal at the decoder side.
arXiv Detail & Related papers (2020-12-01T09:06:34Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
- Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement [164.7489982837475]
We propose a Hierarchical Learned Video Compression (HLVC) method with three hierarchical quality layers and a recurrent enhancement network.
In our HLVC approach, the hierarchical quality benefits the coding efficiency, since the high quality information facilitates the compression and enhancement of low quality frames at encoder and decoder sides.
arXiv Detail & Related papers (2020-03-04T09:31:37Z)
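The hierarchical-quality idea in the HLVC entry above (high-quality frames aiding the compression and enhancement of low-quality frames) can be sketched with a toy layer assignment. The GOP size and the boundary/midpoint pattern below are illustrative assumptions, not the paper's exact configuration.

```python
def assign_quality_layers(gop_size=10):
    """Assign each frame index in a GOP an illustrative quality layer:
    layer 1 (highest quality) at the GOP boundaries, layer 2 at the
    midpoint, layer 3 elsewhere.  The higher-quality frames then serve
    as references when compressing and enhancing the layer-3 frames."""
    layers = []
    for i in range(gop_size):
        if i == 0 or i == gop_size - 1:
            layers.append(1)   # boundary frames: highest quality
        elif i == gop_size // 2:
            layers.append(2)   # midpoint frame: medium quality
        else:
            layers.append(3)   # remaining frames: lowest quality
    return layers

print(assign_quality_layers())  # [1, 3, 3, 3, 3, 2, 3, 3, 3, 1]
```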
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.