A Parametric Rate-Distortion Model for Video Transcoding
- URL: http://arxiv.org/abs/2404.09029v1
- Date: Sat, 13 Apr 2024 15:37:57 GMT
- Title: A Parametric Rate-Distortion Model for Video Transcoding
- Authors: Maedeh Jamali, Nader Karimi, Shadrokh Samavi, Shahram Shirani,
- Abstract summary: We introduce a parametric rate-distortion (R-D) transcoder model.
Our model excels at predicting distortion at various rates without the need for encoding the video.
It can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing.
- Score: 7.1741986121107235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past two decades, the surge in video streaming applications has been fueled by the increasing accessibility of the internet and the growing demand for network video. As users with varying internet speeds and devices seek high-quality video, transcoding becomes essential for service providers. In this paper, we introduce a parametric rate-distortion (R-D) transcoding model. Our model excels at predicting transcoding distortion at various rates without the need for encoding the video. This model serves as a versatile tool that can be used to achieve visual quality improvement (in terms of PSNR) via trans-sizing. Moreover, we use our model to identify visually lossless and near-zero-slope bitrate ranges for an ingest video. Having this information allows us to adjust the transcoding target bitrate while introducing visually negligible quality degradations. By utilizing our model in this manner, quality improvements up to 2 dB and bitrate savings of up to 46% of the original target bitrate are possible. Experimental results demonstrate the efficacy of our model in video transcoding rate distortion prediction.
Related papers
- Adaptive Caching for Faster Video Generation with Diffusion Transformers [52.73348147077075]
Diffusion Transformers (DiTs) rely on larger models and heavier attention mechanisms, resulting in slower inference speeds.
We introduce a training-free method to accelerate video DiTs, termed Adaptive Caching (AdaCache)
We also introduce a Motion Regularization (MoReg) scheme to utilize video information within AdaCache, controlling the compute allocation based on motion content.
arXiv Detail & Related papers (2024-11-04T18:59:44Z) - Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference.
We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z) - Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching [51.70360630470263]
Video-to-audio (V2A) generation aims to synthesize content-matching audio from silent video.
We propose Frieren, a V2A model based on rectified flow matching.
Experiments indicate that Frieren achieves state-of-the-art performance in both generation quality and temporal alignment.
arXiv Detail & Related papers (2024-06-01T06:40:22Z) - Deep Learning-Based Real-Time Quality Control of Standard Video
Compression for Live Streaming [31.285983939625098]
Real-time deep learning-based H.264 controller is proposed.
It estimates optimal encoder parameters based on the content of a video chunk with minimal delay.
It achieves improvements of up to 2.5 times in average bandwidth usage.
arXiv Detail & Related papers (2023-11-21T18:28:35Z) - Deep Learning-Based Real-Time Rate Control for Live Streaming on
Wireless Networks [31.285983939625098]
Suboptimal selection of encoder parameters can lead to video quality loss due to bandwidth or introduction of artifacts due to packet loss.
A real-time deep learning based H.264 controller is proposed to dynamically estimate optimal encoder parameters with a negligible delay in real-time.
Remarkably, improvements of 10-20 dB in PSNR with repect to the state-of-the-art adaptive video streaming is achieved, with an average packet drop rate as low as 0.002.
arXiv Detail & Related papers (2023-09-27T17:53:35Z) - Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z) - Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed
Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z) - Instance-Adaptive Video Compression: Improving Neural Codecs by Training
on the Test Set [14.89208053104896]
We introduce a video compression algorithm based on instance-adaptive learning.
On each video sequence to be transmitted, we finetune a pretrained compression model.
We show that it enables a competitive performance even after reducing the network size by 70%.
arXiv Detail & Related papers (2021-11-19T16:25:34Z) - Ultra-low bitrate video conferencing using deep image animation [7.263312285502382]
We propose a novel deep learning approach for ultra-low video compression for video conferencing applications.
We employ deep neural networks to encode motion information as keypoint displacement and reconstruct the video signal at the decoder side.
arXiv Detail & Related papers (2020-12-01T09:06:34Z) - Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z) - An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond
Feature and Signal [99.49099501559652]
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding.
We employ a conditional deep generation network to reconstruct video frames with the guidance of learned motion pattern.
By learning to extract sparse motion pattern via a predictive model, the network elegantly leverages the feature representation to generate the appearance of to-be-coded frames.
arXiv Detail & Related papers (2020-01-09T14:18:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.