Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks
- URL: http://arxiv.org/abs/2502.05695v1
- Date: Sat, 08 Feb 2025 21:14:28 GMT
- Title: Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks
- Authors: Zijiang Yan, Jianhua Pei, Hongda Wu, Hina Tabassum, Ping Wang
- Abstract summary: This paper proposes a novel framework for real-time adaptive-bitrate video streaming by integrating latent diffusion models (LDMs) within FFmpeg techniques.
It addresses the challenges of high bandwidth usage, storage inefficiencies, and quality of experience (QoE) degradation associated with traditional constant bitrate streaming (CBS) and adaptive bitrate streaming (ABS).
This work opens new scalable solutions for real-time video streaming in 5G and future post-5G networks.
- Score: 12.180483357502293
- Abstract: This paper proposes a novel framework for real-time adaptive-bitrate video streaming by integrating latent diffusion models (LDMs) within FFmpeg techniques. This solution addresses the challenges of high bandwidth usage, storage inefficiencies, and quality of experience (QoE) degradation associated with traditional constant bitrate streaming (CBS) and adaptive bitrate streaming (ABS). The proposed approach leverages LDMs to compress I-frames into a latent space, offering significant storage and semantic transmission savings without sacrificing high visual quality. It keeps B-frames and P-frames as adjustment metadata to ensure efficient video reconstruction at the user side, and the framework is complemented with state-of-the-art denoising and video frame interpolation (VFI) techniques. These techniques mitigate semantic ambiguity and restore temporal coherence between frames, even in noisy wireless communication environments. Experimental results demonstrate that the proposed method achieves high-quality video streaming with optimized bandwidth usage, outperforming state-of-the-art solutions in terms of QoE and resource efficiency. This work opens new possibilities for scalable real-time video streaming in 5G and future post-5G networks.
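The storage-saving argument above rests on the large spatial downsampling an LDM encoder applies to I-frames. A minimal sketch of that effect, assuming a typical 8x-per-dimension latent (mean-pooling stands in for the paper's VAE encoder, which is not reproduced here):

```python
import numpy as np

# Illustrative sketch, not the paper's implementation: an LDM VAE encoder maps
# an I-frame to a latent roughly 8x smaller per spatial dimension. Mean-pooling
# is used here only as a stand-in to show the resulting storage ratio.

def encode_to_latent(frame: np.ndarray, factor: int = 8) -> np.ndarray:
    """Stand-in for an LDM encoder: spatially pool the frame by `factor`."""
    h, w, c = frame.shape
    return frame[:h - h % factor, :w - w % factor].reshape(
        h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

frame = np.random.rand(1080, 1920, 3).astype(np.float32)  # one full-HD I-frame
latent = encode_to_latent(frame)

savings = 1 - latent.size / frame.size
print(f"latent shape: {latent.shape}, storage saved: {savings:.1%}")
```

A real LDM latent also changes channel count and needs a learned decoder for reconstruction; the sketch only illustrates why transmitting latents instead of raw I-frames saves bandwidth.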
Related papers
- CANeRV: Content Adaptive Neural Representation for Video Compression [89.35616046528624]
We propose Content Adaptive Neural Representation for Video Compression (CANeRV), an innovative INR-based video compression network that adaptively conducts structure optimisation based on the specific content of each video sequence.
We show that CANeRV can outperform both H.266/VVC and state-of-the-art INR-based video compression techniques across diverse video datasets.
arXiv Detail & Related papers (2025-02-10T06:21:16Z) - DeformStream: Deformation-based Adaptive Volumetric Video Streaming [4.366356163044466]
Volumetric video streaming offers immersive 3D experiences but faces significant challenges due to high bandwidth requirements and latency issues.
We introduce Deformation-based Adaptive Volumetric Video Streaming, a novel framework that enhances volumetric video streaming performance by leveraging the inherent deformability of mesh-based representations.
arXiv Detail & Related papers (2024-09-25T04:43:59Z) - Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution [65.91317390645163]
Upscale-A-Video is a text-guided latent diffusion framework for video upscaling.
It ensures temporal coherence through two key mechanisms: locally, it integrates temporal layers into U-Net and VAE-Decoder, maintaining consistency within short sequences.
It also offers greater flexibility by allowing text prompts to guide texture creation and adjustable noise levels to balance restoration and generation.
arXiv Detail & Related papers (2023-12-11T18:54:52Z) - Enhanced adaptive cross-layer scheme for low latency HEVC streaming over Vehicular Ad-hoc Networks (VANETs) [2.2124180701409233]
HEVC is very promising for real-time video streaming through Vehicular Ad-hoc Networks (VANETs).
A low complexity cross-layer mechanism is proposed to improve end-to-end performances of HEVC video streaming in VANET under low delay constraints.
The proposed mechanism offers significant improvements regarding video quality at the reception and end-to-end delay compared to the Enhanced Distributed Channel Access (EDCA) adopted in the 802.11p.
arXiv Detail & Related papers (2023-11-05T14:19:38Z) - Online Streaming Video Super-Resolution with Convolutional Look-Up Table [26.628925884353674]
This paper focuses on the rarely exploited problem setting of online streaming video super resolution.
New benchmark dataset named LDV-WebRTC is constructed based on a real-world online streaming system.
We propose a mixture-of-expert-LUT module, where a set of LUT specialized in different degradations are built and adaptively combined to handle different degradations.
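The mixture-of-expert-LUT idea above can be sketched in a few lines: several lookup tables, each tuned to one degradation, are blended with router-produced weights. The expert names and the simple linear tone curves below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hedged sketch of a mixture-of-experts LUT: each "expert" LUT targets one
# degradation (names and curves are hypothetical); a softmax router produces
# weights, and the output is the weighted blend of the per-expert lookups.

luts = {
    "denoise":  np.clip(np.linspace(0, 255, 256) * 0.9 + 12, 0, 255),
    "deblur":   np.clip(np.linspace(0, 255, 256) * 1.1 - 10, 0, 255),
    "identity": np.arange(256, dtype=np.float64),
}

def route(logits: np.ndarray) -> np.ndarray:
    """Softmax over per-expert logits (stand-in for a learned router)."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def apply_moe_lut(pixels: np.ndarray, logits: np.ndarray) -> np.ndarray:
    w = route(logits)
    stacked = np.stack([lut[pixels] for lut in luts.values()])  # (experts, H, W)
    return np.tensordot(w, stacked, axes=1)  # weighted blend over experts

patch = np.random.randint(0, 256, size=(4, 4))
out = apply_moe_lut(patch, np.array([0.2, 0.1, 2.0]))
print(out.shape)
```

In the actual method the LUTs are learned and keyed on local features rather than fixed tone curves; the sketch only shows the adaptive-combination mechanism.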
arXiv Detail & Related papers (2023-03-01T08:54:56Z) - Wireless Deep Video Semantic Transmission [14.071114007641313]
We propose a new class of high-efficiency deep joint source-channel coding methods to achieve end-to-end video transmission over wireless channels.
Our framework is collected under the name deep video semantic transmission (DVST).
arXiv Detail & Related papers (2022-05-26T03:26:43Z) - DeepWiVe: Deep-Learning-Aided Wireless Video Transmission [0.0]
We present DeepWiVe, the first-ever end-to-end joint source-channel coding (JSCC) video transmission scheme.
We use deep neural networks (DNNs) to map video signals to channel symbols, combining video compression, channel coding, and modulation steps into a single neural transform.
Our results show that DeepWiVe can overcome the cliff-effect, which is prevalent in conventional separation-based digital communication schemes.
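The cliff effect mentioned above is worth a toy illustration (numbers are invented, not DeepWiVe's results): a separation-based digital scheme delivers full quality only at or above its design SNR and collapses below it, while a JSCC-style mapping degrades gracefully with channel quality:

```python
# Toy models of received quality vs. channel SNR; both functions are
# hypothetical illustrations, not measured curves from the paper.

def digital_quality(snr_db: float, design_snr_db: float = 10.0) -> float:
    """Separation-based scheme: all-or-nothing around the design SNR (the cliff)."""
    return 1.0 if snr_db >= design_snr_db else 0.0

def jscc_quality(snr_db: float) -> float:
    """JSCC-style scheme: quality degrades gracefully with SNR (toy linear model)."""
    return min(1.0, max(0.0, 0.05 * snr_db + 0.4))

for snr in (5, 9, 10, 15):
    print(snr, digital_quality(snr), jscc_quality(snr))
```

Below 10 dB the digital curve drops to zero while the JSCC curve still delivers usable quality, which is the behavior the summary describes.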
arXiv Detail & Related papers (2021-11-25T11:34:24Z) - Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically suffer from difficulties in how to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z) - Capturing Video Frame Rate Variations via Entropic Differencing [63.749184706461826]
We propose a novel statistical entropic differencing method based on a Generalized Gaussian Distribution model.
Our proposed model correlates very well with subjective scores in the recently proposed LIVE-YT-HFR database.
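The Generalized Gaussian Distribution underlying the entropic-differencing feature has a standard closed-form differential entropy, which is the quantity being differenced. A small sketch of that formula (the entropy expression is standard; its role as a frame-rate quality feature here is only a rough nod to the method, not its full pipeline):

```python
import math

# Differential entropy of a Generalized Gaussian with scale alpha and shape
# beta, whose density is f(x) = beta / (2*alpha*Gamma(1/beta)) * exp(-(|x|/alpha)^beta).

def ggd_entropy(alpha: float, beta: float) -> float:
    """h = 1/beta + ln(2*alpha*Gamma(1/beta)/beta), in nats."""
    return 1.0 / beta + math.log(2.0 * alpha / beta) + math.lgamma(1.0 / beta)

# Sanity check: beta = 2 with alpha = sigma*sqrt(2) recovers the Gaussian
# entropy 0.5 * ln(2*pi*e*sigma^2).
sigma = 1.5
h_gauss = ggd_entropy(sigma * math.sqrt(2.0), 2.0)
print(abs(h_gauss - 0.5 * math.log(2 * math.pi * math.e * sigma**2)) < 1e-12)  # True
```

In the method itself, GGD parameters are fit to band-pass frame differences at each frame rate and the resulting entropies are differenced to form the quality feature.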
arXiv Detail & Related papers (2020-06-19T22:16:52Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal interpolation and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z) - Non-Cooperative Game Theory Based Rate Adaptation for Dynamic Video Streaming over HTTP [89.30855958779425]
Dynamic Adaptive Streaming over HTTP (DASH) has been demonstrated to be an emerging and promising multimedia streaming technique.
We propose a novel algorithm to optimally allocate the limited export bandwidth of the server among multiple users to maximize their Quality of Experience (QoE) with fairness guaranteed.
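A common baseline for this kind of fair bandwidth split (not the paper's game-theoretic algorithm) is proportional fairness: with per-user utility u_i(b) = w_i * log(b) and a total budget B, maximizing the utility sum gives the closed form b_i = B * w_i / sum(w). A minimal sketch under that assumption:

```python
# Proportional-fair split of a server's export bandwidth B among users with
# QoE weights w_i. Closed form for u_i(b) = w_i * log(b) subject to
# sum(b_i) <= B: b_i = B * w_i / sum(w). Weights below are illustrative.

def proportional_fair(weights: list[float], total_bw: float) -> list[float]:
    s = sum(weights)
    return [total_bw * w / s for w in weights]

alloc = proportional_fair([1.0, 2.0, 1.0], total_bw=100.0)
print(alloc)  # [25.0, 50.0, 25.0]
```

The log utility is what enforces fairness here: no user can be starved, since marginal utility grows without bound as an allocation approaches zero.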
arXiv Detail & Related papers (2019-12-27T01:19:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.