Real-Time Neural-Enhancement for Online Cloud Gaming
- URL: http://arxiv.org/abs/2501.06880v1
- Date: Sun, 12 Jan 2025 17:28:09 GMT
- Title: Real-Time Neural-Enhancement for Online Cloud Gaming
- Authors: Shan Jiang, Zhenhua Han, Haisheng Tan, Xinyang Jiang, Yifan Yang, Xiaoxi Zhang, Hongqiu Ni, Yuqing Yang, Xiang-Yang Li
- Abstract summary: We introduce River, a cloud gaming delivery framework based on the observation that video segment features in cloud gaming are typically repetitive and redundant.
River builds a content-aware encoder that fine-tunes SR models for diverse video segments and stores them in a lookup table.
When delivering cloud gaming video streams online, River checks the video features and retrieves the most relevant SR models to enhance the frame quality.
- Abstract: Online cloud gaming demands real-time, high-quality video transmission across variable wide-area networks (WANs). Neural-enhanced video transmission algorithms that employ super-resolution (SR) have proven effective at improving video quality in challenging WAN environments. However, these SR-based methods require intensive fine-tuning over the whole video, making them infeasible for the diverse content of online cloud gaming. To address this, we introduce River, a cloud gaming delivery framework designed around the observation that video segment features in cloud gaming are typically repetitive and redundant. This creates a significant opportunity to reuse fine-tuned SR models, reducing fine-tuning latency from minutes to a query latency of milliseconds. To realize this idea, we design a practical system that addresses several challenges, including model organization, online model scheduling, and transfer strategy. River first builds a content-aware encoder that fine-tunes SR models for diverse video segments and stores them in a lookup table. When delivering cloud gaming video streams online, River checks the video features and retrieves the most relevant SR models to enhance frame quality. Meanwhile, if no existing SR model performs well enough for some video segments, River fine-tunes new models and updates the lookup table. Finally, to avoid the overhead of streaming model weights to clients, River uses a prefetching strategy that predicts the models most likely to be retrieved. Our evaluation on real video game streaming demonstrates that River reduces redundant training overhead by 44% and improves Peak Signal-to-Noise Ratio (PSNR) by 1.81 dB compared to SOTA solutions. Practical deployment shows River meets real-time requirements, achieving approximately 720p at 20 fps on mobile devices.
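The reuse idea in the abstract can be sketched as a lookup table keyed by segment features: retrieve a fine-tuned SR model when a similar segment was seen before, otherwise fine-tune and store a new one. This is an illustrative sketch only; the feature representation, cosine-similarity matching, and quality threshold below are assumptions, not details from the paper.

```python
import numpy as np

class ModelLookupTable:
    """Hypothetical sketch of content-aware SR model reuse (River-style)."""

    def __init__(self, quality_threshold=0.8):
        self.features = []                    # feature vectors of fine-tuned segments
        self.model_ids = []                   # IDs of the corresponding SR models
        self.quality_threshold = quality_threshold

    def add(self, feature, model_id):
        self.features.append(np.asarray(feature, dtype=float))
        self.model_ids.append(model_id)

    def retrieve(self, feature):
        """Return (model_id, similarity) of the closest stored segment."""
        if not self.features:
            return None, 0.0
        f = np.asarray(feature, dtype=float)
        sims = [
            float(np.dot(f, g) / (np.linalg.norm(f) * np.linalg.norm(g)))
            for g in self.features
        ]
        best = int(np.argmax(sims))
        return self.model_ids[best], sims[best]

    def serve_segment(self, feature, fine_tune_fn):
        """Reuse a stored model if similar enough, else fine-tune a new one."""
        model_id, sim = self.retrieve(feature)
        if model_id is not None and sim >= self.quality_threshold:
            return model_id, "reused"         # millisecond-scale query path
        new_id = fine_tune_fn(feature)        # minute-scale fine-tuning path
        self.add(feature, new_id)
        return new_id, "fine-tuned"
```

The point of the sketch is the latency asymmetry: the reuse path is a nearest-neighbor query, while the fallback path pays the full fine-tuning cost and enriches the table for future segments.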
Related papers
- RAIN: Real-time Animation of Infinite Video Stream [52.97171098038888]
RAIN is a pipeline solution capable of animating infinite video streams in real-time with low latency.
RAIN generates video frames with much shorter latency and faster speed, while maintaining long-range attention over extended video streams.
RAIN can animate characters in real-time with much better quality, accuracy, and consistency than competitors.
arXiv Detail & Related papers (2024-12-27T07:13:15Z)
- SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device [61.42406720183769]
We propose a comprehensive acceleration framework to bring the power of the large-scale video diffusion model to the hands of edge users.
Our model, with only 0.6B parameters, can generate a 5-second video on an iPhone 16 PM within 5 seconds.
arXiv Detail & Related papers (2024-12-13T18:59:56Z)
- Data Overfitting for On-Device Super-Resolution with Dynamic Algorithm and Compiler Co-Design [18.57172631588624]
We propose a Dynamic Deep neural network assisted by a Content-Aware data processing pipeline to reduce the number of required models to one.
Our method achieves better PSNR and real-time performance (33 FPS) on an off-the-shelf mobile phone.
arXiv Detail & Related papers (2024-07-03T05:17:26Z)
- IDF-CR: Iterative Diffusion Process for Divide-and-Conquer Cloud Removal in Remote-sensing Images [55.40601468843028]
We present an iterative diffusion process for cloud removal (IDF-CR).
IDF-CR is divided into two-stage models that address pixel space and latent space.
In the latent space stage, the diffusion model transforms low-quality cloud removal into high-quality clean output.
arXiv Detail & Related papers (2024-03-18T15:23:48Z)
- Enabling Real-time Neural Recovery for Cloud Gaming on Mobile Devices [11.530719133935847]
We propose a new method for recovering lost or corrupted video frames in cloud gaming.
Unlike traditional video frame recovery, our approach uses game states to significantly enhance recovery accuracy.
We develop a holistic system that consists of (i) efficiently extracting game states, (ii) modifying H.264 video decoder to generate a mask to indicate which portions of video frames need recovery, and (iii) designing a novel neural network to recover either complete or partial video frames.
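The mask-guided recovery step in this pipeline can be illustrated with a minimal sketch: the (modified) decoder marks lost pixels in a mask, and only those pixels are recovered. The paper recovers them with a game-state-conditioned neural network; copying from the previous frame stands in for that network here, and all array shapes are illustrative.

```python
import numpy as np

def recover_frame(corrupted, prev_frame, mask):
    """Fill masked (lost) regions of the current frame.

    mask == 1 marks pixels the decoder flagged as lost or corrupted.
    A learned recovery network would predict these pixels; copying
    from the previous frame is a stand-in for that prediction.
    """
    mask = mask.astype(bool)
    out = corrupted.copy()
    out[mask] = prev_frame[mask]   # stand-in for the learned recovery
    return out
```

Restricting recovery to the masked region is what keeps the approach real-time: intact pixels pass through untouched, so the expensive model only runs where the decoder reports damage.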
arXiv Detail & Related papers (2023-07-15T16:45:01Z)
- GAMIVAL: Video Quality Prediction on Mobile Cloud Gaming Content [30.96557290048384]
We develop a new gaming-specific no-reference video quality assessment (NR VQA) model called the Gaming Video Quality Evaluator (GAMIVAL).
Using a support vector regression (SVR) as a regressor, GAMIVAL achieves superior performance on the new LIVE-Meta Mobile Cloud Gaming (LIVE-Meta MCG) video quality database.
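The feature-to-score mapping this summary describes can be sketched as a regression from per-video quality features to subjective scores. GAMIVAL uses a support vector regressor (SVR, e.g. scikit-learn's `sklearn.svm.SVR`); to keep this sketch dependency-free, a least-squares linear regressor stands in, and the features and score labels are synthetic.

```python
import numpy as np

def fit_quality_regressor(features, scores):
    """Fit w, b minimizing ||X w + b - scores||^2 (stand-in for SVR.fit)."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    return coef[:-1], coef[-1]

def predict_quality(w, b, features):
    """Predict a quality score for each feature vector (stand-in for SVR.predict)."""
    return features @ w + b

# Synthetic data: 50 videos, 4 quality-aware features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
true_w = np.array([1.0, -0.5, 0.3, 0.0])
y = X @ true_w + 3.0               # synthetic subjective score labels
w, b = fit_quality_regressor(X, y)
```

In practice the regressor would be trained against mean opinion scores from a subjective database such as LIVE-Meta MCG, with the SVR's kernel and regularization chosen by cross-validation.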
arXiv Detail & Related papers (2023-05-03T20:29:04Z)
- Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos [69.22032459870242]
We present a novel technique, Residual Radiance Field or ReRF, as a highly compact neural representation to achieve real-time free-view rendering on long-duration dynamic scenes.
We show such a strategy can handle large motions without sacrificing quality.
Based on ReRF, we design a special FVV codec that achieves a three-orders-of-magnitude compression rate and provide a companion ReRF player to support online streaming of long-duration FVVs of dynamic scenes.
arXiv Detail & Related papers (2023-04-10T08:36:00Z)
- ReBotNet: Fast Real-time Video Enhancement [59.08038313427057]
Most restoration networks are slow, carry high computational bottlenecks, and cannot be used for real-time video enhancement.
In this work, we design an efficient and fast framework to perform real-time enhancement for practical use-cases like live video calls and video streams.
To evaluate our method, we curate two new datasets that emulate real-world video-call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computation, reduced memory requirements, and faster inference time.
arXiv Detail & Related papers (2023-03-23T17:58:05Z)
- Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting [27.302681897961588]
Deep convolutional neural networks (DNNs) are widely used in various fields of computer vision.
We propose a novel method for high-quality and efficient video resolution upscaling tasks.
We deploy our models on an off-the-shelf mobile phone, and experimental results show that our method achieves real-time video super-resolution with high video quality.
arXiv Detail & Related papers (2023-03-15T02:40:02Z)
- A Serverless Cloud-Fog Platform for DNN-Based Video Analytics with Incremental Learning [31.712746462418693]
This paper presents the first serverless system that takes full advantage of the client-fog-cloud synergy to better serve DNN-based video analytics.
To this end, we implement a holistic cloud-fog system referred to as VPaaS (Video-Platform-as-a-Service).
The evaluation demonstrates that VPaaS is superior to several SOTA systems: it maintains high accuracy while reducing bandwidth usage by up to 21%, RTT by up to 62.5%, and cloud monetary cost by up to 50%.
arXiv Detail & Related papers (2021-02-05T05:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.