Enhancing Playback Performance in Video Recommender Systems with an On-Device Gating and Ranking Framework
- URL: http://arxiv.org/abs/2410.05863v1
- Date: Tue, 08 Oct 2024 09:53:10 GMT
- Title: Enhancing Playback Performance in Video Recommender Systems with an On-Device Gating and Ranking Framework
- Authors: Yunfei Yang, Zhenghao Qi, Honghuan Wu, Qi Song, Tieyao Zhang, Hao Li, Yimin Tu, Kaiqiao Zhan, Ben Wang,
- Abstract summary: We propose an on-device Gating and Ranking Framework (GRF) that cooperates with server-side video recommender systems (RSs).
Specifically, we utilize a gate model to identify, in real time, videos that may have playback issues, and then employ a ranking model to select the optimal result from a locally cached pool to replace the stuttering videos.
Our solution has been fully deployed on Kwai, a large-scale short video platform with hundreds of millions of users globally.
- Score: 18.626416633423933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video recommender systems (RSs) have gained increasing attention in recent years. Existing mainstream RSs focus on optimizing the matching function between users and items. However, we noticed that users frequently encounter playback issues such as slow loading or stuttering while browsing videos, especially under weak network conditions, which leads to a subpar browsing experience and may cause users to leave, even when the video content and recommendations are superior. This is a serious yet easily overlooked issue. To tackle it, we propose an on-device Gating and Ranking Framework (GRF) that cooperates with the server-side RS. Specifically, we utilize a gate model to identify, in real time, videos that may have playback issues, and then employ a ranking model to select the optimal result from a locally cached pool to replace the stuttering videos. Our solution has been fully deployed on Kwai, a large-scale short video platform with hundreds of millions of users globally. It significantly enhances video playback performance and improves overall user experience and retention rates.
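The abstract describes a two-stage on-device pipeline: a gate model flags videos likely to stutter, and a ranking model picks a replacement from a locally cached pool. A minimal sketch of that control flow follows; all class names, scores, and the gate threshold are illustrative assumptions, not Kwai's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Video:
    video_id: str
    stutter_prob: float  # gate model output: predicted playback-issue probability
    rank_score: float    # ranking model output for locally cached candidates

def gate(video: Video, threshold: float = 0.5) -> bool:
    """Gate model: flag a video as likely to have playback issues."""
    return video.stutter_prob >= threshold

def rank_replacement(cache: List[Video]) -> Optional[Video]:
    """Ranking model: pick the best candidate from the local cache."""
    return max(cache, key=lambda v: v.rank_score, default=None)

def next_video(server_pick: Video, local_cache: List[Video]) -> Video:
    """Replace a likely-stuttering server pick with the top cached video."""
    if gate(server_pick):
        replacement = rank_replacement(local_cache)
        if replacement is not None:
            return replacement
    return server_pick
```

If the gate does not fire, or the cache is empty, the server-side recommendation plays as-is; the on-device models only intervene when playback trouble is predicted.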
Related papers
- Machine Learning-Based Prediction of Quality Shifts on Video Streaming Over 5G [0.0]
The Quality of Experience (QoE) is the user's satisfaction while streaming a video session over an over-the-top (OTT) platform such as YouTube.
We look into the relationship between quality shifting in YouTube streaming sessions and the channel metrics RSRP, RSRQ, and SNR.
arXiv Detail & Related papers (2025-04-24T21:00:43Z) - Streaming Video Question-Answering with In-context Video KV-Cache Retrieval [10.990431921021585]
We propose ReKV, a training-free approach that enables efficient streaming video question-answering (StreamingVQA)
Our approach analyzes long videos in a streaming manner, allowing for prompt responses as soon as user queries are received.
arXiv Detail & Related papers (2025-03-01T15:53:33Z) - Adaptive Caching for Faster Video Generation with Diffusion Transformers [52.73348147077075]
Diffusion Transformers (DiTs) rely on larger models and heavier attention mechanisms, resulting in slower inference speeds.
We introduce a training-free method to accelerate video DiTs, termed Adaptive Caching (AdaCache)
We also introduce a Motion Regularization (MoReg) scheme to utilize video information within AdaCache, controlling the compute allocation based on motion content.
arXiv Detail & Related papers (2024-11-04T18:59:44Z) - AIM 2024 Challenge on Efficient Video Super-Resolution for AV1 Compressed Content [56.552444900457395]
Video super-resolution (VSR) is a critical task for enhancing low-bitrate and low-resolution videos, particularly in streaming applications.
In this work, we compile different methods to address these challenges; the solutions are end-to-end, real-time video super-resolution frameworks.
The proposed solutions tackle video upscaling for two applications: 540p to 4K (x4) as a general case, and 360p to 1080p (x3) tailored more towards mobile devices.
arXiv Detail & Related papers (2024-09-25T18:12:19Z) - Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams [78.72965584414368]
We present Flash-VStream, a video-language model that simulates the memory mechanism of human.
Compared to existing models, Flash-VStream achieves significant reductions in inference latency and VRAM consumption.
We propose VStream-QA, a novel question answering benchmark specifically designed for online video streaming understanding.
arXiv Detail & Related papers (2024-06-12T11:07:55Z) - Enhancing User Interest based on Stream Clustering and Memory Networks in Large-Scale Recommender Systems [19.25041732650533]
User Interest Enhancement (UIE) enhances user interest including user profile and user history behavior sequences.
UIE not only remarkably improves model performance for users with sparse interest but also significantly enhances model performance for other users.
arXiv Detail & Related papers (2024-05-21T22:53:00Z) - Real-Time Neural Video Recovery and Enhancement on Mobile Devices [15.343787475565836]
We present a novel approach for real-time video enhancement on mobile devices.
We have implemented our approach on an iPhone 12, and it can support 30 frames per second (FPS)
Our approach results in a significant increase in video QoE of 24% - 82% in our video streaming system.
arXiv Detail & Related papers (2023-07-22T19:52:04Z) - Deep Unsupervised Key Frame Extraction for Efficient Video Classification [63.25852915237032]
This work presents an unsupervised method to retrieve key frames, combining a Convolutional Neural Network (CNN) with Temporal Segment Density Peaks Clustering (TSDPC).
The proposed TSDPC is a generic and powerful framework with two advantages over previous works, one of which is that it can calculate the number of key frames automatically.
Furthermore, a Long Short-Term Memory network (LSTM) is added on the top of the CNN to further elevate the performance of classification.
arXiv Detail & Related papers (2022-11-12T20:45:35Z) - VideoINR: Learning Video Implicit Neural Representation for Continuous
Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performance with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z) - Memory-Augmented Non-Local Attention for Video Super-Resolution [61.55700315062226]
We propose a novel video super-resolution method that aims at generating high-fidelity high-resolution (HR) videos from low-resolution (LR) ones.
Previous methods predominantly leverage temporal neighbor frames to assist the super-resolution of the current frame.
In contrast, we devise a cross-frame non-local attention mechanism that allows video super-resolution without frame alignment.
arXiv Detail & Related papers (2021-08-25T05:12:14Z) - NeuSaver: Neural Adaptive Power Consumption Optimization for Mobile Video Streaming [3.3194866396158003]
NeuSaver applies an adaptive frame rate to each video chunk without compromising user experience.
NeuSaver generates an optimal policy that determines the appropriate frame rate for each video chunk.
NeuSaver effectively reduces the power consumption of mobile devices when streaming video by an average of 16.14% and up to 23.12% while achieving high QoE.
arXiv Detail & Related papers (2021-07-15T05:17:17Z)
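The NeuSaver entry above describes a policy that assigns an appropriate frame rate to each video chunk to cut power draw without hurting QoE. A minimal hand-rolled sketch of that idea follows; the motion metric, rate ladder, and thresholds are illustrative assumptions, not NeuSaver's actual learned policy.

```python
# Candidate playback frame rates (fps), low-power to full quality.
FRAME_RATES = [15, 24, 30, 60]

def choose_frame_rate(motion_score: float) -> int:
    """Map a chunk's motion intensity in [0, 1] to a frame rate.

    A learned policy (as in NeuSaver) would replace this
    hand-tuned threshold table.
    """
    if motion_score < 0.2:
        return FRAME_RATES[0]
    if motion_score < 0.5:
        return FRAME_RATES[1]
    if motion_score < 0.8:
        return FRAME_RATES[2]
    return FRAME_RATES[3]

def plan_playback(chunk_motion: list) -> list:
    """Return a per-chunk frame-rate plan for a video."""
    return [choose_frame_rate(m) for m in chunk_motion]
```

The intuition: low-motion chunks (talking heads, static scenes) tolerate a lower frame rate with little perceptual loss, so rendering fewer frames saves power on mobile devices.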
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.