Related papers: Real-world Video Adaptation with Reinforcement Learning

URL: http://arxiv.org/abs/2008.12858v1
Date: Fri, 28 Aug 2020 21:44:24 GMT
Title: Real-world Video Adaptation with Reinforcement Learning
Authors: Hongzi Mao, Shannon Chen, Drew Dimmery, Shaun Singh, Drew Blaisdell, Yuandong Tian, Mohammad Alizadeh, Eytan Bakshy
Abstract summary: Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). We evaluate recently proposed RL-based ABR methods in Facebook's web-based video streaming platform. In a week-long worldwide deployment with more than 30 million video streaming sessions, our RL approach outperforms the existing human-engineered ABR algorithms.
Score: 38.26695924173461
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Client-side video players employ adaptive bitrate (ABR) algorithms to optimize user quality of experience (QoE). We evaluate recently proposed RL-based ABR methods in Facebook's web-based video streaming platform. Real-world ABR poses several challenges that require customized designs beyond off-the-shelf RL algorithms -- we implement a scalable neural network architecture that supports videos with arbitrary bitrate encodings; we design a training method to cope with the variance resulting from the stochasticity in network conditions; and we leverage constrained Bayesian optimization for reward shaping in order to optimize the conflicting QoE objectives. In a week-long worldwide deployment with more than 30 million video streaming sessions, our RL approach outperforms the existing human-engineered ABR algorithms.

Related papers

Percentile-Based Deep Reinforcement Learning and Reward Based Personalization For Delay Aware RAN Slicing in O-RAN [0.0]
We tackle the challenge of radio access network slicing within an open RAN (O-RAN) architecture. Our focus centers on a network that includes multiple mobile virtual network operators (MVNOs) competing for physical resource blocks. We introduce a reward-based personalization method where each agent prioritizes other agents' model weights based on their performance.
arXiv Detail & Related papers (2025-07-24T05:45:41Z)
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning [68.76048244253582]
We introduce ViaRL, the first framework to leverage rule-based reinforcement learning (RL) for optimizing frame selection in video understanding. ViaRL utilizes the answer accuracy of a downstream model as a reward signal to train a frame selector through trial-and-error. ViaRL consistently delivers superior temporal grounding performance and robust generalization across diverse video understanding tasks.
arXiv Detail & Related papers (2025-05-21T12:29:40Z)
Streaming Looking Ahead with Token-level Self-reward [50.699168440048716]
We propose a policy model with token-level self-reward modeling (TRM) capability to eliminate the need for external models and extra communication. In addition, we propose a streaming-looking-ahead (SLA) algorithm to further boost search efficiency with better parallelization. If we combine SLA with reinforcement fine-tuning techniques such as DPO, SLA achieves an overall win rate of 89.4%.
arXiv Detail & Related papers (2025-02-24T22:35:53Z)
RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression [68.31184784672227]
In modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems performing tasks. It is therefore useful to optimize the encoder for a downstream task instead of for image quality. Here, we address this challenge by controlling the Quantization Parameters (QPs) at the macro-block level to optimize the downstream task.
arXiv Detail & Related papers (2025-01-21T15:36:08Z)
EPS: Efficient Patch Sampling for Video Overfitting in Deep Super-Resolution Model Training [15.684865589513597]
We propose an efficient patch sampling method named EPS for video SR network overfitting. Our method reduces the number of patches for the training to 4% to 25%, depending on the resolution and number of clusters. Compared to the state-of-the-art patch sampling method, EMT, our approach achieves an 83% decrease in overall run time.
arXiv Detail & Related papers (2024-11-25T12:01:57Z)
Reinforcement Learning -based Adaptation and Scheduling Methods for Multi-source DASH [1.1971219484941955]
Dynamic adaptive streaming over HTTP (DASH) has been widely used in video streaming recently. In multi-source streaming, video chunks may arrive out of order due to different conditions of the network paths. This paper proposes two algorithms for streaming from multiple sources: RL-based adaptation with greedy scheduling (RLAGS) and RL-based adaptation and scheduling (RLAS).
arXiv Detail & Related papers (2023-07-25T06:47:12Z)
Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks. We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations. Based on the latency- and accuracy-aware reward design, such a computation can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Cross Layer Optimization and Distributed Reinforcement Learning for Wireless 360° Video Streaming [54.60967639512643]
We propose a cross layer optimization approach that maximizes the available rate to each user and efficiently uses it to maximize users' QoE. We show that the problem can be decoupled into two interrelated subproblems. An actor-critic deep reinforcement learning (DRL) is proposed to leverage the parallel training of multiple independent agents and solve the application layer subproblem.
arXiv Detail & Related papers (2020-11-12T12:59:10Z)
NANCY: Neural Adaptive Network Coding methodologY for video distribution over wireless networks [1.636104578028594]
NANCY is a system that generates adaptive bit rates (ABR) for video and adaptive network coding rates (ANCR). NANCY trains a neural network model with rewards formulated as quality of experience (QoE) metrics. Our results show that NANCY provides 29.91% and 60.34% higher average QoE than Pensieve and robustMPC, respectively.
arXiv Detail & Related papers (2020-08-21T15:55:32Z)
Meta-Reinforcement Learning for Trajectory Design in Wireless UAV Networks [151.65541208130995]
A drone base station (DBS) is dispatched to provide uplink connectivity to ground users whose demand is dynamic and unpredictable. In this case, the DBS's trajectory must be adaptively adjusted to satisfy the dynamic user access requests. A meta-learning algorithm is proposed in order to adapt the DBS's trajectory when it encounters novel environments.
arXiv Detail & Related papers (2020-05-25T20:43:59Z)
AOWS: Adaptive and optimal network width search with latency constraints [30.39613826468697]
We introduce a novel efficient one-shot NAS approach to optimally search for channel numbers. Experiments on ImageNet classification show that our approach can find networks fitting the resource constraints on different target platforms.
arXiv Detail & Related papers (2020-05-21T06:46:16Z)
Non-Cooperative Game Theory Based Rate Adaptation for Dynamic Video Streaming over HTTP [89.30855958779425]
Dynamic Adaptive Streaming over HTTP (DASH) has been demonstrated to be an emerging and promising multimedia streaming technique. We propose a novel algorithm to optimally allocate the limited export bandwidth of the server among multiple users to maximize their Quality of Experience (QoE) with fairness guaranteed.
arXiv Detail & Related papers (2019-12-27T01:19:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.