Related papers: qAttCNN - Self Attention Mechanism for Video QoE Prediction in Encrypted Traffic

URL: http://arxiv.org/abs/2601.06862v1
Date: Sun, 11 Jan 2026 11:08:40 GMT
Title: qAttCNN - Self Attention Mechanism for Video QoE Prediction in Encrypted Traffic
Authors: Michael Sidorov, Ofer Hadar
Abstract summary: Video conferencing applications (VCAs) and instant messaging applications (IMAs) like WhatsApp and Telegram increasingly support video conferencing as a core feature. End-to-end encryption, commonly used by modern VCAs and IMAs, prevents ISPs from accessing the original media stream. We propose the QoE Attention Convolutional Neural Network (qAttCNN) to infer two no-reference QoE metrics, viz. BRISQUE and frames per second (FPS). We evaluate qAttCNN on a custom dataset collected from WhatsApp video calls and compare it against existing QoE models.
Score: 2.4851388650413866
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid growth of multimedia consumption, driven by major advances in mobile devices since the mid-2000s, has led to widespread use of video conferencing applications (VCAs) such as Zoom and Google Meet, as well as instant messaging applications (IMAs) like WhatsApp and Telegram, which increasingly support video conferencing as a core feature. Many of these systems rely on the Web Real-Time Communication (WebRTC) protocol, which enables direct peer-to-peer media streaming without requiring a third-party server to relay data, reducing latency and facilitating real-time communication. Despite WebRTC's potential, adverse network conditions can degrade streaming quality and consequently reduce users' Quality of Experience (QoE). Maintaining high QoE therefore requires continuous monitoring and timely intervention when QoE begins to deteriorate. While content providers can often estimate QoE by directly comparing transmitted and received media, this task is significantly more challenging for internet service providers (ISPs). End-to-end encryption, commonly used by modern VCAs and IMAs, prevents ISPs from accessing the original media stream, leaving only Quality of Service (QoS) and routing information available. To address this limitation, we propose the QoE Attention Convolutional Neural Network (qAttCNN), a model that leverages the packet size parameter of the traffic to infer two no-reference QoE metrics, viz. BRISQUE and frames per second (FPS). We evaluate qAttCNN on a custom dataset collected from WhatsApp video calls and compare it against existing QoE models. Using mean absolute error percentage (MAEP), our approach achieves 2.14% error for BRISQUE and 7.39% for FPS prediction.
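The abstract reports its results as mean absolute error percentage (MAEP) but this listing does not give the formula. As a rough sketch, assuming MAEP is the usual mean of |true - predicted| / |true| expressed as a percentage, it could be computed as below. The function name `maep`, the `eps` guard, and the sample FPS values are hypothetical and are not taken from the paper.

```python
def maep(y_true, y_pred, eps=1e-12):
    """Mean absolute error percentage (assumed definition, see lead-in).

    Computes mean(|y_true - y_pred| / |y_true|) * 100.
    `eps` is an illustrative guard against division by zero.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Relative absolute error of one prediction.
        total += abs(t - p) / max(abs(t), eps)
    return 100.0 * total / len(y_true)


# Made-up FPS values, used only to show the call; not data from the paper.
fps_true = [30.0, 25.0, 20.0, 15.0]
fps_pred = [28.5, 26.0, 21.0, 15.0]
print(f"MAEP for FPS: {maep(fps_true, fps_pred):.2f}%")
```

Under this assumed definition the error is relative to each ground-truth value, so a one-frame miss weighs more at 15 FPS than at 30 FPS.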

Related papers

VineetVC: Adaptive Video Conferencing Under Severe Bandwidth Constraints Using Audio-Driven Talking-Head Reconstruction [0.0]
Intense bandwidth depletion within consumer and constrained networks has the potential to undermine the stability of real-time video conferencing. This work delineates an adaptive conferencing system that integrates media delivery with a supplementary audio-driven talking-head reconstruction pathway.
arXiv Detail & Related papers (2026-02-13T09:37:10Z)
Large Speech Model Enabled Semantic Communication [58.027223937172955]
We present a Large Speech Model enabled Semantic Communication (LargeSC) system. We exploit the rich semantic knowledge embedded in large models and enable adaptive transmission over lossy channels. The system supports bandwidths ranging from 550 bps to 2.06 kbps and outperforms conventional baselines in speech quality under high packet loss rates.
arXiv Detail & Related papers (2025-12-04T11:58:08Z)
Video QoE Metrics from Encrypted Traffic: Application-agnostic Methodology [2.7123995549185325]
We propose an application-agnostic approach for objective QoE estimation from encrypted traffic. We obtained key video QoE metrics, enabling broad applicability to various proprietary IMVCAs and VCAs. Our evaluation shows high performance across the entire dataset, with 85.2% accuracy for FPS predictions within an error margin of two FPS, and 90.2% accuracy for PIQE-based quality rating classification.
arXiv Detail & Related papers (2025-04-20T19:18:13Z)
AI Flow at the Network Edge [58.31090055138711]
AI Flow is a framework that streamlines the inference process by jointly leveraging the heterogeneous resources available across devices, edge nodes, and cloud servers. This article serves as a position paper for identifying the motivation, challenges, and principles of AI Flow.
arXiv Detail & Related papers (2024-11-19T12:51:17Z)
Satellite Streaming Video QoE Prediction: A Real-World Subjective Database and Network-Level Prediction Models [59.061552498630874]
We introduce the LIVE-Viasat Real-World Satellite QoE Database. This database consists of 179 videos recorded from real-world streaming services affected by various authentic distortion patterns. We demonstrate the usefulness of this unique new resource by evaluating the efficacy of QoE-prediction models on it. We also created a new model that maps the network parameters to predicted human perception scores, which can be used by ISPs to optimize the video streaming quality of their networks.
arXiv Detail & Related papers (2024-10-17T18:22:50Z)
Subjective and Objective Quality-of-Experience Evaluation Study for Live Video Streaming [51.712182539961375]
We conduct a comprehensive study of subjective and objective QoE evaluations for live video streaming. For the subjective QoE study, we introduce the first live video streaming QoE dataset, TaoLive QoE. A human study was conducted to derive subjective QoE scores of videos in the TaoLive QoE dataset. We propose an end-to-end QoE evaluation model, Tao-QoE, which integrates multi-scale semantic features and optical flow-based motion features.
arXiv Detail & Related papers (2024-09-26T07:22:38Z)
QoS prediction in radio vehicular environments via prior user information [54.853542701389074]
We evaluate ML tree-ensemble methods to predict QoS in the range of minutes with data collected from a cellular test network. Specifically, we use the correlations of the measurements coming from the radio environment by including information from prior vehicles to enhance the prediction for the target vehicles.
arXiv Detail & Related papers (2024-02-27T17:05:41Z)
Enhanced adaptive cross-layer scheme for low latency HEVC streaming over Vehicular Ad-hoc Networks (VANETs) [2.2124180701409233]
HEVC is very promising for real-time video streaming through Vehicular Ad-hoc Networks (VANETs). A low-complexity cross-layer mechanism is proposed to improve the end-to-end performance of HEVC video streaming in VANETs under low-delay constraints. The proposed mechanism offers significant improvements in received video quality and end-to-end delay compared to the Enhanced Distributed Channel Access (EDCA) adopted in 802.11p.
arXiv Detail & Related papers (2023-11-05T14:19:38Z)
Adaptive QoS of WebRTC for Vehicular Media Communications [0.0]
Web Real-Time Communication (WebRTC) is a good candidate for media streaming across vehicles. This paper investigates a mechanism to adapt the video stream to the network capacity efficiently. The impact on end-to-end throughput and reaction time when applying different approaches to adaptation is analyzed in a real 5G testbed.
arXiv Detail & Related papers (2022-08-24T09:51:59Z)
Modeling Live Video Streaming: Real-Time Classification, QoE Inference, and Field Evaluation [1.4353812560047186]
ReCLive is a machine learning method for live video detection and QoE measurement based on network-level behavioral characteristics. We analyze about 23,000 video streams from Twitch and YouTube, and identify key features in their traffic profile that differentiate live and on-demand streaming. Our solution provides ISPs with fine-grained visibility into live video streams, enabling them to measure and improve user experience.
arXiv Detail & Related papers (2021-12-05T17:53:06Z)
Accelerating Real-Time Question Answering via Question Generation [98.43852668033595]
Ocean-Q introduces a new question generation (QG) model to generate a large pool of QA pairs offline. In real time, it matches an input question with the candidate QA pool to predict the answer without question encoding. Ocean-Q can be readily deployed in existing distributed database systems or search engines for large-scale query usage.
arXiv Detail & Related papers (2020-09-10T22:44:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.