Adaptive Score Alignment Learning for Continual Perceptual Quality Assessment of 360-Degree Videos in Virtual Reality
- URL: http://arxiv.org/abs/2502.19644v1
- Date: Thu, 27 Feb 2025 00:29:04 GMT
- Title: Adaptive Score Alignment Learning for Continual Perceptual Quality Assessment of 360-Degree Videos in Virtual Reality
- Authors: Kanglei Zhou, Zikai Hao, Liyuan Wang, Xiaohui Liang
- Abstract summary: We propose a novel approach for assessing the perceptual quality of VR videos, Adaptive Score Alignment Learning (ASAL). ASAL integrates correlation loss with error loss to enhance alignment with human subjective ratings and precision in predicting perceptual quality. We establish a comprehensive benchmark for VR-VQA and its CL counterpart, introducing new data splits and evaluation metrics.
- Score: 20.511561848185444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Virtual Reality Video Quality Assessment (VR-VQA) aims to evaluate the perceptual quality of 360-degree videos, which is crucial for ensuring a distortion-free user experience. Traditional VR-VQA methods trained on static datasets with limited distortion diversity struggle to balance correlation and precision. This becomes particularly critical when generalizing to diverse VR content and continually adapting to dynamic and evolving video distribution variations. To address these challenges, we propose a novel approach for assessing the perceptual quality of VR videos, Adaptive Score Alignment Learning (ASAL). ASAL integrates correlation loss with error loss to enhance alignment with human subjective ratings and precision in predicting perceptual quality. In particular, ASAL can naturally adapt to continually changing distributions through a feature space smoothing process that enhances generalization to unseen content. To further improve continual adaptation to dynamic VR environments, we extend ASAL with adaptive memory replay as a novel Continual Learning (CL) framework. Unlike traditional CL models, ASAL utilizes key frame extraction and feature adaptation to address the unique challenges of non-stationary variations under both the computation and storage restrictions of VR devices. We establish a comprehensive benchmark for VR-VQA and its CL counterpart, introducing new data splits and evaluation metrics. Our experiments demonstrate that ASAL outperforms recent strong baseline models, achieving overall correlation gains of up to 4.78% in the static joint training setting and 12.19% in the dynamic CL setting on various datasets. This validates the effectiveness of ASAL in addressing the inherent challenges of VR-VQA. Our code is available at https://github.com/ZhouKanglei/ASAL_CVQA.
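The abstract's core idea of integrating a correlation loss with an error loss can be illustrated with a minimal NumPy sketch. This is a hypothetical reconstruction, not the paper's actual objective: the weighting `alpha`, the choice of MSE as the error term, and `1 - Pearson's r` as the correlation term are all assumptions for illustration.

```python
import numpy as np

def alignment_loss(pred, target, alpha=0.5):
    """Hypothetical score-alignment objective in the spirit of ASAL:
    a weighted sum of an error term (MSE, for prediction precision)
    and a correlation term (1 - Pearson's r, for agreement with the
    trend of human subjective ratings). The exact losses and weighting
    used in the paper are not specified here."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    # Error term: mean squared error between predicted and true scores
    mse = np.mean((pred - target) ** 2)
    # Correlation term: Pearson's r, guarding against zero variance
    p = pred - pred.mean()
    t = target - target.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    corr = (p * t).sum() / denom if denom > 0 else 0.0
    return alpha * mse + (1.0 - alpha) * (1.0 - corr)
```

With perfect predictions both terms vanish, while predictions that are accurate on average but anti-correlated with human ratings are still penalized, which is the balance the abstract describes.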
Related papers
- DiVR: incorporating context from diverse VR scenes for human trajectory prediction [2.16656895298847]
We propose Diverse Context VR Human Motion Prediction (DiVR), a cross-modal transformer based on the Perceiver architecture.
Results show that DiVR achieves higher accuracy and adaptability compared to other models and to static graphs.
Our source code is publicly available at https://gitlab.inria.fr/ffrancog/creattive3d-divr-model.
arXiv Detail & Related papers (2024-11-13T07:55:41Z) - Addressing Data Heterogeneity in Federated Learning with Adaptive Normalization-Free Feature Recalibration [1.33512912917221]
Federated learning is a decentralized collaborative training paradigm that preserves stakeholders' data ownership while improving performance and generalization.
We propose Adaptive Normalization-free Feature Recalibration (ANFR), an architecture-level approach that combines weight standardization and channel attention.
arXiv Detail & Related papers (2024-10-02T20:16:56Z) - Sensitivity-Informed Augmentation for Robust Segmentation [21.609070498399863]
Internal noises such as variations in camera quality or lens distortion can affect the performance of segmentation models.
We present an efficient, adaptable, and gradient-free method to enhance the robustness of learning-based segmentation models across training.
arXiv Detail & Related papers (2024-06-03T15:25:45Z) - Self-Avatar Animation in Virtual Reality: Impact of Motion Signals Artifacts on the Full-Body Pose Reconstruction [13.422686350235615]
We aim to measure the impact on the reconstruction of the articulated self-avatar's full-body pose.
We analyze the motion reconstruction errors using ground truth and 3D Cartesian coordinates estimated from YOLOv8 pose estimation.
arXiv Detail & Related papers (2024-04-29T12:02:06Z) - Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) suggests to learn the active learning strategy itself, allowing it to adapt to the given setting.
We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem.
Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Perceptual Quality Assessment of Virtual Reality Videos in the Wild [53.94620993606658]
Existing panoramic video databases only consider synthetic distortions, assume fixed viewing conditions, and are limited in size.
We construct the VR Video Quality in the Wild (VRVQW) database, containing 502 user-generated videos with diverse content and distortion characteristics.
We conduct a formal psychophysical experiment to record the scanpaths and perceived quality scores from 139 participants under two different viewing conditions.
arXiv Detail & Related papers (2022-06-13T02:22:57Z) - Towards Scale Consistent Monocular Visual Odometry by Learning from the Virtual World [83.36195426897768]
We propose VRVO, a novel framework for retrieving the absolute scale from virtual data.
We first train a scale-aware disparity network using both monocular real images and stereo virtual data.
The resulting scale-consistent disparities are then integrated with a direct VO system.
arXiv Detail & Related papers (2022-03-11T01:51:54Z) - Feeling of Presence Maximization: mmWave-Enabled Virtual Reality Meets Deep Reinforcement Learning [76.46530937296066]
This paper investigates the problem of providing ultra-reliable and energy-efficient virtual reality (VR) experiences for wireless mobile users.
To ensure reliable ultra-high-definition (UHD) video frame delivery to mobile users, a coordinated multipoint (CoMP) transmission technique and millimeter wave (mmWave) communications are exploited.
arXiv Detail & Related papers (2021-06-03T08:35:10Z) - Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks [157.42035777757292]
The problem of enhancing the quality of virtual reality (VR) services is studied for an indoor terahertz (THz)/visible light communication (VLC) wireless network.
Small base stations (SBSs) transmit high-quality VR images to VR users over THz bands and light-emitting diodes (LEDs) provide accurate indoor positioning services.
To control the energy consumption of the studied THz/VLC wireless VR network, VLC access points (VAPs) must be selectively turned on.
arXiv Detail & Related papers (2021-01-29T15:57:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.