NovisVQ: A Streaming Convolutional Neural Network for No-Reference Opinion-Unaware Frame Quality Assessment
- URL: http://arxiv.org/abs/2511.04628v1
- Date: Thu, 06 Nov 2025 18:23:55 GMT
- Title: NovisVQ: A Streaming Convolutional Neural Network for No-Reference Opinion-Unaware Frame Quality Assessment
- Authors: Kylie Cancilla, Alexander Moore, Amar Saini, Carmen Carrano,
- Abstract summary: Video quality assessment (VQA) is vital for computer vision tasks, but existing approaches face major limitations. We present a scalable, streaming-based VQA model that is both no-reference and opinion-unaware.
- Score: 39.76658525158528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video quality assessment (VQA) is vital for computer vision tasks, but existing approaches face major limitations: full-reference (FR) metrics require clean reference videos, and most no-reference (NR) models depend on training on costly human opinion labels. Moreover, most opinion-unaware NR methods are image-based, ignoring temporal context critical for video object detection. In this work, we present a scalable, streaming-based VQA model that is both no-reference and opinion-unaware. Our model leverages synthetic degradations of the DAVIS dataset, training a temporal-aware convolutional architecture to predict FR metrics (LPIPS, PSNR, SSIM) directly from degraded video, without references at inference. We show that our streaming approach outperforms our own image-based baseline by generalizing across diverse degradations, underscoring the value of temporal modeling for scalable VQA in real-world vision systems. Additionally, we demonstrate that our model achieves higher correlation with full-reference metrics than BRISQUE, a widely used opinion-aware image quality assessment baseline, validating the effectiveness of our temporal, opinion-unaware approach.
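The abstract describes training on synthetically degraded video so that FR metrics can serve as regression targets without human labels. The following is a minimal sketch of how one such target (PSNR) could be computed from a clean/degraded frame pair. It is not the authors' pipeline: the random 8x8 grayscale frames, the `add_noise` degradation, and all function names are assumptions made for illustration, and LPIPS and SSIM are omitted.

```python
import math
import random


def psnr(reference, degraded, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale frames."""
    squared_errors = [
        (r - d) ** 2
        for ref_row, deg_row in zip(reference, degraded)
        for r, d in zip(ref_row, deg_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical frames: no measurable distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


def add_noise(frame, sigma, rng):
    """Hypothetical degradation: additive Gaussian noise, clipped to [0, 255]."""
    return [
        [min(255.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in frame
    ]


# Stand-in for a clean clip: 4 random 8x8 frames (assumption, not DAVIS data).
rng = random.Random(0)
clean_clip = [
    [[float(rng.randrange(256)) for _ in range(8)] for _ in range(8)]
    for _ in range(4)
]

# Each training pair is (degraded frame, FR target). The clean frame is only
# needed here, when the label is built; a trained model would see only the
# degraded frame at inference.
training_pairs = []
for frame in clean_clip:
    degraded = add_noise(frame, sigma=10.0, rng=rng)
    training_pairs.append((degraded, psnr(frame, degraded)))
```

In this setup the reference video is consumed only when building labels, which is what allows a model trained on such pairs to run without references afterwards.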
Related papers
- Q-Save: Towards Scoring and Attribution for Generated Video Evaluation [65.83319736145869]
We present Q-Save, a new benchmark dataset and model for holistic evaluation of AI-generated video (AIGV) quality. The dataset contains nearly 10000 videos, each annotated with a scalar mean opinion score (MOS) and fine-grained attribution labels. We propose a unified evaluation model that jointly performs quality scoring and attribution-based explanation.
arXiv Detail & Related papers (2025-11-24T07:00:21Z) - CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video [9.172799792564009]
We propose CAMP-VQA, a novel NR-VQA framework that exploits the semantic understanding capabilities of large models. Our approach introduces a quality-aware video metadata mechanism that integrates key fragments extracted from inter-frame variations. Our model consistently outperforms existing NR-VQA methods, achieving improved accuracy without the need for costly manual fine-grained annotations.
arXiv Detail & Related papers (2025-11-10T16:37:47Z) - VQAThinker: Exploring Generalizable and Explainable Video Quality Assessment via Reinforcement Learning [50.34205095371895]
Video quality assessment aims to objectively quantify perceptual quality degradation. Existing VQA models suffer from two critical limitations. We propose VQAThinker, a reasoning-based VQA framework.
arXiv Detail & Related papers (2025-08-08T06:16:23Z) - Towards Generalized Video Quality Assessment: A Weak-to-Strong Learning Paradigm [76.63001244080313]
Video quality assessment (VQA) seeks to predict the perceptual quality of a video in alignment with human visual perception. The dominant VQA paradigm relies on supervised training with human-labeled datasets. We explore weak-to-strong (W2S) learning as a new paradigm for advancing VQA without reliance on large-scale human-labeled datasets.
arXiv Detail & Related papers (2025-05-06T15:29:32Z) - DVLTA-VQA: Decoupled Vision-Language Modeling with Text-Guided Adaptation for Blind Video Quality Assessment [17.85550556489256]
This paper proposes Decoupled Vision-Language Modeling with Text-Guided Adaptation for Blind Video Quality Assessment (DVLTA-VQA). A Video-Based Temporal CLIP module is proposed to explicitly model temporal dynamics and enhance motion perception, aligning with the dorsal stream. A Temporal Context Module is developed to refine inter-frame dependencies, further improving motion modeling. Finally, a text-guided adaptive fusion strategy is proposed to enable more effective integration of spatial and temporal information.
arXiv Detail & Related papers (2025-04-16T03:20:28Z) - Q-Insight: Understanding Image Quality via Visual Reinforcement Learning [27.26829134776367]
Image quality assessment (IQA) focuses on the perceptual visual quality of images, playing a crucial role in downstream tasks such as image reconstruction, compression, and generation. We propose Q-Insight, a reinforcement learning-based model built upon group relative policy optimization (GRPO). We show that Q-Insight substantially outperforms existing state-of-the-art methods in both score regression and degradation perception tasks.
arXiv Detail & Related papers (2025-03-28T17:59:54Z) - Exploring Opinion-unaware Video Quality Assessment with Semantic Affinity Criterion [52.07084862209754]
We introduce an explicit semantic affinity index for opinion-unaware VQA using text-prompts in the contrastive language-image pre-training model.
We also aggregate it with different traditional low-level naturalness indexes through Gaussian normalization and sigmoid rescaling strategies.
The proposed Blind Unified Opinion-Unaware Video Quality Index via Semantic and Technical Metric Aggregation (BUONA-VISTA) outperforms existing opinion-unaware VQA methods by at least 20%.
arXiv Detail & Related papers (2023-02-26T08:46:07Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - A Brief Survey on Adaptive Video Streaming Quality Assessment [30.253712568568876]
Quality of experience (QoE) assessment for adaptive video streaming plays a significant role in advanced network management systems.
We analyze and compare different variations of objective QoE assessment models with or without using machine learning techniques for adaptive video streaming.
We find that existing video streaming QoE assessment models still have limited performance, which makes them difficult to apply in practical communication systems.
arXiv Detail & Related papers (2022-02-25T21:38:14Z) - Learning Transformer Features for Image Quality Assessment [53.51379676690971]
We propose a unified IQA framework that utilizes a CNN backbone and a transformer encoder to extract features.
The proposed framework is compatible with both FR and NR modes and allows for a joint training scheme.
arXiv Detail & Related papers (2021-12-01T13:23:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.