Full Reference Video Quality Assessment for Machine Learning-Based Video
Codecs
- URL: http://arxiv.org/abs/2309.00769v1
- Date: Sat, 2 Sep 2023 00:32:26 GMT
- Title: Full Reference Video Quality Assessment for Machine Learning-Based Video
Codecs
- Authors: Abrar Majeedi, Babak Naderi, Yasaman Hosseinkashi, Juhee Cho, Ruben
Alvarez Martinez, Ross Cutler
- Abstract summary: We show that existing evaluation metrics that were designed and trained on DSP-based video codecs are not highly correlated to subjective opinion when used with ML video codecs.
We propose a new full reference video quality assessment (FRVQA) model that achieves a Pearson Correlation Coefficient (PCC) of 0.99 and a Spearman's Rank Correlation Coefficient (SRCC) of 0.99 at the model level.
- Score: 12.024300171633758
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning-based video codecs have made significant progress in the
past few years. A critical area in the development of ML-based video codecs is
an accurate evaluation metric that does not require an expensive and slow
subjective test. We show that existing evaluation metrics that were designed
and trained on DSP-based video codecs are not highly correlated to subjective
opinion when used with ML video codecs, because the artifacts produced by
ML-based and DSP-based codecs are quite different. We provide a new dataset of ML video
codec videos that have been accurately labeled for quality. We also propose a
new full reference video quality assessment (FRVQA) model that achieves a
Pearson Correlation Coefficient (PCC) of 0.99 and a Spearman's Rank Correlation
Coefficient (SRCC) of 0.99 at the model level. We make the dataset and FRVQA
model open source to help accelerate research in ML video codecs, and so that
others can further improve the FRVQA model.
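For readers unfamiliar with the reported metrics: PCC and SRCC are correlations between predicted and subjective quality scores, and "model level" means per-clip scores are first aggregated per codec before correlating. The following is a minimal sketch of such a computation with hypothetical column names; it is an illustration, not the authors' released code.

```python
# Minimal sketch (not the authors' code): model-level PCC and SRCC are
# typically computed after averaging per-clip scores for each codec under
# test. Column names here are hypothetical.
import pandas as pd
from scipy.stats import pearsonr, spearmanr

def model_level_correlation(df: pd.DataFrame):
    """df has one row per clip with hypothetical columns:
    'codec'          -- which (ML or DSP) codec produced the clip
    'subjective_mos' -- mean opinion score from the subjective test
    'predicted_mos'  -- score from the FRVQA model under evaluation
    """
    # Aggregate per codec, then correlate the aggregated scores.
    per_model = df.groupby("codec")[["subjective_mos", "predicted_mos"]].mean()
    pcc, _ = pearsonr(per_model["predicted_mos"], per_model["subjective_mos"])
    srcc, _ = spearmanr(per_model["predicted_mos"], per_model["subjective_mos"])
    return pcc, srcc
```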
Related papers
- MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding [67.56182262082729]
We introduce MMBench-Video, a quantitative benchmark to rigorously evaluate large vision-language models (LVLMs) in video understanding.
MMBench-Video incorporates lengthy videos from YouTube and employs free-form questions, mirroring practical use cases.
The benchmark is meticulously crafted to probe the models' temporal reasoning skills, with all questions human-annotated according to a carefully constructed ability taxonomy.
arXiv Detail & Related papers (2024-06-20T17:26:01Z)
- Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs [20.168429351519055]
Video understanding is a crucial next step for multimodal large language models (MLLMs).
We propose VideoNIAH (Video Needle In A Haystack), a benchmark construction framework through synthetic video generation.
We conduct a comprehensive evaluation of both proprietary and open-source models, uncovering significant differences in their video understanding capabilities.
arXiv Detail & Related papers (2024-06-13T17:50:05Z)
- Knowledge Guided Semi-Supervised Learning for Quality Assessment of User Generated Videos [9.681456357957819]
We design a self-supervised framework to generate robust quality aware features for videos.
We then propose a dual-model based Semi-Supervised Learning (SSL) method specifically designed for the video quality assessment task.
Our SSL-VQA method uses the ST-VQRL backbone to produce robust performance across various VQA datasets.
Our model improves on state-of-the-art performance by around 10% when trained with only limited labelled data, and by around 15% when unlabelled data is also used in SSL.
arXiv Detail & Related papers (2023-12-24T07:32:03Z)
- LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls [22.579711841384764]
We present a data-driven approach for modeling such distortions automatically by training an LSTM with subjective quality ratings labeled via crowdsourcing.
We applied QR codes as markers on the source videos to create aligned references and compute temporal features based on the alignment vectors.
Our proposed model achieves a PCC of 0.99 on the validation set and gives detailed insight into the cause of video quality impairments.
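To make the QR-marker alignment described above concrete, here is a minimal sketch (an illustration only, not the paper's implementation): each source frame is assumed to carry a QR code encoding its frame index, so decoding the received video yields an alignment vector from which simple temporal features such as freeze and skip ratios can be derived.

```python
# Minimal sketch of the QR-marker alignment idea (not the authors' code):
# decode a per-frame QR index from the received video and derive temporal
# features from the resulting alignment vector.
import cv2

def alignment_vector(received_video_path: str) -> list[int]:
    """Return, for each received frame, the decoded source-frame index
    (-1 if the QR marker could not be read)."""
    detector = cv2.QRCodeDetector()
    cap = cv2.VideoCapture(received_video_path)
    alignment = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        data, _, _ = detector.detectAndDecode(frame)
        alignment.append(int(data) if data.isdigit() else -1)
    cap.release()
    return alignment

def temporal_features(alignment: list[int]) -> dict:
    """Simple temporal features derived from the alignment vector."""
    diffs = [b - a for a, b in zip(alignment, alignment[1:]) if a >= 0 and b >= 0]
    return {
        "freeze_ratio": sum(d == 0 for d in diffs) / max(len(diffs), 1),
        "skip_ratio": sum(d > 1 for d in diffs) / max(len(diffs), 1),
    }
```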
arXiv Detail & Related papers (2023-03-22T17:14:38Z)
- Video compression dataset and benchmark of learning-based video-quality metrics [55.41644538483948]
We present a new benchmark for video-quality metrics that evaluates video compression.
It is based on a new dataset consisting of about 2,500 streams encoded using different standards.
Subjective scores were collected using crowdsourced pairwise comparisons.
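Pairwise preferences like these are commonly aggregated into per-stream quality scores with a Bradley-Terry style model; the sketch below illustrates that standard aggregation step and is an assumption, not necessarily this benchmark's exact pipeline.

```python
# Minimal sketch (an assumption, not the benchmark's pipeline): turn a matrix
# of pairwise win counts into per-stream scores with a Bradley-Terry model
# fitted by the standard MM update. wins[i, j] = times stream i beat stream j.
import numpy as np

def bradley_terry(wins: np.ndarray, iters: int = 200) -> np.ndarray:
    """Return a log-strength score per stream from a pairwise win-count matrix."""
    n = wins.shape[0]
    p = np.ones(n)
    for _ in range(iters):
        for i in range(n):
            num = wins[i].sum()                      # total wins of stream i
            den = sum((wins[i, j] + wins[j, i]) / (p[i] + p[j])
                      for j in range(n) if j != i)   # comparison exposure
            p[i] = num / den if den > 0 else p[i]
        p /= p.sum()                                 # fix the overall scale
    return np.log(p)
```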
arXiv Detail & Related papers (2022-11-22T09:22:28Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- Objective video quality metrics application to video codecs comparisons: choosing the best for subjective quality estimation [101.18253437732933]
Quality assessment plays a key role in creating and comparing video compression algorithms.
For comparison, we used a set of videos encoded with video codecs of different standards, and visual quality scores collected for the resulting set of streams from 2018 to 2021.
arXiv Detail & Related papers (2021-07-21T17:18:11Z)
- ELF-VC: Efficient Learned Flexible-Rate Video Coding [61.10102916737163]
We propose several novel ideas for learned video compression which allow for improved performance for the low-latency mode.
We benchmark our method, which we call ELF-VC, on popular video test sets UVG and MCL-JCV.
Our approach runs at least 5x faster and has fewer parameters than all ML codecs which report these figures.
arXiv Detail & Related papers (2021-04-29T17:50:35Z)
- UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content [59.13821614689478]
Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of content are unpredictable, complicated, and often commingled.
Here we contribute to advancing the problem by conducting a comprehensive evaluation of leading VQA models.
By employing a feature selection strategy on top of leading VQA model features, we select 60 of the 763 statistical features used by those models to build a fused model, VIDEVAL.
Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models.
arXiv Detail & Related papers (2020-05-29T00:39:20Z)