Multi-Branch Collaborative Learning Network for Video Quality Assessment in Industrial Video Search
- URL: http://arxiv.org/abs/2502.05924v1
- Date: Sun, 09 Feb 2025 14:57:25 GMT
- Title: Multi-Branch Collaborative Learning Network for Video Quality Assessment in Industrial Video Search
- Authors: Hengzhu Tang, Zefeng Zhang, Zhiping Li, Zhenyu Zhang, Xing Wu, Li Gao, Suqi Cheng, Dawei Yin,
- Abstract summary: In industrial systems, low-quality video characteristics fall into four categories.
These low-quality videos have been largely overlooked in academic research.
We introduce the Multi-Branch Collaborative Network (MBCN) tailored for industrial video retrieval systems.
- Score: 27.0139421302102
- License:
- Abstract: Video Quality Assessment (VQA) is vital for large-scale video retrieval systems, aimed at identifying quality issues to prioritize high-quality videos. In industrial systems, low-quality video characteristics fall into four categories: visual-related issues like mosaics and black boxes, textual issues from video titles and OCR content, and semantic issues like frame incoherence and frame-text mismatch from AI-generated videos. Despite their prevalence in industrial settings, these low-quality videos have been largely overlooked in academic research, posing a challenge for accurate identification. To address this, we introduce the Multi-Branch Collaborative Network (MBCN) tailored for industrial video retrieval systems. MBCN features four branches, each designed to tackle one of the aforementioned quality issues. After each branch independently scores videos, we aggregate these scores using a weighted approach and a squeeze-and-excitation mechanism to dynamically address quality issues across different scenarios. We implement point-wise and pair-wise optimization objectives to ensure score stability and reasonableness. Extensive offline and online experiments on a world-level video search engine demonstrate MBCN's effectiveness in identifying video quality issues, significantly enhancing the retrieval system's ranking performance. Detailed experimental analyses confirm the positive contribution of all four evaluation branches. Furthermore, MBCN significantly improves recognition accuracy for low-quality AI-generated videos compared to the baseline.
Related papers
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that Visual Question Answering (VQA) can enhance markedly low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatial-temporal quality details in videos.
arXiv Detail & Related papers (2024-11-06T09:39:52Z) - Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs [76.15356325947731]
We introduce Q-Bench-Video, a new benchmark specifically designed to evaluate LMMs' proficiency in discerning video quality.
We collect a total of 2,378 question-answer pairs and test them on 12 open-source & 5 proprietary LMMs.
Our findings indicate that while LMMs have a foundational understanding of video quality, their performance remains incomplete and imprecise, with a notable discrepancy compared to human performance.
arXiv Detail & Related papers (2024-09-30T08:05:00Z) - LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models [53.64461404882853]
Video quality assessment (VQA) algorithms are needed to monitor and optimize the quality of streaming videos.
Here, we propose the first Large Multi-Modal Video Quality Assessment (LMM-VQA) model, which introduces a novel visual modeling strategy for quality-aware feature extraction.
arXiv Detail & Related papers (2024-08-26T04:29:52Z) - Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap [4.922783970210658]
We categorize the assessment of AIGC video quality into three dimensions: visual harmony, video-text consistency, and domain distribution gap.
For each dimension, we design specific modules to provide a comprehensive quality assessment of AIGC videos.
Our research identifies significant variations in visual quality, fluidity, and style among videos generated by different text-to-video models.
arXiv Detail & Related papers (2024-04-21T08:27:20Z) - Perceptual Video Quality Assessment: A Survey [63.61214597655413]
Perceptual video quality assessment plays a vital role in the field of video processing.
Various subjective and objective video quality assessment studies have been conducted over the past two decades.
This survey provides an up-to-date and comprehensive review of these video quality assessment studies.
arXiv Detail & Related papers (2024-02-05T16:13:52Z) - Towards Explainable In-the-Wild Video Quality Assessment: A Database and
a Language-Prompted Approach [52.07084862209754]
We collect over two million opinions on 4,543 in-the-wild videos on 13 dimensions of quality-related factors.
Specifically, we ask the subjects to label among a positive, a negative, and a neutral choice for each dimension.
These explanation-level opinions allow us to measure the relationships between specific quality factors and abstract subjective quality ratings.
arXiv Detail & Related papers (2023-05-22T05:20:23Z) - Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video
Quality Assessment [54.31355080688127]
We introduce a text-prompted Semantic Affinity Quality Index (SAQI) and its localized version (SAQI-Local) using Contrastive Language-Image Pre-training (CLIP)
BVQI-Local demonstrates unprecedented performance, surpassing existing zero-shot indices by at least 24% on all datasets.
We conduct comprehensive analyses to investigate different quality concerns of distinct indices, demonstrating the effectiveness and rationality of our design.
arXiv Detail & Related papers (2023-04-28T08:06:05Z) - Deep Quality Assessment of Compressed Videos: A Subjective and Objective
Study [23.3509109592315]
In the video coding process, the perceived quality of a compressed video is evaluated by full-reference quality evaluation metrics.
To solve this problem, it is critical to design no-reference compressed video quality assessment algorithms.
In this work, a semi-automatic labeling method is adopted to build a large-scale compressed video quality database.
arXiv Detail & Related papers (2022-05-07T10:50:06Z) - Blindly Assess Quality of In-the-Wild Videos via Quality-aware
Pre-training and Motion Perception [32.87570883484805]
We propose to transfer knowledge from image quality assessment (IQA) databases with authentic distortions and large-scale action recognition with rich motion patterns.
We train the proposed model on the target VQA databases using a mixed list-wise ranking loss function.
arXiv Detail & Related papers (2021-08-19T05:29:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.