Quality Assessment in the Era of Large Models: A Survey
- URL: http://arxiv.org/abs/2409.00031v1
- Date: Sat, 17 Aug 2024 13:20:55 GMT
- Title: Quality Assessment in the Era of Large Models: A Survey
- Authors: Zicheng Zhang, Yingjie Zhou, Chunyi Li, Baixuan Zhao, Xiaohong Liu, Guangtao Zhai
- Abstract summary: Before the advent of large models, quality assessment typically relied on small expert models tailored for specific tasks.
With the advancement of large models, many researchers are now leveraging the prior knowledge embedded in these large models for quality assessment tasks.
- Score: 38.07282281398898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quality assessment, which evaluates the visual quality level of multimedia experiences, has garnered significant attention from researchers and has evolved substantially through dedicated efforts. Before the advent of large models, quality assessment typically relied on small expert models tailored for specific tasks. While these smaller models are effective at handling their designated tasks and predicting quality levels, they often lack explainability and robustness. With the advancement of large models, which align more closely with human cognitive and perceptual processes, many researchers are now leveraging the prior knowledge embedded in these large models for quality assessment tasks. This emergence of quality assessment within the context of large models motivates us to provide a comprehensive review focusing on two key aspects: 1) the assessment of large models, and 2) the role of large models in assessment tasks. We begin by reflecting on the historical development of quality assessment. Subsequently, we move to detailed discussions of related works concerning quality assessment in the era of large models. Finally, we offer insights into the future progression and potential pathways for quality assessment in this new era. We hope this survey will enable a rapid understanding of the development of quality assessment in the era of large models and inspire further advancements in the field.
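To make the abstract's central idea concrete: one widely used way of tapping a large model's prior knowledge for quality assessment is zero-shot prompting of a vision-language model with antonymic text prompts (the CLIP-IQA family of methods works this way). A minimal sketch, assuming the Hugging Face transformers library and the public openai/clip-vit-base-patch32 checkpoint; the prompt pair is an illustrative choice, not something prescribed by the survey:

```python
# Minimal zero-shot quality-scoring sketch using CLIP's prior knowledge.
# Assumes: pip install torch transformers pillow. The prompt pair and the
# checkpoint are illustrative choices, not prescribed by the survey.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def quality_score(image_path: str) -> float:
    """Return a [0, 1] score: the softmax weight of the 'good photo' prompt."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=["a good photo", "a bad photo"],  # antonym prompt pair
        images=image,
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape (1, 2)
    return logits.softmax(dim=-1)[0, 0].item()

print(quality_score("example.jpg"))  # closer to 1.0 = predicted higher quality
```

The contrast with the pre-large-model paradigm described above is that no quality-specific training is needed here; a small expert model would instead be fitted to task-specific quality labels.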
Related papers
- Q-Ground: Image Quality Grounding with Large Multi-modality Models [61.72022069880346]
We introduce Q-Ground, the first framework aimed at tackling fine-scale visual quality grounding.
Q-Ground combines large multi-modality models with detailed visual quality analysis.
Central to our contribution is the introduction of the QGround-100K dataset.
arXiv Detail & Related papers (2024-07-24T06:42:46Z)
- Benchmarking the Attribution Quality of Vision Models [13.255247017616687]
We propose a novel evaluation protocol that overcomes two fundamental limitations of the widely used incremental-deletion protocol (a toy deletion curve is sketched after this entry).
This allows us to evaluate 23 attribution methods and how different design choices of popular vision backbones affect their attribution quality.
We find that intrinsically explainable models outperform standard models and that raw attribution values exhibit a higher attribution quality than what is known from previous work.
arXiv Detail & Related papers (2024-07-16T17:02:20Z)
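For context on the protocol this paper improves upon: deletion-style evaluation removes the most-attributed pixels first and tracks how quickly the model's confidence collapses; a steeper drop suggests a more faithful attribution map. A minimal NumPy sketch under that reading (all names, the fill value, and the step count are illustrative assumptions; the paper's own protocol differs precisely in the limitations its abstract mentions):

```python
# Toy deletion-protocol sketch: delete the most-attributed pixels first and
# record the model's score after each step. All names here are illustrative.
import numpy as np

def deletion_curve(model_score, image, attribution, steps=10, fill=0.0):
    """model_score: callable image -> float; attribution: (h, w) importance map."""
    h, w = attribution.shape
    order = np.argsort(attribution.ravel())[::-1]  # most important pixels first
    scores = [model_score(image)]
    masked = image.copy()
    per_step = len(order) // steps
    for s in range(steps):
        idx = order[s * per_step:(s + 1) * per_step]
        ys, xs = np.unravel_index(idx, (h, w))
        masked[ys, xs] = fill                      # "delete" those pixels
        scores.append(model_score(masked))
    return np.array(scores)  # faster decay = better attribution, by this protocol
```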
- ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can conflate fluency with truthfulness, and how cognitive uncertainty affects the reliability of Likert-style rating scores (a toy agreement check is sketched after this entry).
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z)
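On the Likert-reliability point above: a standard statistic for how consistently two annotators assign ordinal ratings is quadratic-weighted Cohen's kappa, which penalizes distant disagreements more heavily than adjacent ones. A toy sketch with scikit-learn, using fabricated ratings (kappa is one common reliability statistic, not the framework's prescribed one):

```python
# Toy check of Likert-rating reliability with quadratic-weighted Cohen's kappa.
# The two rating vectors below are fabricated illustrative data.
from sklearn.metrics import cohen_kappa_score

rater_a = [5, 4, 4, 2, 3, 5, 1, 2, 4, 3]  # 1-5 Likert scores from annotator A
rater_b = [4, 4, 5, 2, 2, 5, 1, 3, 4, 3]  # same items, annotator B

kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"weighted kappa = {kappa:.2f}")  # closer to 1 = stronger agreement
```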
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Evaluating the Generation Capabilities of Large Chinese Language Models [27.598864484231477]
This paper unveils CG-Eval, the first comprehensive, automated framework for evaluating the generative capabilities of large Chinese language models across a spectrum of academic disciplines.
Gscore automates the quality measurement of a model's text generation against reference standards (a generic reference-based baseline is sketched after this entry).
arXiv Detail & Related papers (2023-08-09T09:22:56Z)
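The summary above does not spell out how Gscore is computed, so purely as generic context: reference-based generation scoring typically reduces to some overlap measure between a hypothesis and one or more reference texts. A plain-Python unigram-F1 baseline, which is a standard such measure but not CG-Eval's actual Gscore:

```python
# Generic reference-based scoring sketch: unigram F1 between a model output
# and a reference text. A standard baseline, NOT CG-Eval's Gscore.
from collections import Counter

def unigram_f1(hypothesis: str, reference: str) -> float:
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())          # clipped word-level matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(unigram_f1("the cat sat on the mat", "the cat is on the mat"))  # ~0.83
```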
- GMValuator: Similarity-based Data Valuation for Generative Models [41.76259565672285]
We introduce Generative Model Valuator (GMValuator), the first training-free and model-agnostic approach to provide data valuation for generation tasks.
GMValuator is extensively evaluated on various datasets and generative architectures to demonstrate its effectiveness.
arXiv Detail & Related papers (2023-04-21T02:02:02Z)
- A Survey of Video-based Action Quality Assessment [6.654914040895586]
Action quality assessment models can reduce the human and material resources spent on action evaluation and curb subjectivity.
Most existing work focuses on sports and medical care.
arXiv Detail & Related papers (2022-04-20T07:29:00Z)
- Image Quality Assessment in the Modern Age [53.19271326110551]
This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA).
We will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli.
Both hand-engineered and (deep) learning-based methods will be covered.
arXiv Detail & Related papers (2021-10-19T02:38:46Z)
- Pros and Cons of GAN Evaluation Measures: New Developments [53.10151901863263]
This work is an update of a previous paper on the same topic published a few years ago.
I describe new dimensions that are becoming important in assessing models and discuss the connection between GAN evaluation and deepfakes (the most widely reported such measure, FID, is sketched after this list).
arXiv Detail & Related papers (2021-03-17T01:48:34Z)
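Among the measures this line of survey work covers, Fréchet Inception Distance (FID) is the most widely reported; it fits Gaussians to real and generated feature distributions and measures the distance between them. A minimal NumPy/SciPy sketch of the distance itself, assuming Inception (or other) features have already been extracted:

```python
# Frechet Inception Distance between two feature sets, each an
# (n_samples, dim) array. Feature extraction itself is omitted here.
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):   # numerical noise can add tiny imaginary parts
        covmean = covmean.real
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(c1 + c2 - 2.0 * covmean))

rng = np.random.default_rng(0)
print(fid(rng.normal(size=(256, 64)), rng.normal(0.5, 1.0, size=(256, 64))))
```

Lower FID indicates that the generated feature distribution sits closer to the real one; a value of 0 means the two Gaussian fits coincide.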