Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities
- URL: http://arxiv.org/abs/2410.08534v2
- Date: Sun, 20 Oct 2024 00:00:39 GMT
- Title: Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities
- Authors: Abhijay Ghildyal, Yuanhan Chen, Saman Zadtootaghaj, Nabajeet Barman, Alan C. Bovik,
- Abstract summary: AI-generated and enhanced content must be visually accurate, adhere to intended use, and maintain high visual quality.
One way to monitor and control the visual "quality" of AI-generated and -enhanced content is by deploying Image Quality Assessment (IQA) and Video Quality Assessment (VQA) models.
This paper examines the current shortcomings and possibilities presented by AI-generated and enhanced image and video content.
- Score: 32.03360188710995
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The advent of AI has influenced many aspects of human life, from self-driving cars and intelligent chatbots to text-based image and video generation models capable of creating realistic images and videos based on user prompts (text-to-image, image-to-image, and image-to-video). AI-based methods for image and video super resolution, video frame interpolation, denoising, and compression have already gathered significant attention and interest in the industry and some solutions are already being implemented in real-world products and services. However, to achieve widespread integration and acceptance, AI-generated and enhanced content must be visually accurate, adhere to intended use, and maintain high visual quality to avoid degrading the end user's quality of experience (QoE). One way to monitor and control the visual "quality" of AI-generated and -enhanced content is by deploying Image Quality Assessment (IQA) and Video Quality Assessment (VQA) models. However, most existing IQA and VQA models measure visual fidelity in terms of "reconstruction" quality against a pristine reference content and were not designed to assess the quality of "generative" artifacts. To address this, newer metrics and models have recently been proposed, but their performance evaluation and overall efficacy have been limited by datasets that were too small or otherwise lack representative content and/or distortion capacity; and by performance measures that can accurately report the success of an IQA/VQA model for "GenAI". This paper examines the current shortcomings and possibilities presented by AI-generated and enhanced image and video content, with a particular focus on end-user perceived quality. Finally, we discuss open questions and make recommendations for future work on the "GenAI" quality assessment problems, towards further progressing on this interesting and relevant field of research.
Related papers
- AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication.
The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types.
It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z) - Visual Verity in AI-Generated Imagery: Computational Metrics and Human-Centric Analysis [0.0]
We introduce and validated a questionnaire called Visual Verity, which measures photorealism, image quality, and text-image alignment.
We also analyzed statistical properties, finding that camera-generated images scored lower in hue, saturation, and brightness.
Our findings highlight the need for refining computational metrics to better capture human visual perception.
arXiv Detail & Related papers (2024-08-22T23:29:07Z) - Benchmarking Multi-dimensional AIGC Video Quality Assessment: A Dataset and Unified Model [56.03592388332793]
We investigate the AIGC-VQA problem, considering both subjective and objective quality assessment perspectives.
For the subjective perspective, we construct the Large-scale Generated Video Quality assessment (LGVQ) dataset, consisting of 2,808 AIGC videos.
We evaluate the perceptual quality of AIGC videos from three critical dimensions: spatial quality, temporal quality, and text-video alignment.
We propose the Unify Generated Video Quality assessment (UGVQ) model, designed to accurately evaluate the multi-dimensional quality of AIGC videos.
arXiv Detail & Related papers (2024-07-31T07:54:26Z) - Vision Language Modeling of Content, Distortion and Appearance for Image Quality Assessment [20.85110284579424]
Distilling high level knowledge about quality bearing attributes is crucial for developing objective Image Quality Assessment (IQA)
We present a new blind IQA (BIQA) model termed Self-supervision and Vision-Language supervision Image QUality Evaluator (SLIQUE)
SLIQUE features a joint vision-language and visual contrastive representation learning framework for acquiring high level knowledge about the images semantic contents, distortion characteristics and appearance properties for IQA.
arXiv Detail & Related papers (2024-06-14T09:18:28Z) - Quality Assessment for AI Generated Images with Instruction Tuning [58.41087653543607]
We first establish a novel Image Quality Assessment (IQA) database for AIGIs, termed AIGCIQA2023+.
This paper presents a MINT-IQA model to evaluate and explain human preferences for AIGIs from Multi-perspectives with INstruction Tuning.
arXiv Detail & Related papers (2024-05-12T17:45:11Z) - Large Multi-modality Model Assisted AI-Generated Image Quality Assessment [53.182136445844904]
We introduce a large Multi-modality model Assisted AI-Generated Image Quality Assessment (MA-AGIQA) model.
It uses semantically informed guidance to sense semantic information and extract semantic vectors through carefully designed text prompts.
It achieves state-of-the-art performance, and demonstrates its superior generalization capabilities on assessing the quality of AI-generated images.
arXiv Detail & Related papers (2024-04-27T02:40:36Z) - Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation
Evaluation [96.74302670358145]
We introduce an automated method for Visual Concept Evaluation (ViCE) to assess consistency between a generated/edited image and the corresponding prompt/instructions.
ViCE combines the strengths of Large Language Models (LLMs) and Visual Question Answering (VQA) into a unified pipeline, aiming to replicate the human cognitive process in quality assessment.
arXiv Detail & Related papers (2023-07-18T16:33:30Z) - Helping Visually Impaired People Take Better Quality Pictures [52.03016269364854]
We develop tools to help visually impaired users minimize occurrences of common technical distortions.
We also create a prototype feedback system that helps to guide users to mitigate quality issues.
arXiv Detail & Related papers (2023-05-14T04:37:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.