AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
- URL: http://arxiv.org/abs/2504.21308v1
- Date: Wed, 30 Apr 2025 04:36:56 GMT
- Title: AGHI-QA: A Subjective-Aligned Dataset and Metric for AI-Generated Human Images
- Authors: Yunhao Li, Sijing Wu, Wei Sun, Zhichao Zhang, Yucheng Zhu, Zicheng Zhang, Huiyu Duan, Xiongkuo Min, Guangtao Zhai
- Abstract summary: We introduce AGHI-QA, the first large-scale benchmark specifically designed for quality assessment of AI-generated human images (AGHIs). The dataset comprises 4,000 images generated from 400 carefully crafted text prompts using 10 state-of-the-art T2I models. We conduct a systematic subjective study to collect multidimensional annotations, including perceptual quality scores, text-image correspondence scores, and visible and distorted body part labels.
- Score: 58.87047247313503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid development of text-to-image (T2I) generation approaches has attracted extensive interest in evaluating the quality of generated images, leading to the development of various quality assessment methods for general-purpose T2I outputs. However, existing image quality assessment (IQA) methods are limited to providing global quality scores and fail to deliver fine-grained perceptual evaluations for structurally complex subjects such as humans, a critical gap given the frequent anatomical and textural distortions in AI-generated human images (AGHIs). To address this gap, we introduce AGHI-QA, the first large-scale benchmark specifically designed for quality assessment of AGHIs. The dataset comprises 4,000 images generated from 400 carefully crafted text prompts using 10 state-of-the-art T2I models. We conduct a systematic subjective study to collect multidimensional annotations, including perceptual quality scores, text-image correspondence scores, and visible and distorted body part labels. Based on AGHI-QA, we evaluate the strengths and weaknesses of current T2I methods in generating human images across multiple dimensions. Furthermore, we propose AGHI-Assessor, a novel quality metric that integrates a large multimodal model (LMM) with domain-specific human features for precise quality prediction and identification of visible and distorted body parts in AGHIs. Extensive experimental results demonstrate that AGHI-Assessor achieves state-of-the-art performance, significantly outperforming existing IQA methods in multidimensional quality assessment and surpassing leading LMMs in detecting structural distortions in AGHIs.
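As context for the comparisons reported above, quality metrics such as AGHI-Assessor are conventionally validated by correlating predicted scores with the subjective (mean opinion) scores collected in the benchmark. The sketch below shows this standard SRCC/PLCC protocol; it is a minimal illustration under assumed inputs, not the authors' implementation, and the array names are hypothetical.

```python
# Minimal sketch of the standard IQA evaluation protocol: Spearman (SRCC) and
# Pearson (PLCC) correlation between a model's predicted quality scores and
# subjective mean opinion scores (MOS). Names and data are illustrative only.
import numpy as np
from scipy.stats import spearmanr, pearsonr

def correlation_report(predicted: np.ndarray, mos: np.ndarray) -> dict:
    """Return rank (SRCC) and linear (PLCC) correlation between predictions and MOS."""
    srcc, _ = spearmanr(predicted, mos)
    plcc, _ = pearsonr(predicted, mos)
    return {"SRCC": float(srcc), "PLCC": float(plcc)}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mos = rng.uniform(1.0, 5.0, size=200)             # hypothetical subjective scores
    predicted = mos + rng.normal(0.0, 0.5, size=200)  # a noisy hypothetical predictor
    print(correlation_report(predicted, mos))
```

Higher SRCC/PLCC values indicate closer agreement with human judgments; per-dimension scores (e.g., perceptual quality vs. text-image correspondence) would be evaluated separately in the same way.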
Related papers
- Towards Explainable Partial-AIGC Image Quality Assessment [51.42831861127991]
Despite extensive research on image quality assessment (IQA) for AI-generated images (AGIs), most studies focus on fully AI-generated outputs. We construct the first large-scale PAI dataset towards explainable partial-AIGC image quality assessment (EPAIQA). Our work represents a pioneering effort in the perceptual IQA field for comprehensive PAI quality assessment.
arXiv Detail & Related papers (2025-04-12T17:27:50Z) - D-Judge: How Far Are We? Evaluating the Discrepancies Between AI-synthesized Images and Natural Images through Multimodal Guidance [19.760989919485894]
We introduce an AI-Natural Image Discrepancy assessment benchmark (D-Judge). We construct D-ANI, a dataset with 5,000 natural images and over 440,000 AIGIs generated by nine models using Text-to-Image (T2I), Image-to-Image (I2I), and Text-and-Image-to-Image (TI2I) prompts. Our framework evaluates the discrepancy across five dimensions: naive image quality, semantic alignment, aesthetic appeal, downstream applicability, and human validation.
arXiv Detail & Related papers (2024-12-23T15:08:08Z) - AI-generated Image Quality Assessment in Visual Communication [72.11144790293086]
AIGI-VC is a quality assessment database for AI-generated images in visual communication. The dataset consists of 2,500 images spanning 14 advertisement topics and 8 emotion types. It provides coarse-grained human preference annotations and fine-grained preference descriptions, benchmarking the abilities of IQA methods in preference prediction, interpretation, and reasoning.
arXiv Detail & Related papers (2024-12-20T08:47:07Z) - Q-Ground: Image Quality Grounding with Large Multi-modality Models [61.72022069880346]
We introduce Q-Ground, the first framework aimed at tackling fine-scale visual quality grounding.
Q-Ground combines large multi-modality models with detailed visual quality analysis.
Central to our contribution is the introduction of the QGround-100K dataset.
arXiv Detail & Related papers (2024-07-24T06:42:46Z) - PKU-AIGIQA-4K: A Perceptual Quality Assessment Database for Both Text-to-Image and Image-to-Image AI-Generated Images [1.5265677582796984]
We establish a large-scale perceptual quality assessment database for both text-to-image and image-to-image AIGIs, named PKU-AIGIQA-4K.
We propose three image quality assessment (IQA) methods based on pre-trained models that include a no-reference method NR-AIGCIQA, a full-reference method FR-AIGCIQA, and a partial-reference method PR-AIGCIQA.
arXiv Detail & Related papers (2024-04-29T03:57:43Z) - G-Refine: A General Quality Refiner for Text-to-Image Generation [74.16137826891827]
We introduce G-Refine, a general image quality refiner designed to enhance low-quality images without compromising the integrity of high-quality ones.
The model is composed of three interconnected modules: a perception quality indicator, an alignment quality indicator, and a general quality enhancement module.
Extensive experimentation reveals that AIGIs processed by G-Refine outperform their unrefined counterparts in 10+ quality metrics across 4 databases.
arXiv Detail & Related papers (2024-04-29T00:54:38Z) - AIGIQA-20K: A Large Database for AI-Generated Image Quality Assessment [54.93996119324928]
We create the largest AIGI subjective quality database to date with 20,000 AIGIs and 420,000 subjective ratings, known as AIGIQA-20K.
We conduct benchmark experiments on this database to assess the correspondence between 16 mainstream AIGI quality models and human perception.
arXiv Detail & Related papers (2024-04-04T12:12:24Z) - PKU-I2IQA: An Image-to-Image Quality Assessment Database for AI-Generated Images [1.6031185986328562]
We establish a human perception-based image-to-image AIGCIQA database, named PKU-I2IQA.
We propose two benchmark models: NR-AIGCIQA based on the no-reference image quality assessment method and FR-AIGCIQA based on the full-reference image quality assessment method.
arXiv Detail & Related papers (2023-11-27T05:53:03Z) - A Perceptual Quality Assessment Exploration for AIGC Images [39.72512063793346]
In this paper, we discuss the major evaluation aspects such as technical issues, AI artifacts, unnaturalness, discrepancy, and aesthetics for AGI quality assessment.
We present the first perceptual AGI quality assessment database, AGIQA-1K, which consists of 1,080 AGIs generated from diffusion models.
arXiv Detail & Related papers (2023-03-22T14:59:49Z)