Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images
- URL: http://arxiv.org/abs/2405.09426v2
- Date: Thu, 16 May 2024 17:01:48 GMT
- Title: Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images
- Authors: Memoona Aziz, Umair Rehman, Muhammad Umair Danish, Katarina Grolinger
- Abstract summary: The Global-Local Image Perceptual Score (GLIPS) is an image metric designed to assess the photorealistic image quality of AI-generated images.
Comprehensive tests across various generative models demonstrate that GLIPS consistently outperforms existing metrics like FID, SSIM, and MS-SSIM in terms of correlation with human scores.
- Score: 0.7499722271664147
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces the Global-Local Image Perceptual Score (GLIPS), an image metric designed to assess the photorealistic image quality of AI-generated images with a high degree of alignment to human visual perception. Traditional metrics such as FID and KID scores do not align closely with human evaluations. The proposed metric incorporates advanced transformer-based attention mechanisms to assess local similarity and Maximum Mean Discrepancy (MMD) to evaluate global distributional similarity. To evaluate the performance of GLIPS, we conducted a human study on photorealistic image quality. Comprehensive tests across various generative models demonstrate that GLIPS consistently outperforms existing metrics like FID, SSIM, and MS-SSIM in terms of correlation with human scores. Additionally, we introduce the Interpolative Binning Scale (IBS), a refined scaling method that enhances the interpretability of metric scores by aligning them more closely with human evaluative standards. The proposed metric and scaling approach not only provide more reliable assessments of AI-generated images but also suggest pathways for future enhancements in image generation technologies.
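To make the two components concrete, the following minimal sketch (an illustrative NumPy approximation, not the authors' implementation) pairs an unbiased RBF-kernel MMD estimate for the global term with a mean patch-wise cosine similarity standing in for the transformer-attention local term; the feature extractor, patch definition, and the weighting `alpha` are all assumptions.
```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Unbiased squared MMD between feature sets X (m, d) and Y (n, d)
    with a Gaussian RBF kernel -- the global distributional term."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    m, n = len(X), len(Y)
    np.fill_diagonal(Kxx, 0.0)
    np.fill_diagonal(Kyy, 0.0)
    return (Kxx.sum() / (m * (m - 1))
            + Kyy.sum() / (n * (n - 1))
            - 2.0 * Kxy.mean())

def local_similarity(feats_real, feats_gen):
    """Mean cosine similarity between corresponding patch features,
    used here as a crude stand-in for the attention-based local term."""
    a = feats_real / np.linalg.norm(feats_real, axis=1, keepdims=True)
    b = feats_gen / np.linalg.norm(feats_gen, axis=1, keepdims=True)
    return float((a * b).sum(axis=1).mean())

def glips_like_score(feats_real, feats_gen, alpha=0.5):
    """Hypothetical combination of the two terms into one distance-like
    score (lower = more similar); the weighting is an assumption."""
    global_term = rbf_mmd2(feats_real, feats_gen)
    local_term = 1.0 - local_similarity(feats_real, feats_gen)
    return alpha * local_term + (1.0 - alpha) * global_term
```
In practice `feats_real` and `feats_gen` would be patch embeddings from a pretrained vision transformer; the paper's exact attention mechanism and term weighting are not reproduced here.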
Related papers
- Visual Verity in AI-Generated Imagery: Computational Metrics and Human-Centric Analysis [0.0]
We introduce and validate a questionnaire called Visual Verity, which measures photorealism, image quality, and text-image alignment.
We also analyzed statistical properties, finding that camera-generated images scored lower in hue, saturation, and brightness.
Our findings highlight the need for refining computational metrics to better capture human visual perception.
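The hue, saturation, and brightness comparison can be illustrated with a short sketch (Pillow + NumPy; a generic re-creation, not the paper's analysis pipeline):
```python
import numpy as np
from PIL import Image

def hsv_stats(path):
    """Return the mean hue, saturation, and brightness (value) of an
    image, each scaled to [0, 1]."""
    img = Image.open(path).convert("RGB").convert("HSV")
    hsv = np.asarray(img, dtype=np.float32) / 255.0
    return hsv[..., 0].mean(), hsv[..., 1].mean(), hsv[..., 2].mean()

# Comparing camera-generated and AI-generated sets then reduces to
# comparing these per-image means, e.g. with a significance test.
```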
arXiv Detail & Related papers (2024-08-22T23:29:07Z)
- DiffuSyn Bench: Evaluating Vision-Language Models on Real-World Complexities with Diffusion-Generated Synthetic Benchmarks [0.0]
This study assesses the ability of Large Vision-Language Models (LVLMs) to differentiate between AI-generated and human-generated images.
It introduces a new automated benchmark construction method for this evaluation.
arXiv Detail & Related papers (2024-06-06T19:50:33Z)
- RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
- Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA).
Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
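One standard way to pair deep features with a statistical model in an opinion-unaware setting, shown here only as a hedged sketch rather than this paper's method, is to fit a multivariate Gaussian to features of pristine images and score a test image by its distance to that fit:
```python
import numpy as np

def fit_pristine_model(pristine_feats):
    """Fit a multivariate Gaussian to deep features of pristine images.
    pristine_feats: (num_images, feat_dim) array from a pre-trained network."""
    mu = pristine_feats.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(pristine_feats, rowvar=False))
    return mu, cov_inv

def quality_distance(test_feat, mu, cov_inv):
    """Mahalanobis-style distance: larger values mean the test image's
    features lie further from pristine statistics, which we take here
    (hypothetically) as an indication of lower quality."""
    d = test_feat - mu
    return float(np.sqrt(d @ cov_inv @ d))
```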
arXiv Detail & Related papers (2024-05-29T06:09:34Z)
- GMC-IQA: Exploiting Global-correlation and Mean-opinion Consistency for No-reference Image Quality Assessment [40.33163764161929]
We construct a novel loss function and network to exploit Global-correlation and Mean-opinion Consistency.
We propose a novel GCC loss based on pairwise preference-based rank estimation to address the non-differentiability of SROCC.
We also propose a mean-opinion network, which integrates diverse opinion features to alleviate the randomness of weight learning.
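Because SROCC depends on hard ranks it is non-differentiable; a pairwise, preference-based surrogate of the kind described above can be sketched as follows (an assumed PyTorch formulation, not the paper's exact GCC loss):
```python
import torch

def pairwise_rank_loss(pred, mos, tau=1.0):
    """Differentiable surrogate for rank correlation.
    pred: (N,) predicted quality scores; mos: (N,) mean opinion scores.
    For every ordered pair of images, penalize soft orderings of the
    predictions that disagree with the ordering of the human scores."""
    dp = pred.unsqueeze(0) - pred.unsqueeze(1)   # predicted score differences
    dm = mos.unsqueeze(0) - mos.unsqueeze(1)     # human score differences
    target = (dm > 0).float()                    # 1 where the pair's ground-truth order is "higher"
    prob = torch.sigmoid(dp / tau)               # soft probability of that order
    mask = (dm.abs() > 1e-6).float()             # ignore tied pairs
    bce = -(target * torch.log(prob + 1e-8)
            + (1 - target) * torch.log(1 - prob + 1e-8))
    return (bce * mask).sum() / mask.sum().clamp(min=1.0)
```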
arXiv Detail & Related papers (2024-01-19T06:03:01Z)
- Metrics to Quantify Global Consistency in Synthetic Medical Images [6.863780677964219]
We introduce two metrics that can measure the global consistency of synthetic images on a per-image basis.
We quantify global consistency by predicting and comparing explicit attributes of images on patches using supervised trained neural networks.
Our results demonstrate that predicting explicit attributes of synthetic images on patches can distinguish globally consistent from inconsistent images.
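A rough sketch of the patch-based procedure follows; `attribute_net` is a placeholder for the supervised per-patch predictor and is purely an assumption:
```python
import numpy as np

def patch_grid(image, patch_size):
    """Split an (H, W, C) image into non-overlapping patches."""
    H, W = image.shape[:2]
    return [image[i:i + patch_size, j:j + patch_size]
            for i in range(0, H - patch_size + 1, patch_size)
            for j in range(0, W - patch_size + 1, patch_size)]

def global_consistency(image, attribute_net, patch_size=64):
    """Predict an explicit attribute on every patch and use the spread
    of the predictions as a consistency score: a low standard deviation
    suggests a globally consistent image."""
    preds = np.array([attribute_net(p) for p in patch_grid(image, patch_size)])
    return float(preds.std())
```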
arXiv Detail & Related papers (2023-08-01T09:29:39Z)
- MAUVE Scores for Generative Models: Theory and Practice [95.86006777961182]
We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images.
We find that MAUVE can quantify the gaps between the distributions of human-written text and those of modern neural language models.
We demonstrate in the vision domain that MAUVE can identify known properties of generated images on par with or better than existing metrics.
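The divergence-frontier idea behind MAUVE can be approximated in a few lines (k-means quantization plus a KL-based frontier; a simplified stand-in, not the official implementation):
```python
import numpy as np
from sklearn.cluster import KMeans

def histogram(labels, k):
    counts = np.bincount(labels, minlength=k).astype(float)
    return counts / counts.sum()

def kl(p, q, eps=1e-10):
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def mauve_like(p_feats, q_feats, k=50, n_mix=25, scale=5.0):
    """Quantize both feature sets with a shared k-means codebook, trace
    a divergence frontier over mixtures of the two histograms, and
    report an area-under-curve summary (closer to 1 = closer sets).
    `k`, `n_mix`, and `scale` are illustrative defaults, not MAUVE's."""
    km = KMeans(n_clusters=k, n_init=10).fit(np.vstack([p_feats, q_feats]))
    p = histogram(km.predict(p_feats), k)
    q = histogram(km.predict(q_feats), k)
    lambdas = np.linspace(0.01, 0.99, n_mix)
    # Anchor the curve at (0, 1) and (1, 0) so identical sets score ~1.
    xs = [0.0] + [np.exp(-scale * kl(q, l * p + (1 - l) * q)) for l in lambdas] + [1.0]
    ys = [1.0] + [np.exp(-scale * kl(p, l * p + (1 - l) * q)) for l in lambdas] + [0.0]
    order = np.argsort(xs)
    return float(np.trapz(np.array(ys)[order], np.array(xs)[order]))
```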
arXiv Detail & Related papers (2022-12-30T07:37:40Z)
- Evaluating the Quality and Diversity of DCGAN-based Generatively Synthesized Diabetic Retinopathy Imagery [0.07499722271664144]
Publicly available diabetic retinopathy (DR) datasets are imbalanced, containing limited numbers of images with DR.
The imbalance can be addressed using Generative Adversarial Networks (GANs) to augment the datasets with synthetic images.
To evaluate the quality and diversity of synthetic images, several evaluation metrics, such as Multi-Scale Structural Similarity Index (MS-SSIM), Cosine Distance (CD), and Fréchet Inception Distance (FID), are used.
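Of these, Cosine Distance is the simplest to sketch: the diversity of a synthetic set can be summarized by the average pairwise cosine distance between image feature vectors (a minimal illustration; the choice of feature extractor is an assumption left open here):
```python
import numpy as np

def mean_pairwise_cosine_distance(feats):
    """Average pairwise cosine distance over a set of feature vectors
    (rows of `feats`); higher values suggest a more diverse image set."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sims = f @ f.T                                # cosine similarity matrix
    n = len(f)
    mean_sim = (sims.sum() - np.trace(sims)) / (n * (n - 1))
    return float(1.0 - mean_sim)
```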
arXiv Detail & Related papers (2022-08-10T23:50:01Z)
- Conformer and Blind Noisy Students for Improved Image Quality Assessment [80.57006406834466]
Learning-based approaches for perceptual image quality assessment (IQA) usually require both the distorted and reference image for measuring the perceptual quality accurately.
In this work, we explore the performance of transformer-based full-reference IQA models.
We also propose a method for IQA based on semi-supervised knowledge distillation from full-reference teacher models into blind student models.
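The distillation setup reads roughly as: a full-reference teacher scores distorted/reference pairs, and a blind (no-reference) student learns to reproduce those scores from the distorted image alone. A hedged PyTorch sketch, where the model interfaces and the MSE objective are assumptions:
```python
import torch
import torch.nn as nn

def distillation_step(teacher, student, distorted, reference, optimizer):
    """One semi-supervised distillation step: the full-reference teacher
    provides pseudo-labels, and the blind student is trained to match
    them without ever seeing the reference image."""
    with torch.no_grad():
        pseudo_quality = teacher(distorted, reference)   # FR teacher prediction
    pred = student(distorted)                            # NR student prediction
    loss = nn.functional.mse_loss(pred, pseudo_quality)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```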
arXiv Detail & Related papers (2022-04-27T10:21:08Z)
- Generalized Visual Quality Assessment of GAN-Generated Face Images [79.47386781978531]
We study subjective and objective quality assessment of GAN-generated face images (GFIs), working toward generalized quality assessment.
We develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms.
arXiv Detail & Related papers (2022-01-28T07:54:49Z)
- Transparent Human Evaluation for Image Captioning [70.03979566548823]
We develop a rubric-based human evaluation protocol for image captioning models.
We show that human-generated captions are of substantially higher quality than machine-generated ones.
We hope that this work will promote a more transparent evaluation protocol for image captioning.
arXiv Detail & Related papers (2021-11-17T07:09:59Z)