Advancing Generative Model Evaluation: A Novel Algorithm for Realistic
Image Synthesis and Comparison in OCR System
- URL: http://arxiv.org/abs/2402.17204v3
- Date: Fri, 1 Mar 2024 21:02:29 GMT
- Title: Advancing Generative Model Evaluation: A Novel Algorithm for Realistic
Image Synthesis and Comparison in OCR System
- Authors: Majid Memari, Khaled R. Ahmed, Shahram Rahimi, Noorbakhsh Amiri
Golilarz
- Abstract summary: This research addresses a critical challenge in the field of generative models, particularly in the generation and evaluation of synthetic images.
We introduce a pioneering algorithm to objectively assess the realism of synthetic images.
Our algorithm is particularly tailored to address the challenges in generating and evaluating realistic images of Arabic handwritten digits.
- Score: 1.2289361708127877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This research addresses a critical challenge in the field of generative
models, particularly in the generation and evaluation of synthetic images.
Given the inherent complexity of generative models and the absence of a
standardized procedure for their comparison, our study introduces a pioneering
algorithm to objectively assess the realism of synthetic images. This approach
significantly enhances the evaluation methodology by refining the Fr\'echet
Inception Distance (FID) score, allowing for a more precise and objective
assessment of image quality. Our algorithm is particularly tailored to address
the challenges in generating and evaluating realistic images of Arabic
handwritten digits, a task that has traditionally been near-impossible due to
the subjective nature of realism in image generation. By providing a systematic
and objective framework, our method not only enables the comparison of
different generative models but also paves the way for improvements in their
design and output. This breakthrough in evaluation and comparison is crucial
for advancing the field of OCR, especially for scripts that present unique
complexities, and sets a new standard in the generation and assessment of
high-quality synthetic images.
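The paper's contribution builds on the Fréchet Inception Distance. As a reference point, the standard FID between two sets of image features (e.g., Inception activations) can be sketched as below; this is a minimal illustration of the baseline metric using numpy/scipy, not the authors' refined algorithm, and the function names are ours.

```python
import numpy as np
from scipy import linalg


def frechet_distance(mu1, cov1, mu2, cov2):
    """Frechet distance between Gaussians N(mu1, cov1) and N(mu2, cov2):
    ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 * sqrtm(cov1 @ cov2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(cov1 @ cov2)
    if np.iscomplexobj(covmean):
        # matrix square root can pick up tiny imaginary parts from numerical noise
        covmean = covmean.real
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))


def fid_from_features(feats_real, feats_fake):
    """FID from two (n_samples, n_features) feature matrices."""
    mu1, cov1 = feats_real.mean(axis=0), np.cov(feats_real, rowvar=False)
    mu2, cov2 = feats_fake.mean(axis=0), np.cov(feats_fake, rowvar=False)
    return frechet_distance(mu1, cov1, mu2, cov2)
```

Identical feature distributions give an FID near zero, and the score grows as the synthetic features drift from the real ones, which is what makes it usable as a realism ranking across generative models.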
Related papers
- RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection [60.960988614701414]
RIGID is a training-free and model-agnostic method for robust AI-generated image detection.
RIGID significantly outperforms existing training-based and training-free detectors.
arXiv Detail & Related papers (2024-05-30T14:49:54Z)
- A Survey on Quality Metrics for Text-to-Image Models [9.753473063305503]
We provide an overview of existing text-to-image quality metrics addressing their nuances and the need for alignment with human preferences.
We propose a new taxonomy for categorizing these metrics, which is grounded in the assumption that there are two main quality criteria, namely compositionality and generality.
We derive guidelines for practitioners conducting text-to-image evaluation, discuss open challenges of evaluation mechanisms, and surface limitations of current metrics.
arXiv Detail & Related papers (2024-03-18T14:24:20Z)
- Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis [22.550416199280953]
We present an empirical study introducing a nuanced evaluation framework for text-to-image (T2I) generative models.
Our framework categorizes evaluations into two distinct groups: first, focusing on image qualities such as aesthetics and realism, and second, examining text conditions through concept coverage and fairness.
We will release our code, the data used for evaluating generative models and the dataset annotated with defective areas soon.
arXiv Detail & Related papers (2024-03-08T07:41:47Z)
- Improving Synthetically Generated Image Detection in Cross-Concept Settings [20.21594285488186]
We focus on the challenge of generalizing across different concept classes, e.g., when training a detector on human faces.
We propose an approach based on the premise that the robustness of the detector can be enhanced by training it on realistic synthetic images.
arXiv Detail & Related papers (2023-04-24T12:45:00Z)
- IRGen: Generative Modeling for Image Retrieval [82.62022344988993]
In this paper, we present a novel methodology, reframing image retrieval as a variant of generative modeling.
We develop our model, dubbed IRGen, to address the technical challenge of converting an image into a concise sequence of semantic units.
Our model achieves state-of-the-art performance on three widely-used image retrieval benchmarks and two million-scale datasets.
arXiv Detail & Related papers (2023-03-17T17:07:36Z)
- A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly.
Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z)
- Image Quality Assessment in the Modern Age [53.19271326110551]
This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA).
We will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli.
Both hand-engineered and (deep) learning-based methods will be covered.
arXiv Detail & Related papers (2021-10-19T02:38:46Z)
- Identity-Aware CycleGAN for Face Photo-Sketch Synthesis and Recognition [61.87842307164351]
We first propose an Identity-Aware CycleGAN (IACycleGAN) model that applies a new perceptual loss to supervise the image generation network.
It improves CycleGAN on photo-sketch synthesis by paying more attention to the synthesis of key facial regions, such as eyes and nose.
We develop a mutual optimization procedure between the synthesis model and the recognition model, which iteratively synthesizes better images by IACycleGAN.
arXiv Detail & Related papers (2021-03-30T01:30:08Z)
- Adversarial Text-to-Image Synthesis: A Review [7.593633267653624]
We contextualize the state of the art of adversarial text-to-image synthesis models, their development since their inception five years ago, and propose a taxonomy based on the level of supervision.
We critically examine current strategies to evaluate text-to-image synthesis models, highlight shortcomings, and identify new areas of research, ranging from the development of better datasets and evaluation metrics to possible improvements in architectural design and model training.
This review complements previous surveys on generative adversarial networks with a focus on text-to-image synthesis, which we believe will help researchers to further advance the field.
arXiv Detail & Related papers (2021-01-25T09:58:36Z)
- NPRportrait 1.0: A Three-Level Benchmark for Non-Photorealistic Rendering of Portraits [67.58044348082944]
This paper proposes a new structured, three-level benchmark dataset for the evaluation of stylised portrait images.
Rigorous criteria were used for its construction, and its consistency was validated by user studies.
A new methodology has been developed for evaluating portrait stylisation algorithms.
arXiv Detail & Related papers (2020-09-01T18:04:19Z)