GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation Models
- URL: http://arxiv.org/abs/2510.20586v1
- Date: Thu, 23 Oct 2025 14:12:55 GMT
- Title: GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation Models
- Authors: Muhammad Atif Butt, Alexandra Gomez-Villa, Tao Wu, Javier Vazquez-Corral, Joost Van De Weijer, Kai Wang
- Abstract summary: We propose GenColorBench, the first comprehensive benchmark for text-to-image color generation. It is grounded in color systems like ISCC-NBS and CSS3/X11, including numerical colors which are absent elsewhere. With 44K color-focused prompts covering 400+ colors, it reveals models' true capabilities via perceptual and automated assessments.
- Score: 61.786094845872576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have seen impressive advances in text-to-image generation, with image generative or unified models producing high-quality images from text. Yet these models still struggle with fine-grained color controllability, often failing to accurately match colors specified in text prompts. While existing benchmarks evaluate compositional reasoning and prompt adherence, none systematically assess color precision. Color is fundamental to human visual perception and communication, critical for applications from art to design workflows requiring brand consistency. However, current benchmarks either neglect color or rely on coarse assessments, missing key capabilities such as interpreting RGB values or aligning with human expectations. To this end, we propose GenColorBench, the first comprehensive benchmark for text-to-image color generation, grounded in color systems like ISCC-NBS and CSS3/X11, including numerical colors which are absent elsewhere. With 44K color-focused prompts covering 400+ colors, it reveals models' true capabilities via perceptual and automated assessments. Evaluations of popular text-to-image models using GenColorBench show performance variations, highlighting which color conventions models understand best and identifying failure modes. Our GenColorBench assessments will guide improvements in precise color generation. The benchmark will be made public upon acceptance.
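As a rough illustration of what an automated, perceptually grounded color check might look like (this is an illustrative sketch, not the GenColorBench protocol), the code below scores pixels from a generated image against a prompted CSS3/X11 color using a CIE76 Delta-E in CIELAB. The three-entry color table, the assumption that pixels of the prompted object have already been masked out, and the pass threshold are all assumptions made for this example.

```python
# Illustrative sketch only: one way an automated color-accuracy check could work.
# It is NOT the GenColorBench protocol; the CSS3 subset, the masking assumption,
# and the Delta-E threshold below are assumptions made for this example.
import numpy as np

# Tiny subset of CSS3/X11 named colors (the benchmark covers 400+ colors).
CSS3_RGB = {
    "crimson":   (220, 20, 60),
    "teal":      (0, 128, 128),
    "goldenrod": (218, 165, 32),
}

def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 255] to CIELAB (D65 white point)."""
    rgb = np.asarray(rgb, dtype=np.float64) / 255.0
    rgb = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = rgb @ m.T / np.array([0.95047, 1.0, 1.08883])  # normalise by D65 white
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16.0 / 116.0)
    L = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def color_accuracy(object_pixels_rgb, target_name, threshold=20.0):
    """Mean CIE76 Delta-E between object pixels and the prompted color."""
    target_lab = srgb_to_lab(CSS3_RGB[target_name])
    pixel_lab = srgb_to_lab(object_pixels_rgb)
    delta_e = np.linalg.norm(pixel_lab - target_lab, axis=-1).mean()
    return delta_e, delta_e < threshold  # smaller Delta-E = closer match

# Example: pixels cropped from the prompted object in a generated image.
pixels = np.array([[210, 30, 70], [225, 15, 55], [200, 40, 80]])
print(color_accuracy(pixels, "crimson"))
```

A full benchmark would additionally need to localize the prompted object (for example with an open-vocabulary segmenter) and, as the abstract notes, complement such automated scores with perceptual assessments.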
Related papers
- ColorConceptBench: A Benchmark for Probabilistic Color-Concept Understanding in Text-to-Image Models [20.130253460357547]
We introduce ColorConceptBench, a new human-annotated benchmark to evaluate color-concept associations.
Our evaluation of seven leading text-to-image (T2I) models reveals that current models lack sensitivity to abstract semantics.
This demonstrates that achieving human-like color semantics requires more than larger models.
arXiv Detail & Related papers (2026-01-23T15:36:02Z)
- Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation [21.37070510103594]
Existing approaches rely on cross-attention manipulation, reference images, or fine-tuning to resolve ambiguous color descriptions.
We propose a training-free framework that enhances color fidelity by leveraging a large language model (LLM) to disambiguate color-related prompts.
Our method first employs the LLM to resolve ambiguous color terms in the text prompt, and then refines the text embeddings based on the spatial relationships of the resulting color terms.
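The disambiguation step can be pictured with the toy sketch below; the prompt template and the `call_llm` placeholder are hypothetical and are not taken from the paper.

```python
# Toy sketch of LLM-based color-term disambiguation; the prompt template and
# the `call_llm` placeholder below are hypothetical, not taken from the paper.
DISAMBIGUATION_PROMPT = (
    "Rewrite this image prompt so that every vague color word is replaced by a "
    "precise color name or hex code, changing nothing else.\n"
    "Prompt: {prompt}\nRewritten:"
)

def disambiguate_colors(prompt: str, call_llm) -> str:
    """Resolve ambiguous color terms before the prompt reaches the text encoder."""
    return call_llm(DISAMBIGUATION_PROMPT.format(prompt=prompt)).strip()

# `call_llm` can be any chat/completion client; a canned stub is used here.
stub = lambda _: "a vintage car painted salmon pink (#FA8072)"
print(disambiguate_colors("a salmon-colored vintage car", stub))
```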
arXiv Detail & Related papers (2025-09-12T08:44:22Z)
- Leveraging Semantic Attribute Binding for Free-Lunch Color Control in Diffusion Models [53.73253164099701]
We introduce ColorWave, a training-free approach that achieves exact RGB-level color control in diffusion models without fine-tuning.
We demonstrate that ColorWave establishes a new paradigm for structured, color-consistent diffusion-based image synthesis.
arXiv Detail & Related papers (2025-03-12T21:49:52Z)
- ColorFlow: Retrieval-Augmented Image Sequence Colorization [65.93834649502898]
We propose a three-stage diffusion-based framework tailored for image sequence colorization in industrial applications.
Unlike existing methods that require per-ID finetuning or explicit ID embedding extraction, we propose a novel Retrieval Augmented Colorization pipeline.
Our pipeline also features a dual-branch design: one branch for color identity extraction and the other for colorization.
arXiv Detail & Related papers (2024-12-16T14:32:49Z)
- Automatic Controllable Colorization via Imagination [55.489416987587305]
We propose a framework for automatic colorization that allows for iterative editing and modifications.
Our framework interprets the content of a grayscale image and uses a pre-trained image generation model to produce multiple images that contain the same content.
These images serve as references for coloring, mimicking the process of human experts.
arXiv Detail & Related papers (2024-04-08T16:46:07Z)
- Control Color: Multimodal Diffusion-based Interactive Image Colorization [81.68817300796644]
Control Color (Ctrl Color) is a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model.
We present an effective way to encode user strokes to enable precise local color manipulation.
We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring.
arXiv Detail & Related papers (2024-02-16T17:51:13Z)
- ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text [5.675944597452309]
We introduce two variations of an image-guided latent diffusion model utilizing different image tokens from the pre-trained CLIP image encoder.
We propose corresponding manipulation methods to adjust their results sequentially using weighted text inputs.
arXiv Detail & Related papers (2024-01-02T22:46:12Z)
- DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models [12.897939032560537]
We propose a new method called DiffColor to recover vivid colors conditioned on a text prompt.
We first fine-tune a pre-trained text-to-image model to generate colorized images using a CLIP-based contrastive loss.
We then obtain an optimized text embedding that aligns the colorized image with the text prompt, and a fine-tuned diffusion model that enables high-quality image reconstruction.
Our method can produce vivid and diverse colors with a few iterations, and keep the structure and background intact while having colors well-aligned with the target language guidance.
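A heavily simplified picture of the embedding-optimization step in this two-stage recipe is sketched below; the stand-in encoder, the dimensions, and the cosine loss are invented for illustration and are not DiffColor's actual components.

```python
# Extremely simplified sketch of optimising a text embedding so that a frozen
# image encoder's features align with it; the encoder here is a random stand-in,
# not CLIP, and the loss is only illustrative.
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Linear(3 * 32 * 32, 256).eval()  # stand-in for a frozen image tower
for p in image_encoder.parameters():
    p.requires_grad_(False)

colorized = torch.rand(1, 3, 32, 32)                       # stage-one output (placeholder)
img_feat = F.normalize(image_encoder(colorized.flatten(1)), dim=-1)

text_emb = torch.randn(1, 256, requires_grad=True)         # embedding to optimise
opt = torch.optim.Adam([text_emb], lr=1e-2)

for _ in range(100):
    # Pull the text embedding toward the (frozen) image features.
    loss = 1.0 - F.cosine_similarity(F.normalize(text_emb, dim=-1), img_feat).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```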
arXiv Detail & Related papers (2023-08-03T09:38:35Z)
- TIC: Text-Guided Image Colorization [24.317541784957285]
We propose a novel deep network that takes two inputs (the grayscale image and the respective encoded text description) and tries to predict the relevant color gamut.
As the respective textual descriptions contain color information of the objects present in the scene, the text encoding helps to improve the overall quality of the predicted colors.
We have evaluated our proposed model using different metrics and found that it outperforms the state-of-the-art colorization algorithms both qualitatively and quantitatively.
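To make the two-input idea concrete, here is a minimal PyTorch sketch of a network that fuses a grayscale image with a text embedding to predict chrominance; the layer sizes and fusion strategy are invented for illustration and do not reproduce the TIC architecture.

```python
# Minimal sketch of "grayscale image + text embedding -> predicted chrominance".
# All design choices below are assumptions made for this example.
import torch
import torch.nn as nn

class TextGuidedColorizer(nn.Module):
    def __init__(self, text_dim: int = 512):
        super().__init__()
        # Encode the single-channel (L) grayscale input into a feature map.
        self.image_encoder = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Project the text embedding so it can be broadcast over the feature map.
        self.text_proj = nn.Linear(text_dim, 64)
        # Predict the two chrominance channels (e.g. ab in CIELAB).
        self.head = nn.Conv2d(64, 2, 1)

    def forward(self, gray, text_emb):
        feats = self.image_encoder(gray)                   # (B, 64, H, W)
        cond = self.text_proj(text_emb)[:, :, None, None]  # (B, 64, 1, 1)
        return torch.tanh(self.head(feats + cond))         # (B, 2, H, W)

# Usage with dummy tensors: a 64x64 grayscale image and a 512-d text embedding.
model = TextGuidedColorizer()
ab = model(torch.rand(1, 1, 64, 64), torch.rand(1, 512))
print(ab.shape)  # torch.Size([1, 2, 64, 64])
```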
arXiv Detail & Related papers (2022-08-04T18:40:20Z)
- Colour alignment for relative colour constancy via non-standard references [11.92389176996629]
Relative colour constancy is an essential requirement for many scientific imaging applications.
We propose a colour alignment model that considers the camera image formation as a black-box.
It formulates colour alignment as a three-step process: camera response calibration, response linearisation, and colour matching.
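A toy version of the last two of these steps (response linearisation and colour matching) might look like the numpy sketch below; the fixed gamma and the plain least-squares 3x3 matrix are simplifying assumptions made for this example, not the paper's calibrated black-box model.

```python
# Toy sketch of response linearisation plus 3x3 colour matching.
# The fixed gamma and least-squares fit are simplifications for illustration.
import numpy as np

def linearise(rgb, gamma=2.2):
    """Undo a simple power-law camera response (the paper calibrates this step)."""
    return np.clip(rgb, 0.0, 1.0) ** gamma

def fit_colour_matrix(source_lin, reference_lin):
    """Least-squares 3x3 matrix mapping source colours onto reference colours."""
    # source_lin, reference_lin: (N, 3) linear RGB readings of the same patches.
    M, *_ = np.linalg.lstsq(source_lin, reference_lin, rcond=None)
    return M  # apply with: aligned = source_lin @ M

# Example: align one camera's patch readings to another's.
src = linearise(np.random.rand(24, 3))
ref = linearise(np.random.rand(24, 3))
M = fit_colour_matrix(src, ref)
print(np.abs(src @ M - ref).mean())
```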
arXiv Detail & Related papers (2021-12-30T15:58:55Z)
- Image Colorization: A Survey and Dataset [94.59768013860668]
This article presents a comprehensive survey of state-of-the-art deep learning-based image colorization techniques.
It categorizes the existing colorization techniques into seven classes and discusses important factors governing their performance.
We perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one.
arXiv Detail & Related papers (2020-08-25T01:22:52Z)