DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design
- URL: http://arxiv.org/abs/2310.15144v1
- Date: Mon, 23 Oct 2023 17:48:38 GMT
- Title: DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design
- Authors: Kevin Lin, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Lijuan Wang
- Abstract summary: We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored for visual design scenarios.
For DEsignBench benchmarking, we perform human evaluations on generated images against the criteria of image-text alignment, visual aesthetic, and design creativity.
In addition to human evaluations, we introduce the first automatic image generation evaluator powered by GPT-4V.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce DEsignBench, a text-to-image (T2I) generation benchmark tailored
for visual design scenarios. Recent T2I models such as DALL-E 3 have
demonstrated remarkable capabilities in generating photorealistic images that
align closely with textual inputs. While the allure of creating visually
captivating images is undeniable, our emphasis extends beyond mere aesthetic
pleasure. We aim to investigate the potential of using these powerful models in
authentic design contexts. In pursuit of this goal, we develop DEsignBench,
which incorporates test samples designed to assess T2I models on both "design
technical capability" and "design application scenario." Each of these two
dimensions is supported by a diverse set of specific design categories. We
explore DALL-E 3 together with other leading T2I models on DEsignBench,
resulting in a comprehensive visual gallery for side-by-side comparisons. For
DEsignBench benchmarking, we perform human evaluations on generated images in
the DEsignBench gallery against the criteria of image-text alignment, visual
aesthetic, and design creativity. Our evaluation also considers other
specialized design capabilities, including text rendering, layout composition,
color harmony, 3D design, and medium style. In addition to human evaluations,
we introduce the first automatic image generation evaluator powered by GPT-4V.
This evaluator provides ratings that align well with human judgments, while
being easily replicable and cost-efficient. A high-resolution version is
available at
https://github.com/design-bench/design-bench.github.io/raw/main/designbench.pdf?download=
Related papers
- Inkspire: Supporting Design Exploration with Generative AI through Analogical Sketching
Inkspire is a sketch-driven tool that supports designers in prototyping product design concepts.
In a study comparing Inkspire to ControlNet, we found that Inkspire supported designers with more inspiration and exploration of design ideas.
arXiv Detail & Related papers (2025-01-30T18:59:04Z)
- IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
Text-to-image (T2I) models have made significant progress, showcasing impressive abilities in prompt following and image generation.
Recent models such as FLUX.1 and Ideogram2.0 have demonstrated exceptional performance across various complex tasks.
This study provides valuable insights into the current state and future trajectory of T2I models as they evolve towards general-purpose usability.
arXiv Detail & Related papers (2025-01-23T18:58:33Z)
- Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
We introduce Sketch2Code, a benchmark that evaluates state-of-the-art Vision Language Models (VLMs) on automating the conversion of rudimentary sketches into webpage prototypes.
We analyze ten commercial and open-source models, showing that Sketch2Code is challenging for existing VLMs.
A user study with UI/UX experts reveals a significant preference for proactive question-asking over passive feedback reception.
arXiv Detail & Related papers (2024-10-21T17:39:49Z)
- KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities
We conduct a systematic study on the fidelity of entities in text-to-image generation models.
We focus on their ability to generate a wide range of real-world visual entities, such as landmark buildings, aircraft, plants, and animals.
Our findings reveal that even the most advanced text-to-image models often fail to generate entities with accurate visual details.
arXiv Detail & Related papers (2024-10-15T17:50:37Z)
- ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
ConceptMix is a scalable, controllable, and customizable benchmark.
It automatically evaluates compositional generation ability of Text-to-Image (T2I) models.
It reveals that the performance of several models, especially open models, drops dramatically as the number of combined concepts (k) increases.
arXiv Detail & Related papers (2024-08-26T15:08:12Z)
- I-Design: Personalized LLM Interior Designer
I-Design is a personalized interior designer that allows users to generate and visualize their design goals through natural language communication.
I-Design starts with a team of large language model agents that engage in dialogues and logical reasoning with one another.
The final design is then constructed in 3D by retrieving and integrating assets from an existing object database.
arXiv Detail & Related papers (2024-04-03T16:17:53Z)
- HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models
HRS-Bench is an evaluation benchmark for Text-to-Image (T2I) models.
It measures 13 skills that can be categorized into five major categories: accuracy, robustness, generalization, fairness, and bias.
It covers 50 scenarios, including fashion, animals, transportation, food, and clothes.
arXiv Detail & Related papers (2023-04-11T17:59:13Z)
- Evaluation of Sketch-Based and Semantic-Based Modalities for Mockup Generation
Design mockups are essential instruments for visualizing and testing design ideas.
We present and evaluate two different modalities for generating mockups based on hand-drawn sketches.
Our results show that sketch-based generation was more intuitive and expressive, while semantic-based generative AI obtained better results in terms of quality and fidelity.
arXiv Detail & Related papers (2023-03-22T16:47:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.