Evaluation Metrics for Automated Typographic Poster Generation
- URL: http://arxiv.org/abs/2402.06945v1
- Date: Sat, 10 Feb 2024 13:18:10 GMT
- Title: Evaluation Metrics for Automated Typographic Poster Generation
- Authors: Sérgio M. Rebelo, J. J. Merelo, João Bicker, Penousal Machado
- Abstract summary: We propose a set of heuristic metrics for typographic design evaluation, focusing on legibility.
We also integrate emotion recognition to identify text semantics automatically and analyse the performance of the approach.
- Score: 0.24578723416255752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational Design approaches facilitate the generation of typographic
design, but evaluating these designs remains a challenging task. In this paper,
we propose a set of heuristic metrics for typographic design evaluation,
focusing on legibility, which assesses text visibility; aesthetics, which
evaluates the visual quality of the design; and semantic features, which
estimate how effectively the design conveys the content's semantics. We
experiment with a constrained evolutionary approach for generating typographic
posters, incorporating the proposed evaluation metrics with varied setups, and
treating the legibility metrics as constraints. We also integrate emotion
recognition to identify text semantics automatically and analyse the
performance of the approach and the visual characteristics of the outputs.
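The constrained evolutionary setup described in the abstract can be sketched roughly as follows. Everything here is a hypothetical placeholder, not the paper's implementation: the design parameters, the metric formulas, the 0.5 legibility threshold, and the equal aesthetics/semantics weights are all invented for illustration; the key idea shown is treating legibility as a hard constraint that ranks infeasible designs below all feasible ones.

```python
import random

# A "design" is just a dict of typographic parameters here; these
# names and metric formulas are illustrative stand-ins, not the
# paper's actual representation or heuristics.
def random_design(rng):
    return {
        "font_size": rng.uniform(8, 72),
        "contrast": rng.uniform(0.5, 1.0),
        "alignment": rng.random(),
    }

def legibility(d):
    # Proxy: larger, higher-contrast text is easier to read.
    return min(d["font_size"] / 72.0, 1.0) * d["contrast"]

def aesthetics(d):
    # Proxy: prefer balanced layouts (alignment near 0.5).
    return 1.0 - abs(d["alignment"] - 0.5) * 2.0

def semantics(d):
    # Placeholder for the emotion/content-match metric.
    return d["contrast"]

LEGIBILITY_THRESHOLD = 0.5  # assumed constraint level

def fitness(d):
    # Legibility acts as a hard constraint: designs that violate it
    # rank strictly below every feasible design.
    if legibility(d) < LEGIBILITY_THRESHOLD:
        return -1.0
    return 0.5 * aesthetics(d) + 0.5 * semantics(d)

def mutate(d, rng):
    # Placeholder variation: a real operator would perturb
    # typographic choices (font, size, layout, colour).
    child = dict(d)
    key = rng.choice(sorted(child))
    child[key] *= rng.uniform(0.9, 1.1)
    return child

def evolve(rng, pop_size=20, generations=40):
    population = [random_design(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(s, rng) for s in survivors]
    return max(population, key=fitness)

best = evolve(random.Random(0))
print(legibility(best) >= LEGIBILITY_THRESHOLD)  # True
```

Because survivors carry over unchanged each generation, the best fitness is non-decreasing, so the final design satisfies the legibility constraint whenever any feasible design appears during the run.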
Related papers
- Towards More Accurate Personalized Image Generation: Addressing Overfitting and Evaluation Bias [52.590072198551944]
The aim of image personalization is to create images based on a user-provided subject.
Current methods face challenges in ensuring fidelity to the text prompt.
We introduce a novel training pipeline that incorporates an attractor to filter out distractions in training images.
arXiv Detail & Related papers (2025-03-09T14:14:02Z) - DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models [115.62816053600085]
We present DesignDiffusion, a framework for synthesizing design images from textual descriptions.
The proposed framework directly synthesizes textual and visual design elements from user prompts.
It utilizes a distinctive character embedding derived from the visual text to enhance the input prompt.
arXiv Detail & Related papers (2025-03-03T15:22:57Z) - Explaining Automatic Image Assessment [2.8084422332394428]
Our proposed approach attempts to explain aesthetic assessment models through visualizing dataset trends and automatic categorization of visual aesthetic features.
By evaluating the models adapted to each specific modality using existing and novel metrics, we can capture and visualize aesthetic features and trends.
arXiv Detail & Related papers (2025-02-03T22:55:14Z) - PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides [51.88536367177796]
We propose a two-stage, edit-based approach inspired by human drafts for automatically generating presentations.
PPTAgent first analyzes references to extract slide-level functional types and content schemas, then generates editing actions based on selected reference slides.
PPTAgent significantly outperforms existing automatic presentation generation methods across all three dimensions.
arXiv Detail & Related papers (2025-01-07T16:53:01Z) - HMGIE: Hierarchical and Multi-Grained Inconsistency Evaluation for Vision-Language Data Cleansing [54.970275599061594]
We design an adaptive evaluation framework, called Hierarchical and Multi-Grained Inconsistency Evaluation (HMGIE)
HMGIE can provide multi-grained evaluations covering both accuracy and completeness for various image-caption pairs.
To verify the efficacy and flexibility of the proposed framework, we construct MVTID, an image-caption dataset with diverse types and granularities of inconsistencies.
arXiv Detail & Related papers (2024-12-07T15:47:49Z) - Design-o-meter: Towards Evaluating and Refining Graphic Designs [11.416650723712968]
We introduce Design-o-meter, a data-driven methodology to quantify the goodness of graphic designs.
To the best of our knowledge, Design-o-meter is the first approach that scores and refines designs in a unified framework.
arXiv Detail & Related papers (2024-11-22T14:17:46Z) - TypeScore: A Text Fidelity Metric for Text-to-Image Generative Models [39.06617653124486]
We introduce a new evaluation framework called TypeScore to assess a model's ability to generate images with high-fidelity embedded text.
Our proposed metric demonstrates greater resolution than CLIPScore to differentiate popular image generation models.
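The blurb above does not give TypeScore's formulation, but the general idea of scoring embedded-text fidelity can be sketched as recovering the rendered text from the image and comparing it to the intended text. In this sketch the OCR step is stubbed out and `SequenceMatcher` is an assumed stand-in for whatever matching function TypeScore actually uses:

```python
from difflib import SequenceMatcher

def text_fidelity(intended: str, rendered: str) -> float:
    """Similarity between the prompt's intended text and the text
    recovered from the generated image (e.g. via an OCR model).
    SequenceMatcher is a simple stand-in for whatever edit-distance
    or matching function the real metric uses."""
    return SequenceMatcher(None, intended.lower(), rendered.lower()).ratio()

# In a real pipeline the second argument would come from running OCR
# on the generated image; here we compare strings directly.
print(text_fidelity("Grand Opening Sale", "Grand Opening Sale"))  # 1.0
print(text_fidelity("Grand Opening Sale", "Grnd Openig Sale") > 0.8)  # True
```

A metric like this can separate models that render text almost correctly from those that garble it entirely, which a global image-text embedding score may blur together.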
arXiv Detail & Related papers (2024-11-02T07:56:54Z) - signwriting-evaluation: Effective Sign Language Evaluation via SignWriting [3.484261625026626]
This paper introduces a comprehensive suite of evaluation metrics specifically designed for SignWriting.
We address the challenges of evaluating single signs versus continuous signing.
Our findings reveal the strengths and limitations of each metric, offering valuable insights for future advancements.
arXiv Detail & Related papers (2024-10-17T15:28:45Z) - KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities [93.74881034001312]
We conduct a systematic study on the fidelity of entities in text-to-image generation models.
We focus on their ability to generate a wide range of real-world visual entities, such as landmark buildings, aircraft, plants, and animals.
Our findings reveal that even the most advanced text-to-image models often fail to generate entities with accurate visual details.
arXiv Detail & Related papers (2024-10-15T17:50:37Z) - MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement.
A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively.
Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z) - The Cognitive Type Project -- Mapping Typography to Cognition [1.0878040851638]
The Cognitive Type Project is focused on developing computational tools to enable the design of typefaces with varying cognitive properties.
This initiative aims to empower typographers to craft fonts that enhance click-through rates for online ads, improve reading levels in children's books, and enable dyslexics to create personalized type.
arXiv Detail & Related papers (2024-03-06T22:32:49Z) - Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction [27.00018283430169]
This paper presents VisCE^2, a vision language model-based caption evaluation method.
Our method focuses on visual context, which refers to the detailed content of images, including objects, attributes, and relationships.
arXiv Detail & Related papers (2024-02-28T01:29:36Z) - X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance [70.08635216710967]
X-Mesh is a text-driven 3D stylization framework that incorporates a novel Text-guided Dynamic Attention Module.
We introduce a new standard text-mesh benchmark, MIT-30, and two automated metrics, which will enable future research to achieve fair and objective comparisons.
arXiv Detail & Related papers (2023-03-28T06:45:31Z) - Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models [48.77653835765705]
We introduce a probabilistic resolution to prompt tuning, where the label-specific prompts are generated hierarchically by first sampling a latent vector from an underlying distribution and then employing a lightweight generative model.
We evaluate the effectiveness of our approach on four tasks: few-shot image recognition, base-to-new generalization, dataset transfer learning, and domain shifts.
arXiv Detail & Related papers (2023-03-16T06:09:15Z) - Composition and Style Attributes Guided Image Aesthetic Assessment [66.60253358722538]
We propose a method for the automatic prediction of the aesthetics of an image.
The proposed network includes a pre-trained network for semantic feature extraction (the Backbone) and a Multi-Layer Perceptron (MLP) network that relies on the Backbone features to predict image attributes (the AttributeNet).
Given an image, the proposed multi-network is able to predict: style and composition attributes, and aesthetic score distribution.
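The Backbone-plus-heads structure described above can be sketched abstractly. The layer sizes, the four attribute outputs, and the 1-10 rating scale below are invented for illustration, and random weights stand in for a trained model; the point is the shape of the multi-output design: one shared feature vector feeding an attribute head and a score-distribution head.

```python
import math
import random

random.seed(0)

def rand_weights(n_in, n_out):
    # n_out weight vectors, each of length n_in (untrained placeholders).
    return [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]

def mlp(x, w1, w2):
    # One-hidden-layer perceptron with ReLU, standing in for an MLP head.
    hidden = [max(sum(xi * w for xi, w in zip(x, ws)), 0.0) for ws in w1]
    return [sum(h * w for h, w in zip(hidden, ws)) for ws in w2]

# Pretend the frozen Backbone already produced a 16-d semantic feature vector.
features = [random.gauss(0, 1) for _ in range(16)]

# Head 1: style/composition attribute scores (4 invented attributes).
attributes = mlp(features, rand_weights(16, 8), rand_weights(8, 4))

# Head 2: aesthetic score distribution over a 1-10 rating scale (softmax).
logits = mlp(features, rand_weights(16, 8), rand_weights(8, 10))
peak = max(logits)
exps = [math.exp(v - peak) for v in logits]
score_dist = [e / sum(exps) for e in exps]
mean_score = sum(p * s for p, s in zip(score_dist, range(1, 11)))

print(len(attributes), len(score_dist))  # 4 10
```

Predicting a full score distribution rather than a single scalar lets the model express how much raters would disagree about an image, with the mean of the distribution recoverable as a point estimate.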
arXiv Detail & Related papers (2021-11-08T17:16:38Z) - Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning [50.08729005865331]
This paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework.
To capture the correlations between the image and text at multiple levels of abstraction, we design a variational inference network.
To guide the paragraph generation, the learned hierarchical topics and visual features are integrated into the language model.
arXiv Detail & Related papers (2021-05-10T06:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.