Related papers: WordArt Designer: User-Driven Artistic Typography Synthesis using Large Language Models

URL: http://arxiv.org/abs/2310.18332v2
Date: Mon, 27 Nov 2023 04:22:54 GMT
Title: WordArt Designer: User-Driven Artistic Typography Synthesis using Large Language Models
Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Yusen Hu, Bin Luo, Yifeng Geng, Xuansong Xie and Jingren Zhou
Abstract summary: This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis. The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo modules. Notably, WordArt Designer highlights the fusion of generative AI with artistic typography.
Score: 43.68826200853858
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM). The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo modules. 1) The LLM Engine, empowered by the LLM (e.g., GPT-3.5), interprets user inputs and generates actionable prompts for the other modules, thereby transforming abstract concepts into tangible designs. 2) The SemTypo module optimizes font designs using semantic concepts, striking a balance between artistic transformation and readability. 3) Building on the semantic layout provided by the SemTypo module, the StyTypo module creates smooth, refined images. 4) The TexTypo module further enhances the design's aesthetics through texture rendering, enabling the generation of inventive textured fonts. Notably, WordArt Designer highlights the fusion of generative AI with artistic typography. Experience its capabilities on ModelScope: https://www.modelscope.cn/studios/WordArt/WordArt.
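The module ordering described in the abstract (LLM Engine, then SemTypo, StyTypo, and TexTypo) can be pictured as a simple sequential pipeline. The sketch below is only an illustration of that data flow, not the authors' implementation: the `Design` container, the `make_stage` helper, and the placeholder prompt strings are all hypothetical, and no real LLM or image model is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Design:
    """Hypothetical container for the state passed between modules."""
    user_input: str
    prompts: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


def llm_engine(design: Design) -> Design:
    # Placeholder: a real engine would query an LLM to interpret the input.
    # Here we only fabricate one prompt string per downstream module.
    for module in ("SemTypo", "StyTypo", "TexTypo"):
        design.prompts[module] = f"{module} prompt for '{design.user_input}'"
    design.history.append("LLM Engine")
    return design


def make_stage(name: str) -> Callable[[Design], Design]:
    """Build a stand-in stage that only records that it ran."""
    def stage(design: Design) -> Design:
        # Each downstream module depends on a prompt from the LLM Engine.
        if name not in design.prompts:
            raise ValueError(f"no prompt generated for {name}")
        design.history.append(name)
        return design
    return stage


# Order follows the abstract: semantic layout, then stylization, then texture.
PIPELINE = [
    llm_engine,
    make_stage("SemTypo"),
    make_stage("StyTypo"),
    make_stage("TexTypo"),
]


def run_pipeline(user_input: str) -> Design:
    design = Design(user_input=user_input)
    for stage in PIPELINE:
        design = stage(design)
    return design


result = run_pipeline("a cat made of leaves")
print(result.history)
```

Each stand-in stage refuses to run without a prompt, which mirrors the abstract's statement that the LLM Engine generates the prompts the other modules act on.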

Related papers

WordCraft: Interactive Artistic Typography with Attention Awareness and Noise Blending [12.655120187133779]
Artistic typography aims to stylize input characters with visual effects that are both creative and legible. Traditional approaches rely heavily on manual design, while recent generative models, particularly diffusion-based methods, have enabled automated character stylization. We introduce WordCraft, an interactive artistic typography system that integrates diffusion models to address these limitations.
arXiv Detail & Related papers (2025-07-13T10:49:09Z)
POSTA: A Go-to Framework for Customized Artistic Poster Generation [87.16343612086959]
POSTA is a modular framework for customized artistic poster generation. Background Diffusion creates a themed background based on user input. Design MLLM then generates layout and typography elements that align with and complement the background style. ArtText Diffusion applies additional stylization to key text elements.
arXiv Detail & Related papers (2025-03-19T05:22:38Z)
Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output. Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ. We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z)
ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction [32.48036808724505]
We propose a novel method called ArtAug for enhancing text-to-image models. In the interactions, we leverage human preferences implicitly learned by image understanding models to provide fine-grained suggestions. The enhancements brought by the interaction are iteratively fused into the synthesis model itself through an additional enhancement module. Various evaluation metrics consistently demonstrate that ArtAug enhances the generative capabilities of text-to-image models without incurring additional computational costs.
arXiv Detail & Related papers (2024-12-17T13:12:31Z)
GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a VLM-based framework that generates content-aware text logo layouts. We introduce two model techniques to reduce the computation for processing multiple glyph images simultaneously. To support instruction-tuning of our model, we construct two extensive text logo datasets, which are 5x larger than the existing public dataset.
arXiv Detail & Related papers (2024-11-18T10:04:10Z)
VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models [53.59400446543756]
We introduce a dual-branch and training-free method, namely VitaGlyph, to enable flexible artistic typography. VitaGlyph treats the input character as a scene composed of Subject and Surrounding, followed by rendering them under varying degrees of geometry transformation. Experimental results demonstrate that VitaGlyph not only achieves better artistry and readability, but also manages to depict multiple customized concepts.
arXiv Detail & Related papers (2024-10-02T16:48:47Z)
EATXT: A textual concrete syntax for EAST-ADL [5.34855193340848]
This paper introduces EATXT, an editor for automotive architecture modeling with EAST-ADL. The EATXT editor is based on Xtext and provides basic and advanced features, such as improved content-assist and serialization. We present the editor features and architecture, the implementation approach, and previous use of EATXT in research.
arXiv Detail & Related papers (2024-07-13T14:05:21Z)
MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively. Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z)
LASER: Tuning-Free LLM-Driven Attention Control for Efficient Text-conditioned Image-to-Animation [62.232361821779335]
We introduce a tuning-free attention control framework, encapsulated by the progressive process of prompt-Aware editing, StablE animation geneRation, abbreviated as LASER. We manipulate the model's spatial features and self-attention mechanisms to maintain animation integrity. Our meticulous control over spatial features and self-attention ensures structural consistency in the images.
arXiv Detail & Related papers (2024-04-21T07:13:56Z)
WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope [43.68826200853858]
This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope. We address the challenge of simplifying artistic typography for non-professionals by offering a dynamic, adaptive, and computationally efficient alternative to traditional rigid templates.
arXiv Detail & Related papers (2024-01-03T12:06:02Z)
GenText: Unsupervised Artistic Text Generation via Decoupled Font and Texture Manipulation [30.654807125764965]
We propose a novel approach, namely GenText, to achieve general artistic text style transfer. Specifically, our work incorporates three different stages, stylization, destylization, and font transfer. Considering the difficult data acquisition of paired artistic text images, our model is designed under the unsupervised setting.
arXiv Detail & Related papers (2022-07-20T04:42:47Z)
Representing ELMo embeddings as two-dimensional text online [5.1525653500591995]
We describe a new addition to the Web embeddings toolkit which is used to serve word embedding models over the Web. The new ELMoViz module adds support for contextualized embedding architectures, in particular for ELMo models. The provided visualizations follow the metaphor of 'two-dimensional text' by showing lexical substitutes: words which are most semantically similar in context to the words of the input sentence.
arXiv Detail & Related papers (2021-03-30T15:12:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.