Related papers: Graphic Design with Large Multimodal Model

Graphic Design with Large Multimodal Model

URL: http://arxiv.org/abs/2404.14368v1
Date: Mon, 22 Apr 2024 17:20:38 GMT
Title: Graphic Design with Large Multimodal Model
Authors: Yutao Cheng, Zhao Zhang, Maoke Yang, Hui Nie, Chunyuan Li, Xinglong Wu, Jie Shao,
Abstract summary: Hierarchical Layout Generation (HLG) is a more flexible and pragmatic setup, which creates graphic composition from unordered sets of design elements. To tackle the HLG task, we introduce Graphist, the first layout generation model based on large multimodal models. Graphist efficiently reframes the HLG as a sequence generation problem, utilizing RGB-A images as input.
Score: 38.96206668552293
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In the field of graphic design, automating the integration of design elements into a cohesive multi-layered artwork not only boosts productivity but also paves the way for the democratization of graphic design. One existing practice is Graphic Layout Generation (GLG), which aims to layout sequential design elements. It has been constrained by the necessity for a predefined correct sequence of layers, thus limiting creative potential and increasing user workload. In this paper, we present Hierarchical Layout Generation (HLG) as a more flexible and pragmatic setup, which creates graphic composition from unordered sets of design elements. To tackle the HLG task, we introduce Graphist, the first layout generation model based on large multimodal models. Graphist efficiently reframes the HLG as a sequence generation problem, utilizing RGB-A images as input, outputs a JSON draft protocol, indicating the coordinates, size, and order of each element. We develop new evaluation metrics for HLG. Graphist outperforms prior arts and establishes a strong baseline for this field. Project homepage: https://github.com/graphic-design-ai/graphist

Related papers

IGD: Instructional Graphic Design with Multimodal Layer Generation [83.31320209596991]
Two-stage methods that rely primarily on layout generation lack creativity and intelligence, making graphic design still labor-intensive.<n>We propose instructional graphic designer (IGD) to swiftly generate multimodal layers with editable flexibility with only natural language instructions.
arXiv Detail & Related papers (2025-07-14T04:31:15Z)
Rethinking Layered Graphic Design Generation with a Top-Down Approach [76.33538798060326]
Graphic design is crucial for conveying ideas and messages. Designers usually organize their work into objects, backgrounds, and vectorized text layers to simplify editing.<n>With the rise of GenAI methods, an endless supply of high-quality graphic designs in pixel format has become more accessible.<n>Despite this, non-layered designs still inspire human designers, influencing their choices in layouts and text styles, ultimately guiding the creation of layered designs.<n>Motivated by this observation, we propose Accordion, a graphic design generation framework taking the first attempt to convert AI-generated designs into editable layered designs.
arXiv Detail & Related papers (2025-07-08T02:26:08Z)
CreatiPoster: Towards Editable and Controllable Multi-Layer Graphic Design Generation [13.354283356097563]
CreatiPoster is a framework that generates editable, multilayer compositions from optional natural-language instructions or assets.<n>To further research, we release a copyright-free corpus of 100,000 multi-layer designs.
arXiv Detail & Related papers (2025-06-12T16:54:39Z)
GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a VLM-based framework that generates content-aware text logo layouts. We introduce two model techniques to reduce the computation for processing multiple glyph images simultaneously. To support instruction-tuning of out model, we construct two extensive text logo datasets, which are 5x more larger than the existing public dataset.
arXiv Detail & Related papers (2024-11-18T10:04:10Z)
Large Generative Graph Models [74.58859158271169]
We propose a new class of graph generative model called Large Graph Generative Model (LGGM) The pre-trained LGGM has superior zero-shot generative capability to existing graph generative models. LGGM can be easily fine-tuned with graphs from target domains and demonstrate even better performance than those directly trained from scratch.
arXiv Detail & Related papers (2024-06-07T17:41:47Z)
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation. Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts. We conduct extensive experiments and achieved state-of-the-art (SOTA) performance on public multi-modal layout generation benchmarks.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design [39.809852329070466]
This paper introduces the COLE system - a hierarchical generation framework designed to address these challenges. This COLE system can transform a vague intention prompt into a high-quality multi-layered graphic design, while also supporting flexible editing based on user input.
arXiv Detail & Related papers (2023-11-28T17:22:17Z)
PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging spatial space on the given canvas for pre-defined elements. We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers. A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer [80.61492265221817]
Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Generative models emerge to make design automation scalable but it remains non-trivial to produce designs that comply with designers' desires.
arXiv Detail & Related papers (2022-12-19T21:57:35Z)
The Layout Generation Algorithm of Graphic Design Based on Transformer-CVAE [8.052709336750823]
This paper implemented the Transformer model and conditional variational autoencoder (CVAE) to the graphic design layout generation task. It proposed an end-to-end graphic design layout generation model named LayoutT-CVAE. Compared with the existing state-of-art models, the layout generated by ours performs better on many metrics.
arXiv Detail & Related papers (2021-10-08T13:36:02Z)
Constrained Graphic Layout Generation via Latent Optimization [17.05026043385661]
We generate graphic layouts that can flexibly incorporate design semantics, either specified implicitly or explicitly by a user. Our approach builds on a generative layout model based on a Transformer architecture, and formulates the layout generation as a constrained optimization problem. We show in the experiments that our approach is capable of generating realistic layouts in both constrained and unconstrained generation tasks with a single model.
arXiv Detail & Related papers (2021-08-02T13:04:11Z)
GRIDS: Interactive Layout Design with Integer Programming [25.88822318048848]
This paper proposes a novel optimisation approach for the generation of grid-based layouts. Our mixed integer linear programming (MILP) model offers a rigorous yet efficient method for grid generation. We present techniques for interactive diversification, enhancement, and completion of grid layouts.
arXiv Detail & Related papers (2020-01-09T11:08:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.