TextLap: Customizing Language Models for Text-to-Layout Planning
- URL: http://arxiv.org/abs/2410.12844v1
- Date: Wed, 09 Oct 2024 19:51:38 GMT
- Title: TextLap: Customizing Language Models for Text-to-Layout Planning
- Authors: Jian Chen, Ruiyi Zhang, Yufan Zhou, Jennifer Healey, Jiuxiang Gu, Zhiqiang Xu, Changyou Chen
- Abstract summary: We call our method TextLap (text-based layout planning).
It uses a curated instruction-based layout planning dataset (InsLap) to customize large language models (LLMs) as a graphic designer.
We demonstrate the effectiveness of TextLap and show that it outperforms strong baselines, including GPT-4 based methods, for image generation and graphical design benchmarks.
- Score: 65.02105936609021
- Abstract: Automatic generation of graphical layouts is crucial for many real-world applications, including designing posters, flyers, advertisements, and graphical user interfaces. Given the incredible ability of Large language models (LLMs) in both natural language understanding and generation, we believe that we could customize an LLM to help people create compelling graphical layouts starting with only text instructions from the user. We call our method TextLap (text-based layout planning). It uses a curated instruction-based layout planning dataset (InsLap) to customize LLMs as a graphic designer. We demonstrate the effectiveness of TextLap and show that it outperforms strong baselines, including GPT-4 based methods, for image generation and graphical design benchmarks.
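The abstract describes fine-tuning an LLM on instruction-to-layout pairs. As a minimal sketch of what one such training example might look like, the snippet below pairs a text instruction with a serialized bounding-box layout; the field names, box convention, and serialization are assumptions, since the actual InsLap format is not specified here.

```python
import json

# A minimal sketch of what one InsLap-style training example might look like.
# The actual TextLap/InsLap serialization is not specified in the abstract;
# field names and the [x, y, width, height] box convention are assumptions.
example = {
    "instruction": (
        "Design a poster layout on a 512x512 canvas with a title at the top, "
        "a product image in the center, and a call-to-action button below it."
    ),
    # Target layout the LLM learns to emit as text: one (label, box) pair
    # per element, with boxes in pixels.
    "layout": [
        {"label": "title", "box": [64, 24, 384, 64]},
        {"label": "image", "box": [128, 128, 256, 256]},
        {"label": "button", "box": [176, 416, 160, 48]},
    ],
}

# Fine-tuning pairs would then be plain text: instruction in, layout out.
prompt = example["instruction"]
target = json.dumps(example["layout"])
print(prompt, "->", target)
```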
Related papers
- Customizing Large Language Model Generation Style using Parameter-Efficient Finetuning [24.263699489328427]
One-size-fits-all large language models (LLMs) are increasingly being used to help people with their writing.
This paper explores whether parameter-efficient finetuning (PEFT) with Low-Rank Adaptation can effectively guide the style of LLM generations.
arXiv Detail & Related papers (2024-09-06T19:25:18Z)
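The PEFT paper above tests whether Low-Rank Adaptation can steer generation style. A minimal sketch of that setup using the Hugging Face `peft` library follows; the base model and hyperparameters are placeholders, not the paper's settings.

```python
# Sketch of style-tuning with LoRA via the Hugging Face `peft` library.
# The base model and all hyperparameters below are illustrative only.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model
lora = LoraConfig(
    r=8,                        # low-rank dimension
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # GPT-2 attention projection; model-specific
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights are trainable
```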
- PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation.
Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts.
We conduct extensive experiments and achieve state-of-the-art (SOTA) performance on public multi-modal layout generation benchmarks.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
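PosterLLaVa's summary above says layouts are represented as structured text in JSON. A practical wrinkle with JSON-emitting models is validating the output; the sketch below parses and sanity-checks a layout under an assumed schema, not the paper's exact format.

```python
import json

# Generic validation for JSON layout output; the schema here (an "elements"
# list with "bbox" boxes) is an assumption, not PosterLLaVa's exact format.
def parse_layout(model_output: str) -> dict:
    layout = json.loads(model_output)  # raises ValueError on malformed JSON
    for el in layout["elements"]:
        x, y, w, h = el["bbox"]
        assert w > 0 and h > 0, f"degenerate box for {el.get('type')}"
    return layout

demo = '{"elements": [{"type": "title", "bbox": [40, 20, 400, 60]}]}'
print(parse_layout(demo))
```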
- Distilling Large Language Models for Text-Attributed Graph Learning [16.447635770220334]
Text-Attributed Graphs (TAGs) are graphs of connected textual documents.
Graph models can efficiently learn TAGs, but their training heavily relies on human-annotated labels.
Large language models (LLMs) have recently demonstrated remarkable capabilities in few-shot and zero-shot TAG learning.
arXiv Detail & Related papers (2024-02-19T10:31:53Z)
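The summary above leaves the distillation recipe unspecified; a common pattern is to have the LLM pseudo-label nodes of a text-attributed graph and then train a cheap graph model on those labels. The sketch below illustrates that generic pattern with a trivial stand-in for the LLM call.

```python
# Generic LLM-to-graph-model distillation pattern: an LLM pseudo-labels the
# nodes of a text-attributed graph from their text, and a small graph model
# then trains on those labels. Everything here is a generic illustration.

def llm_classify(text: str, labels: list[str]) -> str:
    # Stand-in for a zero-shot LLM call; here, a trivial keyword heuristic.
    return next((l for l in labels if l in text.lower()), labels[0])

nodes = {0: "A survey of databases", 1: "Deep networks for vision"}
edges = [(0, 1)]
labels = ["databases", "vision"]

# 1) The LLM produces pseudo-labels from each node's text attribute.
pseudo = {nid: llm_classify(text, labels) for nid, text in nodes.items()}
# 2) A GNN would then be trained on (nodes, edges, pseudo) in place of
#    human annotations; the training step itself is omitted here.
print(pseudo)
```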
- Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z)
- A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions [50.469491454128246]
We use text as the guidance to create graphic layouts, i.e., Text-to-Layout, aiming to lower the design barriers.
Text-to-Layout is a challenging task because it needs to consider the implicit, combined, and incomplete constraints from text.
We present a two-stage approach, named parse-then-place, to address this problem.
arXiv Detail & Related papers (2023-08-24T10:37:00Z)
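The two-stage idea above (first parse the text into intermediate constraints, then place elements that satisfy them) can be illustrated with a toy pipeline. The parser and placement heuristic below are deliberately crude stand-ins for the paper's learned components.

```python
import re

# Toy parse-then-place pipeline. Stage 1 parses the description into an
# intermediate constraint list; stage 2 places elements satisfying it.
# Real systems use a much richer intermediate representation and a learned
# placement model; this is only an illustration of the two-stage split.

def parse(description: str) -> list[str]:
    # Stage 1: extract element mentions as crude "constraints".
    return re.findall(r"(title|image|button|logo)", description.lower())

def place(elements: list[str], canvas=(512, 512)) -> dict[str, tuple]:
    # Stage 2: trivial top-to-bottom placement on the canvas.
    w, h = canvas
    row_h = h // max(len(elements), 1)
    return {el: (32, i * row_h + 16, w - 64, row_h - 32)
            for i, el in enumerate(elements)}

layout = place(parse("A poster with a title above an image and a button"))
print(layout)
```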
- Painter: Teaching Auto-regressive Language Models to Draw Sketches [5.3445140425713245]
We present Painter, an LLM that converts text prompts into sketches.
We create a dataset of diverse multi-object sketches paired with textual prompts.
Even though this is pioneering work in using LLMs for auto-regressive image generation, the results are very encouraging.
arXiv Detail & Related papers (2023-08-16T17:18:30Z)
- LayoutGPT: Compositional Visual Planning and Generation with Large Language Models [98.81962282674151]
Large Language Models (LLMs) can serve as visual planners by generating layouts from text conditions.
We propose LayoutGPT, a method to compose in-context visual demonstrations in style sheet language.
arXiv Detail & Related papers (2023-05-24T17:56:16Z)
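LayoutGPT's summary above mentions composing in-context demonstrations in a style sheet language. The sketch below serializes a layout into CSS-like rules that could be prepended to a prompt; treat the exact property set as an assumption rather than the paper's precise format.

```python
# Serialize a layout as CSS-like text for in-context prompting. The property
# set (left/top/width/height) is a plausible assumption, not necessarily
# LayoutGPT's precise serialization.

def to_css(elements: list[dict]) -> str:
    lines = []
    for el in elements:
        x, y, w, h = el["bbox"]
        lines.append(
            f"{el['name']} {{ left: {x}px; top: {y}px; "
            f"width: {w}px; height: {h}px; }}"
        )
    return "\n".join(lines)

demo = [
    {"name": "car",    "bbox": [20, 200, 120, 60]},
    {"name": "person", "bbox": [160, 180, 40, 80]},
]
# Such serialized demonstrations are prepended to the prompt so the LLM
# completes a new layout in the same style-sheet syntax.
print(to_css(demo))
```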
- A Picture is Worth a Thousand Words: Language Models Plan from Pixels [53.85753597586226]
Planning is an important capability of artificial agents that perform long-horizon tasks in real-world environments.
In this work, we explore the use of pre-trained language models (PLMs) to reason about plan sequences from text instructions in embodied visual environments.
arXiv Detail & Related papers (2023-03-16T02:02:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.