Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry
- URL: http://arxiv.org/abs/2407.06196v1
- Date: Sat, 15 Jun 2024 19:45:08 GMT
- Title: Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry
- Authors: Jing Jiang, Yiran Ling, Binzhu Li, Pengxiang Li, Junming Piao, Yu Zhang
- Abstract summary: Poetry2Image is an iterative correction framework for images generated from Chinese classical poetry.
The proposed method achieves an average element completeness of 70.63%, representing an improvement of 25.56% over direct image generation.
- Score: 7.536700229966157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image generation models often struggle with key element loss or semantic confusion in tasks involving Chinese classical poetry. Addressing this issue by fine-tuning models incurs considerable training costs, and manual prompt adjustments for re-diffusion require professional knowledge. To solve this problem, we propose Poetry2Image, an iterative correction framework for images generated from Chinese classical poetry. Utilizing an external poetry dataset, Poetry2Image establishes an automated feedback and correction loop, which enhances the alignment between poetry and image through image generation models and subsequent re-diffusion modifications suggested by large language models (LLMs). Using a test set of 200 sentences of Chinese classical poetry, the proposed method--when integrated with five popular image generation models--achieves an average element completeness of 70.63%, representing an improvement of 25.56% over direct image generation. In tests of semantic correctness, our method attains an average semantic consistency of 80.09%. The study not only promotes the dissemination of ancient poetry culture but also offers a reference for similar non-fine-tuning methods to enhance LLM generation.
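The abstract describes an automated feedback and correction loop: generate an image, check it against the poem's key elements, and re-diffuse with an LLM-suggested prompt revision until the elements are covered. A minimal sketch of that loop is below; `generate_image`, `detect_elements`, and `revise_prompt` are hypothetical stand-ins for the image generation model, the element check, and the LLM suggestion step, none of which are specified in this summary.

```python
def extract_elements(poem: str) -> set[str]:
    # Stand-in for looking up the poem's key elements via an
    # external poetry dataset, as the abstract describes.
    return set(poem.split())

def correction_loop(poem, generate_image, detect_elements, revise_prompt,
                    max_rounds=3):
    """Iteratively regenerate until the image covers the poem's elements."""
    required = extract_elements(poem)
    prompt = poem
    image = generate_image(prompt)
    for _ in range(max_rounds):
        missing = required - detect_elements(image)
        if not missing:
            break  # all key elements are present; stop re-diffusing
        prompt = revise_prompt(prompt, missing)  # LLM-suggested revision
        image = generate_image(prompt)
    # Element completeness: fraction of required elements in the final image.
    completeness = len(required & detect_elements(image)) / len(required)
    return image, completeness

# Toy models to exercise the loop: the "image" is just the set of words
# the generator happened to render.
calls = []
def toy_generate(prompt):
    calls.append(prompt)
    rendered = set(prompt.split())
    if len(calls) == 1:
        rendered.discard("bridge")  # first attempt drops a key element
    return rendered

def toy_detect(image):
    return image

def toy_revise(prompt, missing):
    return prompt + " " + " ".join(sorted(missing))

image, completeness = correction_loop(
    "moon bridge river", toy_generate, toy_detect, toy_revise)
```

In this toy run the first generation misses "bridge", one correction round restores it, and the loop exits with full element completeness after two generator calls.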
Related papers
- Poetry in Pixels: Prompt Tuning for Poem Image Generation via Diffusion Models [18.293592213622183]
We propose a PoemToPixel framework designed to generate images that visually represent the inherent meanings of poems.
Our approach incorporates the concept of prompt tuning in our image generation framework to ensure that the resulting images closely align with the poetic content.
To expand the diversity of the poetry dataset, we introduce MiniPo, a novel multimodal dataset comprising 1001 children's poems and images.
arXiv Detail & Related papers (2025-01-10T10:26:54Z)
- Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time [35.71703501731082]
We introduce a novel type of feedback -- caption reformulations -- and train models to mimic reformulation feedback based on human annotations.
Our method does not require training the image captioning model itself, thereby demanding substantially less computational effort.
We find that incorporating reformulation models trained on this data into the inference phase of existing image captioning models results in improved captions.
arXiv Detail & Related papers (2025-01-08T14:00:07Z)
- Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation [70.95783968368124]
We introduce a novel multi-modal autoregressive model, dubbed InstaManip.
We propose an innovative group self-attention mechanism to break down the in-context learning process into two separate stages.
Our method surpasses previous few-shot image manipulation models by a notable margin.
arXiv Detail & Related papers (2024-12-02T01:19:21Z)
- Information Theoretic Text-to-Image Alignment [49.396917351264655]
Mutual Information (MI) is used to guide model alignment.
Our method uses self-supervised fine-tuning and relies on a point-wise (MI) estimation between prompts and images.
Our analysis indicates that our method is superior to the state-of-the-art, yet it only requires the pre-trained denoising network of the T2I model itself to estimate MI.
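This summary says alignment is guided by a point-wise mutual information estimate between prompts and images. As a generic illustration only (the toy probabilities below are made up and have nothing to do with the paper's denoising-network estimator), point-wise MI is:

```python
import math

def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Point-wise mutual information: log p(x, y) / (p(x) * p(y)).
    Positive when x and y co-occur more often than independence predicts."""
    return math.log(p_xy / (p_x * p_y))

# A prompt-image pair that co-occurs more than chance scores positive;
# an exactly independent pair scores zero.
aligned = pmi(0.2, 0.4, 0.4)       # joint 0.2 > 0.4 * 0.4 = 0.16
independent = pmi(0.16, 0.4, 0.4)  # joint equals the product
```

A higher PMI for a prompt-image pair is what such a method would treat as better alignment.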
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
- Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack [75.00066365801993]
Training text-to-image models with web scale image-text pairs enables the generation of a wide range of visual concepts from text.
These pre-trained models often face challenges when it comes to generating highly aesthetic images.
We propose quality-tuning to guide a pre-trained model to exclusively generate highly visually appealing images.
arXiv Detail & Related papers (2023-09-27T17:30:19Z)
- Prose2Poem: The Blessing of Transformers in Translating Prose to Persian Poetry [2.15242029196761]
We introduce a novel Neural Machine Translation (NMT) approach to translate prose to ancient Persian poetry.
We trained a Transformer model from scratch to obtain initial translations and pretrained different variations of BERT to obtain final translations.
arXiv Detail & Related papers (2021-09-30T09:04:11Z)
- Caption Enriched Samples for Improving Hateful Memes Detection [78.5136090997431]
The hateful meme challenge demonstrates the difficulty of determining whether a meme is hateful or not.
Both unimodal language models and multimodal vision-language models cannot reach the human level of performance.
arXiv Detail & Related papers (2021-09-22T10:57:51Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Generating Chinese Poetry from Images via Concrete and Abstract Information [23.690384629376005]
We propose an infilling-based Chinese poetry generation model which can infill the Concrete keywords into each line of poems in an explicit way.
We also use non-parallel data during training and construct separate image datasets and poem datasets to train the different components in our framework.
Both automatic and human evaluation results show that our approach can generate poems which have better consistency with images without losing the quality.
arXiv Detail & Related papers (2020-03-24T11:17:20Z)
- Generating Major Types of Chinese Classical Poetry in a Uniformed Framework [88.57587722069239]
We propose a GPT-2 based framework for generating major types of Chinese classical poems.
Preliminary results show this enhanced model can generate Chinese classical poems of major types with high quality in both form and content.
arXiv Detail & Related papers (2020-03-13T14:16:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.