Generating Chinese Poetry from Images via Concrete and Abstract
Information
- URL: http://arxiv.org/abs/2003.10773v1
- Date: Tue, 24 Mar 2020 11:17:20 GMT
- Title: Generating Chinese Poetry from Images via Concrete and Abstract
Information
- Authors: Yusen Liu, Dayiheng Liu, Jiancheng Lv, Yongsheng Sang
- Abstract summary: We propose an infilling-based Chinese poetry generation model which can infill the Concrete keywords into each line of poems in an explicit way.
We also use non-parallel data during training and construct separate image datasets and poem datasets to train the different components in our framework.
Both automatic and human evaluation results show that our approach can generate poems that are more consistent with the images without losing quality.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, the automatic generation of classical Chinese poetry has
made great progress. Beyond improving the quality of the generated poetry, a
newer line of work generates poetry from an image. However, existing methods
for this task still suffer from topic drift and semantic inconsistency, and an
image-poem pair dataset is hard to build for training such models. In this
paper, we extract and integrate Concrete and Abstract information from images
to address these issues. We propose an infilling-based Chinese poetry
generation model that infills the Concrete keywords into each line of the poem
in an explicit way, and an abstract information embedding that integrates the
Abstract information into the generated poems. In addition, we use
non-parallel data during training and construct separate image datasets and
poem datasets to train the different components of our framework. Both
automatic and human evaluation results show that our approach generates poems
that are more consistent with the images without losing quality.
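The explicit infilling idea in the abstract can be illustrated with a toy sketch. This is a hypothetical simplification, not the paper's neural model: each poem line is built from a template with a slot, and a Concrete keyword extracted from the image is infilled into that slot, so the keyword is guaranteed to appear verbatim in the line. The templates, slot marker, and keywords below are all invented for illustration.

```python
SLOT = "▁"  # placeholder marking where the keyword goes

def infill_line(template: str, keyword: str) -> str:
    """Replace the slot marker in a line template with a concrete keyword."""
    if SLOT not in template:
        raise ValueError("template has no slot to infill")
    return template.replace(SLOT, keyword, 1)

def generate_poem(templates, keywords):
    """Pair each line template with one image keyword and infill explicitly."""
    if len(templates) != len(keywords):
        raise ValueError("need exactly one keyword per line")
    return [infill_line(t, k) for t, k in zip(templates, keywords)]

# Hypothetical four-line skeleton with one slot per line, and four Concrete
# keywords such as an image keyword extractor might return.
templates = ["孤▁立江边", "远望▁色新", "风吹▁影动", "月照▁无声"]
keywords = ["舟", "山", "柳", "水"]
poem = generate_poem(templates, keywords)
```

Because the keyword is placed by construction rather than hoped for during free generation, this kind of scheme avoids topic drift at the level of individual lines.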
Related papers
- Semi-supervised Chinese Poem-to-Painting Generation via Cycle-consistent Adversarial Networks [2.250406890348191]
We propose a semi-supervised approach using cycle-consistent adversarial networks to leverage the limited paired data.
We introduce novel evaluation metrics to assess the quality, diversity, and consistency of the generated poems and paintings.
The proposed model outperforms previous methods, showing promise in capturing the symbolic essence of artistic expression.
arXiv Detail & Related papers (2024-10-25T04:57:44Z)
- Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry [7.536700229966157]
Poetry2Image is an iterative correction framework for images generated from Chinese classical poetry.
The proposed method achieves an average element completeness of 70.63%, representing an improvement of 25.56% over direct image generation.
arXiv Detail & Related papers (2024-06-15T19:45:08Z)
- Towards Retrieval-Augmented Architectures for Image Captioning [81.11529834508424]
This work presents a novel approach towards developing image captioning models that utilize an external kNN memory to improve the generation process.
Specifically, we propose two model variants that incorporate a knowledge retriever component that is based on visual similarities.
We experimentally validate our approach on COCO and nocaps datasets and demonstrate that incorporating an explicit external memory can significantly enhance the quality of captions.
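The kNN-memory idea above can be sketched minimally. This is a hypothetical simplification of a retrieval component keyed on visual similarity, not the paper's actual architecture: given a query image feature, retrieve the captions attached to the k most similar memory images; a real model would condition its decoder on them. The 2-D feature vectors and captions are invented stand-ins for real image embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def knn_retrieve(query_feat, memory, k=2):
    """memory: list of (image_feature, caption) pairs.
    Return the captions of the k most visually similar memory images."""
    ranked = sorted(memory, key=lambda item: cosine(query_feat, item[0]),
                    reverse=True)
    return [caption for _, caption in ranked[:k]]

# Hypothetical 2-D features standing in for real CNN/ViT embeddings.
memory = [
    ((1.0, 0.0), "a dog running on grass"),
    ((0.9, 0.1), "a puppy playing fetch"),
    ((0.0, 1.0), "a bowl of ramen"),
]
retrieved = knn_retrieve((0.95, 0.05), memory, k=2)
```

At scale, the linear scan would be replaced by an approximate nearest-neighbor index, but the retrieval contract is the same.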
arXiv Detail & Related papers (2024-05-21T18:02:07Z)
- Training-Free Consistent Text-to-Image Generation [80.4814768762066]
Text-to-image models struggle to portray the same subject consistently across diverse prompts.
Existing approaches fine-tune the model to teach it new words that describe specific user-provided subjects.
We present ConsiStory, a training-free approach that enables consistent subject generation by sharing the internal activations of the pretrained model.
arXiv Detail & Related papers (2024-02-05T18:42:34Z)
- Visual Storytelling with Question-Answer Plans [70.89011289754863]
We present a novel framework which integrates visual representations with pretrained language models and planning.
Our model translates the image sequence into a visual prefix, a sequence of continuous embeddings which language models can interpret.
It also leverages a sequence of question-answer pairs as a blueprint plan for selecting salient visual concepts and determining how they should be assembled into a narrative.
arXiv Detail & Related papers (2023-10-08T21:45:34Z)
- Language Does More Than Describe: On The Lack Of Figurative Speech in Text-To-Image Models [63.545146807810305]
Text-to-image diffusion models can generate high-quality pictures from textual input prompts.
These models have been trained using text data collected from content-based labelling protocols.
We characterise the sentimentality, objectiveness and degree of abstraction of publicly available text data used to train current text-to-image diffusion models.
arXiv Detail & Related papers (2022-10-19T14:20:05Z)
- Zero-shot Sonnet Generation with Discourse-level Planning and Aesthetics Features [37.45490765899826]
We present a novel framework to generate sonnets that does not require training on poems.
Specifically, a content planning module is trained on non-poetic texts to obtain discourse-level coherence.
We also design a constrained decoding algorithm to impose the meter-and-rhyme constraint of the generated sonnets.
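A constrained decoding step of the kind described above can be sketched as a toy filter. This is a hedged analogue, not the paper's actual algorithm: candidate line endings proposed by a language model are pruned so that only words in the required rhyme class survive, and the highest-scoring survivor is emitted. The rhyme table, scores, and words are all invented for illustration.

```python
RHYME_TABLE = {  # hypothetical word -> rhyme-class map
    "bright": "ight", "night": "ight", "light": "ight",
    "day": "ay", "way": "ay",
}

def pick_rhyming_ending(candidates, rhyme_class):
    """candidates: list of (model_score, word) pairs.
    Enforce the rhyme constraint, then pick the most probable survivor."""
    valid = [(score, word) for score, word in candidates
             if RHYME_TABLE.get(word) == rhyme_class]
    if not valid:
        raise ValueError("no candidate satisfies the rhyme constraint")
    return max(valid)[1]  # tuples compare by score first

# "day" has the highest model score but violates the rhyme class,
# so the constraint forces the decoder onto "night".
candidates = [(0.50, "day"), (0.30, "night"), (0.20, "light")]
ending = pick_rhyming_ending(candidates, "ight")
```

A full meter-and-rhyme constraint would also check syllable or character counts per line, but the hard-filter-then-rank shape is the same.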
arXiv Detail & Related papers (2022-05-03T23:44:28Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning [50.08729005865331]
This paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework.
To capture the correlations between the image and text at multiple levels of abstraction, we design a variational inference network.
To guide the paragraph generation, the learned hierarchical topics and visual features are integrated into the language model.
arXiv Detail & Related papers (2021-05-10T06:55:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.