Lingxi: A Diversity-aware Chinese Modern Poetry Generation System
- URL: http://arxiv.org/abs/2108.12108v1
- Date: Fri, 27 Aug 2021 03:33:28 GMT
- Title: Lingxi: A Diversity-aware Chinese Modern Poetry Generation System
- Authors: Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li
- Abstract summary: Lingxi is a diversity-aware Chinese modern poetry generation system.
We propose the nucleus sampling with randomized head (NS-RH) algorithm.
We find that even when a large portion of the filtered vocabulary is randomized, the system can still generate fluent poetry.
- Score: 43.36560720793425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poetry generation has been a difficult task in natural language processing.
Unlike plain neural text generation tasks, poetry has a high requirement for
novelty, since an easily-understood sentence with too many high frequency words
might not be considered poetic, while adequately ambiguous sentences with
low frequency words can possibly be novel and creative. Inspired by this, we
present Lingxi, a diversity-aware Chinese modern poetry generation system. We
propose the nucleus sampling with randomized head (NS-RH) algorithm, which
randomizes the high frequency part ("head") of the predicted distribution in
order to emphasize the "comparatively low frequency" words. The proposed
algorithm can significantly increase the novelty of generated poetry compared
with traditional sampling methods. The permutation of the distribution is
controllable by tuning the filtering parameter that determines the "head" to
permute, achieving diversity-aware sampling. We find that even when a large
portion of the filtered vocabulary is randomized, the system can still generate
fluent poetry, but with notably higher novelty. We also propose a
semantic-similarity-based rejection sampling algorithm, which creates longer
and more informative context on the basis of the short input poetry title while
maintaining high semantic similarity to the title, alleviating the off-topic
issue.
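The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the nucleus is taken as the smallest set of top tokens whose cumulative probability reaches `p`, the "head" is taken as the top part of that nucleus whose renormalized mass reaches a filtering parameter `q`, and "randomizing" is taken to mean shuffling the head probabilities among the head tokens. The names `ns_rh_distribution`, `sample_ns_rh`, `rejection_sample`, `generate`, and `similarity` are hypothetical.

```python
import random


def ns_rh_distribution(probs, p=0.9, q=0.5, rng=random):
    """Return a sampling distribution with the tail removed and the head shuffled.

    Assumed reading of NS-RH: nucleus (top-p) filtering, then a random
    permutation of the probabilities inside the high-probability "head".
    """
    # Token indices sorted from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Step 1: nucleus filtering. Keep the smallest prefix with mass >= p.
    nucleus, total = [], 0.0
    for i in order:
        nucleus.append(i)
        total += probs[i]
        if total >= p:
            break
    z = sum(probs[i] for i in nucleus)
    dist = [0.0] * len(probs)
    for i in nucleus:
        dist[i] = probs[i] / z  # renormalize inside the nucleus

    # Step 2: the head is the prefix of the nucleus with mass >= q (assumption).
    head, total = [], 0.0
    for i in nucleus:
        head.append(i)
        total += dist[i]
        if total >= q:
            break

    # Step 3: shuffle the head probabilities among the head tokens.
    head_probs = [dist[i] for i in head]
    rng.shuffle(head_probs)
    for i, hp in zip(head, head_probs):
        dist[i] = hp
    return dist


def sample_ns_rh(probs, p=0.9, q=0.5, rng=random):
    """Draw one token index from the NS-RH distribution."""
    dist = ns_rh_distribution(probs, p, q, rng)
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]


def rejection_sample(title, generate, similarity, threshold=0.5, max_tries=20):
    """Accept the first generated context that is similar enough to the title.

    `generate(title)` and `similarity(title, text)` are hypothetical callables
    supplied by the caller; the best-scoring candidate is the fallback.
    """
    best, best_score = None, float("-inf")
    for _ in range(max_tries):
        candidate = generate(title)
        score = similarity(title, candidate)
        if score >= threshold:
            return candidate
        if score > best_score:
            best, best_score = candidate, score
    return best
```

Under these assumptions, a larger `q` puts more of the nucleus into the shuffled head, which is one way to read the abstract's statement that the permutation is controllable through the filtering parameter.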
Related papers
- Evaluating Diversity in Automatic Poetry Generation [25.53206868552533]
We evaluate the diversity of automatically generated poetry along structural, lexical, semantic and stylistic dimensions.
We find that current automatic poetry systems are considerably underdiverse along multiple dimensions.
Our identified limitations may serve as the basis for more genuinely diverse future poetry generation models.
arXiv Detail & Related papers (2024-06-21T16:03:21Z)
- On the Efficacy of Sampling Adapters [82.5941326570812]
We propose a unified framework for understanding sampling adapters.
We argue that the shift they enforce can be viewed as a trade-off between precision and recall.
We find that several precision-emphasizing measures indeed indicate that sampling adapters can lead to probability distributions more aligned with the true distribution.
arXiv Detail & Related papers (2023-07-07T17:59:12Z)
- PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation [58.36105306993046]
Controllable text generation is a challenging and meaningful field in natural language generation (NLG).
In this paper, we pioneer the use of the Diffusion model for generating sonnets and Chinese SongCi poetry.
Our model outperforms existing models in automatic evaluation of semantic, metrical, and overall performance as well as human evaluation.
arXiv Detail & Related papers (2023-06-14T11:57:31Z)
- Typical Decoding for Natural Language Generation [76.69397802617064]
We study why high-probability texts can be dull or repetitive.
We show that typical sampling offers competitive performance in terms of quality.
arXiv Detail & Related papers (2022-02-01T18:58:45Z)
- A pattern recognition approach for distinguishing between prose and poetry [0.8971132850029492]
We propose an automated method to distinguish between poetry and prose based solely on aural and rhythmic properties.
The classification of the considered texts using the set of features extracted resulted in a best accuracy of 0.78, obtained with a neural network.
arXiv Detail & Related papers (2021-07-18T18:44:17Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Improving Diversity of Neural Text Generation via Inverse Probability Weighting [43.36560720793425]
We propose a sampling method inspired by inverse probability weighting.
We show that the high-probability "head" of the predicted distribution might contain tedious or even repetitive candidates that lead to repetition loops.
Results show that our algorithm can effectively increase the diversity of generated samples while achieving close resemblance to human text.
arXiv Detail & Related papers (2021-03-13T08:17:40Z)
- There Once Was a Really Bad Poet, It Was Automated but You Didn't Know It [31.951441780084345]
We introduce LimGen, a novel and fully automated system for limerick generation.
LimGen outperforms state-of-the-art neural network-based poetry models.
The resulting limericks satisfy poetic constraints and have thematically coherent storylines.
arXiv Detail & Related papers (2021-03-05T16:03:55Z)
- Compose Like Humans: Jointly Improving the Coherence and Novelty for Modern Chinese Poetry Generation [13.709648635080828]
We propose a generate-retrieve-then-refine paradigm to jointly improve the coherence and novelty.
Experimental results on a collected large-scale modern Chinese poetry dataset show that our proposed approach can not only generate more coherent poems, but also improve the diversity and novelty.
arXiv Detail & Related papers (2020-05-04T15:16:10Z)
- MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space [79.70053419040902]
We propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity.
Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training.
Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
arXiv Detail & Related papers (2020-03-13T03:31:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.