Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets
- URL: http://arxiv.org/abs/2406.18906v1
- Date: Thu, 27 Jun 2024 05:36:53 GMT
- Title: Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets
- Authors: Melanie Walsh, Anna Preus, Maria Antoniak
- Abstract summary: We evaluate how well large language models (LLMs) recognize a specific aspect of poetry, poetic form, for more than 20 forms and formal elements in the English language.
Our findings have implications for NLP researchers interested in model evaluation, digital humanities and cultural analytics scholars, and cultural heritage professionals.
- Score: 3.0040661953201475
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Large language models (LLMs) can now generate and recognize text in a wide range of styles and genres, including highly specialized, creative genres like poetry. But what do LLMs really know about poetry? What can they know about poetry? We develop a task to evaluate how well LLMs recognize a specific aspect of poetry, poetic form, for more than 20 forms and formal elements in the English language. Poetic form captures many different poetic features, including rhyme scheme, meter, and word or line repetition. We use this task to reflect on LLMs' current poetic capabilities, as well as the challenges and pitfalls of creating NLP benchmarks for poetry and for other creative tasks. In particular, we use this task to audit and reflect on the poems included in popular pretraining datasets. Our findings have implications for NLP researchers interested in model evaluation, digital humanities and cultural analytics scholars, and cultural heritage professionals.
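The abstract notes that poetic form captures features like rhyme scheme, meter, and repetition. As a rough illustration of what form recognition involves, the sketch below labels a stanza's rhyme scheme by comparing line-ending spellings. This is a hypothetical, simplified example, not the paper's method: a real evaluation would use phonemic data (e.g., a pronunciation dictionary), since English spelling is an unreliable proxy for rhyme.

```python
# Hypothetical sketch: assign rhyme-scheme letters (A, B, ...) by matching
# the final characters of each line's last word. Crude spelling-based
# heuristic only; not the evaluation method used in the paper.

def crude_rhyme_scheme(lines, suffix_len=2):
    """Label each line with a letter; lines whose final words share the
    last `suffix_len` characters get the same letter."""
    labels = []
    seen = {}  # suffix -> rhyme letter
    for line in lines:
        word = line.strip().split()[-1].lower().strip(".,;:!?")
        suffix = word[-suffix_len:]
        if suffix not in seen:
            seen[suffix] = chr(ord("A") + len(seen))
        labels.append(seen[suffix])
    return "".join(labels)

stanza = [
    "Shall I compare thee to a summer's day?",
    "Thou art more lovely and more temperate:",
    "Rough winds do shake the darling buds of May,",
    "And summer's lease hath all too short a date:",
]
print(crude_rhyme_scheme(stanza))  # ABAB for this quatrain
```

Even this toy version shows why the task is hard for models: eye rhymes, slant rhymes, and spelling-pronunciation mismatches all break naive heuristics.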
Related papers
- LFED: A Literary Fiction Evaluation Dataset for Large Language Models [58.85989777743013]
We collect 95 works of literary fiction that were either originally written in Chinese or translated into Chinese, covering a wide range of topics across several centuries.
We define a question taxonomy with 8 question categories to guide the creation of 1,304 questions.
We conduct an in-depth analysis to ascertain how specific attributes of literary fictions (e.g., novel types, character numbers, the year of publication) impact LLM performance in evaluations.
arXiv Detail & Related papers (2024-05-16T15:02:24Z)
- A Computational Approach to Style in American Poetry [19.41186389974801]
We develop a method to assess the style of American poems and to visualize a collection of poems in relation to one another.
Qualitative poetry criticism helped guide our development of metrics that analyze various orthographic, syntactic, and phonemic features.
Our method has potential applications to academic research of texts, to research of the intuitive personal response to poetry, and to making recommendations to readers based on their favorite poems.
arXiv Detail & Related papers (2023-10-13T18:49:14Z)
- PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation [58.36105306993046]
Controllable text generation is a challenging and meaningful field in natural language generation (NLG).
In this paper, we pioneer the use of the Diffusion model for generating sonnets and Chinese SongCi poetry.
Our model outperforms existing models in automatic evaluation of semantic, metrical, and overall performance as well as human evaluation.
arXiv Detail & Related papers (2023-06-14T11:57:31Z)
- Generation of Chinese classical poetry based on pre-trained model [1.6114012813668934]
This paper uses BART and other pre-trained models to generate metrical poetry. The authors developed a set of AI poetry Turing tests, which were reviewed by a group of poets and poetry-writing researchers. The poetry generation model studied by the authors produces works that cannot be distinguished from those of advanced scholars.
arXiv Detail & Related papers (2022-11-04T16:05:31Z)
- PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation [42.12348554537587]
Formal verse poetry imposes strict constraints on the meter and rhyme scheme of poems.
Most prior work on generating this type of poetry uses existing poems for supervision.
We propose an unsupervised approach to generate poems following any given meter and rhyme scheme.
arXiv Detail & Related papers (2022-05-24T17:09:55Z)
- Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution [74.27826764855911]
We employ syllabic quantity as a base for deriving rhythmic features for the task of computational authorship attribution of Latin prose texts.
Our experiments, carried out on three different datasets, using two different machine learning methods, show that rhythmic features based on syllabic quantity are beneficial in discriminating among Latin prose authors.
arXiv Detail & Related papers (2021-10-27T06:25:31Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Acrostic Poem Generation [26.604889384391726]
We propose a new task in the area of computational creativity: acrostic poem generation in English.
Acrostic poems are poems that contain a hidden message; typically, the first letter of each line spells out a word or short phrase.
Our experiments show that the acrostic poems generated by our baseline are received well by humans and do not lose much quality due to the additional constraints.
arXiv Detail & Related papers (2020-10-05T18:00:15Z)
- MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space [79.70053419040902]
We propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity.
Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training.
Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
arXiv Detail & Related papers (2020-03-13T03:31:29Z)
- Introducing Aspects of Creativity in Automatic Poetry Generation [2.792030485253753]
Poetry Generation involves teaching systems to automatically generate text that resembles poetic work.
A deep learning system can learn to generate poetry on its own by training on a corpus of poems and modeling the particular style of language.
We propose taking an approach that fine-tunes GPT-2, a pre-trained language model, to our downstream task of poetry generation.
arXiv Detail & Related papers (2020-02-06T20:44:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.