Automated Evaluation of Meter and Rhyme in Russian Generative and Human-Authored Poetry
- URL: http://arxiv.org/abs/2502.20931v1
- Date: Fri, 28 Feb 2025 10:39:07 GMT
- Title: Automated Evaluation of Meter and Rhyme in Russian Generative and Human-Authored Poetry
- Authors: Ilya Koziev
- Abstract summary: We introduce the Russian Poetry Scansion Tool library for stress mark placement in Russian-language poetry. We release RIFMA -- a dataset of poem fragments spanning various genres and forms, annotated with stress marks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generative poetry systems require effective tools for data engineering and automatic evaluation, particularly to assess how well a poem adheres to versification rules, such as the correct alternation of stressed and unstressed syllables and the presence of rhymes. In this work, we introduce the Russian Poetry Scansion Tool library designed for stress mark placement in Russian-language syllabo-tonic poetry, rhyme detection, and identification of defects of poeticness. Additionally, we release RIFMA -- a dataset of poem fragments spanning various genres and forms, annotated with stress marks. This dataset can be used to evaluate the capability of modern large language models to accurately place stress marks in poetic texts. The published resources provide valuable tools for researchers and practitioners in the field of creative generative AI, facilitating advancements in the development and evaluation of generative poetry systems.
Related papers
- Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets [3.0040661953201475]
Large language models (LLMs) can now generate and recognize poetry.
We develop a task to evaluate how well LLMs recognize one aspect of English-language poetry.
We show that state-of-the-art LLMs can successfully identify both common and uncommon fixed poetic forms.
arXiv Detail & Related papers (2024-06-27T05:36:53Z)
- GPT Czech Poet: Generation of Czech Poetic Strophes with Language Models [0.4444634303550442]
We introduce a new model for generating poetry in the Czech language, based on fine-tuning a pre-trained Large Language Model.
We demonstrate that guiding the generation process by explicitly specifying strophe parameters within the poem text strongly improves the effectiveness of the model.
arXiv Detail & Related papers (2024-06-18T06:19:45Z)
- Erato: Automatizing Poetry Evaluation [6.5990719141691825]
We present Erato, a framework designed to facilitate the automated evaluation of poetry.
Using Erato, we compare and contrast human-authored poetry with automatically generated poetry.
arXiv Detail & Related papers (2023-10-31T10:06:37Z)
- A Computational Approach to Style in American Poetry [19.41186389974801]
We develop a method to assess the style of American poems and to visualize a collection of poems in relation to one another.
Qualitative poetry criticism helped guide our development of metrics that analyze various orthographic, syntactic, and phonemic features.
Our method has potential applications to academic research of texts, to research of the intuitive personal response to poetry, and to making recommendations to readers based on their favorite poems.
arXiv Detail & Related papers (2023-10-13T18:49:14Z)
- Boosting Punctuation Restoration with Data Generation and Reinforcement Learning [70.26450819702728]
Punctuation restoration is an important task in automatic speech recognition (ASR).
The discrepancy between written punctuated texts and ASR texts limits the usability of written texts in training punctuation restoration systems for ASR texts.
This paper proposes a reinforcement learning method to exploit in-topic written texts and recent advances in large pre-trained generative language models to bridge this gap.
arXiv Detail & Related papers (2023-07-24T17:22:04Z)
- PoetryDiffusion: Towards Joint Semantic and Metrical Manipulation in Poetry Generation [58.36105306993046]
Controllable text generation is a challenging and meaningful field in natural language generation (NLG).
In this paper, we pioneer the use of the Diffusion model for generating sonnets and Chinese SongCi poetry.
Our model outperforms existing models in automatic evaluation of semantic, metrical, and overall performance as well as human evaluation.
arXiv Detail & Related papers (2023-06-14T11:57:31Z)
- PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation [42.12348554537587]
Formal verse poetry imposes strict constraints on the meter and rhyme scheme of poems.
Most prior work on generating this type of poetry uses existing poems for supervision.
We propose an unsupervised approach to generate poems following any given meter and rhyme scheme.
arXiv Detail & Related papers (2022-05-24T17:09:55Z)
- Digital Editions as Distant Supervision for Layout Analysis of Printed Books [76.29918490722902]
We describe methods for exploiting this semantic markup as distant supervision for training and evaluating layout analysis models.
In experiments with several model architectures on the half-million pages of the Deutsches Textarchiv (DTA), we find a high correlation of these region-level evaluation methods with pixel-level and word-level metrics.
We discuss the possibilities for improving accuracy with self-training and the ability of models trained on the DTA to generalize to other historical printed books.
arXiv Detail & Related papers (2021-12-23T16:51:53Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- Metrical Tagging in the Wild: Building and Annotating Poetry Corpora with Rhythmic Features [0.0]
We provide large poetry corpora for English and German, and annotate prosodic features in smaller corpora to train corpus-driven neural models.
We show that BiLSTM-CRF models with syllable embeddings outperform a CRF baseline and different BERT-based approaches.
arXiv Detail & Related papers (2021-02-17T16:38:57Z)
- Generating Major Types of Chinese Classical Poetry in a Uniformed Framework [88.57587722069239]
We propose a GPT-2 based framework for generating major types of Chinese classical poems.
Preliminary results show this enhanced model can generate Chinese classical poems of major types with high quality in both form and content.
arXiv Detail & Related papers (2020-03-13T14:16:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.