Understanding Literary Texts by LLMs: A Case Study of Ancient Chinese Poetry
- URL: http://arxiv.org/abs/2409.00060v2
- Date: Wed, 11 Sep 2024 07:51:43 GMT
- Title: Understanding Literary Texts by LLMs: A Case Study of Ancient Chinese Poetry
- Authors: Cheng Zhao, Bin Wang, Zhen Wang
- Abstract summary: In genres such as poetry, jokes, and short stories, numerous AI tools have emerged, offering refreshing new perspectives.
Evaluating literary works is often complex and hard to fully quantify, which directly hinders the further development of AI creation.
This paper attempts to explore the mysteries of literary texts from the perspective of large language models.
- Score: 9.970908656435066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The birth and rapid development of large language models (LLMs) have caused quite a stir in the field of literature. Once considered unattainable, AI's role in literary creation is increasingly becoming a reality. In genres such as poetry, jokes, and short stories, numerous AI tools have emerged, offering refreshing new perspectives. However, it is difficult to further improve the quality of these works. This is primarily because understanding and appreciating a good literary work involves a considerable threshold, such as knowledge of literary theory, aesthetic sensibility, and interdisciplinary knowledge. As a result, authoritative data in this area is scarce. Additionally, evaluating literary works is often complex and hard to fully quantify, which directly hinders the further development of AI creation. To address this issue, this paper attempts to explore the mysteries of literary texts from the perspective of LLMs, using ancient Chinese poetry as an example for experimentation. First, we collected a variety of ancient poems from different sources and had experts annotate a small portion of them. Then, we designed a range of comprehension metrics based on LLMs to evaluate all these poems. Finally, we analyzed the correlations and differences between various poem collections to identify literary patterns. Through our experiments, we observed a series of enlightening phenomena that provide technical support for the future development of high-level literary creation based on LLMs.
Related papers
- A Perspective on Literary Metaphor in the Context of Generative AI [0.6445605125467572]
This study explores the role of literary metaphor and its capacity to generate a range of meanings.
To investigate whether the inclusion of original figurative language improves textual quality, we trained an LSTM-based language model in Afrikaans.
The paper raises thought-provoking questions on aesthetic value, interpretation and evaluation.
arXiv Detail & Related papers (2024-09-02T08:27:29Z)
- Are Large Language Models Capable of Generating Human-Level Narratives? [114.34140090869175]
This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression.
We introduce a novel computational framework to analyze narratives through three discourse-level aspects.
We show that explicit integration of discourse features can enhance storytelling, as is demonstrated by over 40% improvement in neural storytelling.
arXiv Detail & Related papers (2024-07-18T08:02:49Z)
- Sonnet or Not, Bot? Poetry Evaluation for Large Models and Datasets [3.0040661953201475]
Large language models (LLMs) can now generate and recognize poetry.
We develop a task to evaluate how well LLMs recognize one aspect of English-language poetry.
We show that state-of-the-art LLMs can successfully identify both common and uncommon fixed poetic forms.
arXiv Detail & Related papers (2024-06-27T05:36:53Z)
- (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts [52.18246881218829]
We introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents.
To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).
arXiv Detail & Related papers (2024-05-20T05:55:08Z)
- LFED: A Literary Fiction Evaluation Dataset for Large Language Models [58.85989777743013]
We collect 95 literary fictions that are either originally written in Chinese or translated into Chinese, covering a wide range of topics across several centuries.
We define a question taxonomy with 8 question categories to guide the creation of 1,304 questions.
We conduct an in-depth analysis to ascertain how specific attributes of literary fictions (e.g., novel types, character numbers, the year of publication) impact LLM performance in evaluations.
arXiv Detail & Related papers (2024-05-16T15:02:24Z)
- Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models [52.24001776263608]
This comprehensive survey delves into the recent strides in hate speech (HS) moderation.
We highlight the burgeoning role of large language models (LLMs) and large multimodal models (LMMs).
We identify existing gaps in research, particularly in the context of underrepresented languages and cultures.
arXiv Detail & Related papers (2024-01-30T03:51:44Z)
- A Computational Approach to Style in American Poetry [19.41186389974801]
We develop a method to assess the style of American poems and to visualize a collection of poems in relation to one another.
Qualitative poetry criticism helped guide our development of metrics that analyze various orthographic, syntactic, and phonemic features.
Our method has potential applications to academic research of texts, to research of the intuitive personal response to poetry, and to making recommendations to readers based on their favorite poems.
arXiv Detail & Related papers (2023-10-13T18:49:14Z)
- PoeticTTS -- Controllable Poetry Reading for Literary Studies [21.29478270833139]
We resynthesise poems by cloning prosodic values from a human reference recitation, and afterwards make use of fine-grained prosody control to manipulate the synthetic speech.
We find that finetuning our TTS model on poetry captures poetic intonation patterns to a large extent, which is beneficial for prosody cloning and manipulation.
arXiv Detail & Related papers (2022-07-11T13:15:27Z)
- RELIC: Retrieving Evidence for Literary Claims [29.762552250403544]
We use a large-scale dataset of 78K literary quotations to formulate the novel task of literary evidence retrieval.
We implement a RoBERTa-based dense passage retriever for this task that outperforms existing pretrained information retrieval baselines.
arXiv Detail & Related papers (2022-03-18T16:56:08Z)
- CCPM: A Chinese Classical Poetry Matching Dataset [50.90794811956129]
We propose a novel task to assess a model's semantic understanding of poetry by poem matching.
This task requires the model to select one line of Chinese classical poetry among four candidates according to the modern Chinese translation of a line of poetry.
To construct this dataset, we first obtain a set of parallel data of Chinese classical poetry and modern Chinese translation.
arXiv Detail & Related papers (2021-06-03T16:49:03Z)
- MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space [79.70053419040902]
We propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity.
Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training.
Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
arXiv Detail & Related papers (2020-03-13T03:31:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.