Evaluating Large Language Model Creativity from a Literary Perspective
- URL: http://arxiv.org/abs/2312.03746v1
- Date: Thu, 30 Nov 2023 16:46:25 GMT
- Title: Evaluating Large Language Model Creativity from a Literary Perspective
- Authors: Murray Shanahan and Catherine Clarke
- Abstract summary: This paper assesses the potential for large language models to serve as assistive tools in the creative writing process.
We develop interactive and multi-voice prompting strategies that interleave background descriptions, instructions that guide composition, samples of text in the target style, and critical discussion of the given samples.
- Score: 13.672268920902187
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper assesses the potential for large language models (LLMs) to serve
as assistive tools in the creative writing process, by means of a single,
in-depth case study. In the course of the study, we develop interactive and
multi-voice prompting strategies that interleave background descriptions (scene
setting, plot elements), instructions that guide composition, samples of text
in the target style, and critical discussion of the given samples. We
qualitatively evaluate the results from a literary critical perspective, as
well as from the standpoint of computational creativity (a sub-field of
artificial intelligence). Our findings lend support to the view that the
sophistication of the results that can be achieved with an LLM mirrors the
sophistication of the prompting.
Related papers
- Computational Modeling of Artistic Inspiration: A Framework for Predicting Aesthetic Preferences in Lyrical Lines Using Linguistic and Stylistic Features [8.205321096201095]
Artistic inspiration plays a crucial role in producing works that resonate deeply with audiences.
This work proposes a novel framework for computationally modeling artistic preferences in different individuals.
Our framework outperforms an out-of-the-box LLaMA-3-70b, a state-of-the-art open-source language model, by nearly 18 points.
arXiv Detail & Related papers (2024-10-03T18:10:16Z) - Explaining Multi-modal Large Language Models by Analyzing their Vision Perception [4.597864989500202]
This research proposes a novel approach to enhance the interpretability of MLLMs by focusing on the image embedding component.
We combine an open-world localization model with an MLLM, thus creating a new architecture able to simultaneously produce text and object localization outputs from the same vision embedding.
arXiv Detail & Related papers (2024-05-23T14:24:23Z) - Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models [8.920436030483872]
We propose Writing Path, a framework that uses explicit outlines to guide Large Language Models (LLMs) in generating user-aligned text.
Our approach draws inspiration from structured writing planning and reasoning paths, focusing on capturing and reflecting user intentions throughout the writing process.
arXiv Detail & Related papers (2024-04-22T06:57:43Z) - Exploring Precision and Recall to assess the quality and diversity of LLMs [82.21278402856079]
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral.
This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora.
arXiv Detail & Related papers (2024-02-16T13:53:26Z) - Can Large Language Models Understand Context? [17.196362853457412]
This paper introduces a context understanding benchmark by adapting existing datasets to suit the evaluation of generative models.
Experimental results indicate that pre-trained dense models struggle with understanding more nuanced contextual features when compared to state-of-the-art fine-tuned models.
As LLM compression holds growing significance in both research and real-world applications, we assess the context understanding of quantized models under in-context-learning settings.
arXiv Detail & Related papers (2024-02-01T18:55:29Z) - Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z) - Multi-Dimensional Evaluation of Text Summarization with In-Context Learning [79.02280189976562]
In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning.
Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization.
We then analyze the effects of factors such as the selection and number of in-context examples on performance.
arXiv Detail & Related papers (2023-06-01T23:27:49Z) - OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models [122.27878464009181]
We conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks.
OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available.
arXiv Detail & Related papers (2023-05-13T11:28:37Z) - Large Language Models are Diverse Role-Players for Summarization Evaluation [82.31575622685902]
A document summary's quality can be assessed by human annotators on various criteria, both objective ones like grammar and correctness, and subjective ones like informativeness, succinctness, and appeal.
Most automatic evaluation methods, such as BLEU/ROUGE, may not be able to adequately capture the above dimensions.
We propose a new LLM-based evaluation framework that compares generated text and reference text from both objective and subjective aspects.
arXiv Detail & Related papers (2023-03-27T10:40:59Z) - Object Relational Graph with Teacher-Recommended Learning for Video Captioning [92.48299156867664]
We propose a complete video captioning system including both a novel model and an effective training strategy.
Specifically, we propose an object relational graph (ORG) based encoder, which captures more detailed interaction features to enrich visual representation.
Meanwhile, we design a teacher-recommended learning (TRL) method to make full use of the successful external language model (ELM) to integrate the abundant linguistic knowledge into the caption model.
arXiv Detail & Related papers (2020-02-26T15:34:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.