Evaluating Creative Short Story Generation in Humans and Large Language Models
- URL: http://arxiv.org/abs/2411.02316v2
- Date: Wed, 06 Nov 2024 23:27:24 GMT
- Title: Evaluating Creative Short Story Generation in Humans and Large Language Models
- Authors: Mete Ismayilzada, Claire Stevenson, Lonneke van der Plas
- Abstract summary: Large language models (LLMs) have recently demonstrated the ability to generate high-quality stories.
We conduct a systematic analysis of creativity in short story generation across LLMs and everyday people.
Our findings reveal that while LLMs can generate stylistically complex stories, they tend to fall short in terms of creativity when compared to average human writers.
- Score: 0.7965327033045846
- Abstract: Storytelling is a fundamental aspect of human communication, relying heavily on creativity to produce narratives that are novel, appropriate, and surprising. While large language models (LLMs) have recently demonstrated the ability to generate high-quality stories, their creative capabilities remain underexplored. Previous research has either focused on creativity tests requiring short responses or primarily compared model performance in story generation to that of professional writers. However, the question of whether LLMs exhibit creativity in writing short stories on par with the average human remains unanswered. In this work, we conduct a systematic analysis of creativity in short story generation across LLMs and everyday people. Using a five-sentence creative story task, commonly employed in psychology to assess human creativity, we automatically evaluate model- and human-generated stories across several dimensions of creativity, including novelty, surprise, and diversity. Our findings reveal that while LLMs can generate stylistically complex stories, they tend to fall short in terms of creativity when compared to average human writers.
Related papers
- Agents' Room: Narrative Generation through Multi-step Collaboration [54.98886593802834]
We propose a generation framework inspired by narrative theory that decomposes narrative writing into subtasks tackled by specialized agents.
We show that Agents' Room generates stories preferred by expert evaluators over those produced by baseline systems.
arXiv Detail & Related papers (2024-10-03T15:44:42Z)
- A Character-Centric Creative Story Generation via Imagination [15.345466372805516]
We introduce a novel story generation framework called CCI (Character-centric Creative story generation via Imagination).
CCI features two modules for creative story generation: IG (Image-Guided Imagination) and MW (Multi-Writer model).
In the IG module, we utilize a text-to-image model to create visual representations of key story elements, such as characters, backgrounds, and main plots.
The MW module uses these story elements to generate multiple persona-description candidates and selects the best one to insert into the story, thereby enhancing the richness and depth of the narrative.
arXiv Detail & Related papers (2024-09-25T06:54:29Z)
- Small Language Models can Outperform Humans in Short Creative Writing: A Study Comparing SLMs with Humans and LLMs [0.9831489366502301]
We evaluate the creative fiction writing abilities of a fine-tuned small language model (SLM), BART Large, and compare its performance to humans and two large language models (LLMs): GPT-3.5 and GPT-4o.
arXiv Detail & Related papers (2024-09-17T20:40:02Z)
- Are Large Language Models Capable of Generating Human-Level Narratives? [114.34140090869175]
This paper investigates the capability of LLMs in storytelling, focusing on narrative development and plot progression.
We introduce a novel computational framework to analyze narratives through three discourse-level aspects.
We show that explicit integration of discourse features can enhance storytelling, as is demonstrated by over 40% improvement in neural storytelling.
arXiv Detail & Related papers (2024-07-18T08:02:49Z)
- Measuring Psychological Depth in Language Models [50.48914935872879]
We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that measures an LLM's ability to produce authentic and narratively complex stories.
We empirically validate our framework by showing that humans can consistently evaluate stories based on PDS (0.72 Krippendorff's alpha).
Surprisingly, GPT-4 stories either surpassed or were statistically indistinguishable from highly-rated human-written stories sourced from Reddit.
arXiv Detail & Related papers (2024-06-18T14:51:54Z)
- Divergent Creativity in Humans and Large Language Models [37.67363469600804]
The recent surge in the capabilities of Large Language Models has led to claims that they are approaching a level of creativity akin to human capabilities.
We leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans.
arXiv Detail & Related papers (2024-05-13T22:37:52Z)
- Art or Artifice? Large Language Models and the False Promise of Creativity [53.04834589006685]
We propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product.
TTCW consists of 14 binary tests organized into the original dimensions of Fluency, Flexibility, Originality, and Elaboration.
Our analysis shows that LLM-generated stories pass 3-10X fewer TTCW tests than stories written by professionals.
arXiv Detail & Related papers (2023-09-25T22:02:46Z)
- On the Creativity of Large Language Models [2.4555276449137042]
Large Language Models (LLMs) are revolutionizing several areas of Artificial Intelligence.
This article first analyzes the development of LLMs under the lens of creativity theories.
Then, we consider different classic perspectives, namely product, process, press, and person.
Finally, we examine the societal impact of these technologies with a particular focus on the creative industries.
arXiv Detail & Related papers (2023-03-27T18:00:01Z)
- The Next Chapter: A Study of Large Language Models in Storytelling [51.338324023617034]
The application of prompt-based learning with large language models (LLMs) has exhibited remarkable performance in diverse natural language processing (NLP) tasks.
This paper conducts a comprehensive investigation, utilizing both automatic and human evaluation, to compare the story generation capacity of LLMs with recent models.
The results demonstrate that LLMs generate stories of significantly higher quality compared to other story generation models.
arXiv Detail & Related papers (2023-01-24T02:44:02Z)
- Computational Storytelling and Emotions: A Survey [56.95572957863576]
This survey paper is intended to summarize and contribute to the development of research being conducted on the relationship between stories and emotions.
We believe the goal of creativity research is not to replace humans with computers, but to find ways for humans and computers to collaborate and thereby enhance creativity.
arXiv Detail & Related papers (2022-05-23T00:21:59Z)
- Collaborative Storytelling with Large-scale Neural Language Models [6.0794985566317425]
We introduce the task of collaborative storytelling, where an artificial intelligence agent and a person collaborate to create a unique story by taking turns adding to it.
We present a collaborative storytelling system which works with a human storyteller to create a story by generating new utterances based on the story so far.
arXiv Detail & Related papers (2020-11-20T04:36:54Z)