STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story
Generation
- URL: http://arxiv.org/abs/2010.01717v1
- Date: Sun, 4 Oct 2020 23:26:09 GMT
- Authors: Nader Akoury, Shufan Wang, Josh Whiting, Stephen Hood, Nanyun Peng,
Mohit Iyyer
- Abstract summary: We introduce a dataset and evaluation platform built from STORIUM, an online collaborative storytelling community.
Our dataset contains 6K lengthy stories with fine-grained natural language annotations interspersed throughout each narrative.
We evaluate language models fine-tuned on our dataset by integrating them onto STORIUM, where real authors can query a model for suggested story continuations and then edit them.
- Score: 48.56586847883825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Systems for story generation are asked to produce plausible and enjoyable
stories given an input context. This task is underspecified, as a vast number
of diverse stories can originate from a single input. The large output space
makes it difficult to build and evaluate story generation models, as (1)
existing datasets lack rich enough contexts to meaningfully guide models, and
(2) existing evaluations (both crowdsourced and automatic) are unreliable for
assessing long-form creative text. To address these issues, we introduce a
dataset and evaluation platform built from STORIUM, an online collaborative
storytelling community. Our author-generated dataset contains 6K lengthy
stories (125M tokens) with fine-grained natural language annotations (e.g.,
character goals and attributes) interspersed throughout each narrative, forming
a robust source for guiding models. We evaluate language models fine-tuned on
our dataset by integrating them onto STORIUM, where real authors can query a
model for suggested story continuations and then edit them. Automatic metrics
computed over these edits correlate well with both user ratings of generated
stories and qualitative feedback from semi-structured user interviews. We
release both the STORIUM dataset and evaluation platform to spur more
principled research into story generation.
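As a concrete illustration of the evaluation idea, the sketch below scores a model suggestion by how much of it survives the author's edit, using token-level matching blocks from Python's difflib. This is a minimal sketch of one plausible edit-based metric, not necessarily the exact metric the paper computes over STORIUM edits; the function names and whitespace tokenization are assumptions.

```python
# Illustrative sketch: an edit-survival metric over (generated, user-edited)
# continuations. It approximates scoring a model by how much of its
# suggestion an author keeps; NOT necessarily the paper's exact metric.
from difflib import SequenceMatcher

def tokenize(text: str) -> list[str]:
    # Simple whitespace tokenization; the paper's tokenizer may differ.
    return text.lower().split()

def edit_survival(generated: str, edited: str) -> float:
    """Fraction of generated tokens preserved, in order, in the user's
    edit, computed from difflib's matching blocks."""
    gen, ed = tokenize(generated), tokenize(edited)
    if not gen:
        return 0.0
    matcher = SequenceMatcher(a=gen, b=ed, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(gen)

if __name__ == "__main__":
    suggestion = "The knight drew her sword and stepped into the cavern."
    user_edit = "The knight drew her blade and crept into the dark cavern."
    print(f"survival: {edit_survival(suggestion, user_edit):.2f}")
```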
Related papers
- DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts [27.218934418961197]
We introduce a novel task for data story generation and a benchmark containing 1,449 stories from diverse sources.
To address the challenges of crafting coherent data stories, we propose a multiagent framework employing two LLM agents.
While our agentic framework generally outperforms non-agentic counterparts in both model-based and human evaluations, the results also reveal unique challenges in data story generation.
arXiv Detail & Related papers (2024-08-09T21:31:33Z) - Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z) - Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource
Agglutinative Data-to-Text Generation [9.80836683456026]
We tackle data-to-text for isiXhosa, which is low-resource and agglutinative.
We introduce Triples-to-isiXhosa (T2X), a new dataset based on a subset of WebNLG.
We develop an evaluation framework for T2X that measures how accurately generated text describes the data.
arXiv Detail & Related papers (2024-03-12T11:53:27Z) - Robust Preference Learning for Storytelling via Contrastive
Reinforcement Learning [53.92465205531759]
Controlled automated story generation seeks to generate natural language stories satisfying constraints from natural language critiques or preferences.
We train a contrastive bi-encoder model to align stories with human critiques, building a general-purpose preference model (see the bi-encoder sketch after this list).
We further fine-tune the contrastive reward model using a prompt-learning technique to increase story generation robustness.
arXiv Detail & Related papers (2022-10-14T13:21:33Z) - Unsupervised Neural Stylistic Text Generation using Transfer learning
and Adapters [66.17039929803933]
We propose a novel transfer learning framework which updates only 0.3% of model parameters to learn style-specific attributes for response generation.
We learn style-specific attributes from the PERSONALITY-CAPTIONS dataset.
arXiv Detail & Related papers (2022-10-07T00:09:22Z) - SummScreen: A Dataset for Abstractive Screenplay Summarization [52.56760815805357]
SummScreen is a dataset of TV series transcripts paired with human-written recaps.
Plot details are often expressed indirectly in character dialogues and may be scattered across the entirety of the transcript.
Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics.
arXiv Detail & Related papers (2021-04-14T19:37:40Z) - Outline to Story: Fine-grained Controllable Story Generation from
Cascaded Events [39.577220559911055]
We propose a new task named "Outline to Story" (O2S) as a test bed for fine-grained controllable generation of long text.
We then create datasets for future benchmarks, built with state-of-the-art keyword extraction techniques.
arXiv Detail & Related papers (2021-01-04T08:16:21Z) - Cue Me In: Content-Inducing Approaches to Interactive Story Generation [74.09575609958743]
We focus on the task of interactive story generation, where the user provides the model mid-level sentence abstractions.
We present two content-inducing approaches to effectively incorporate this additional information.
Experimental results from both automatic and human evaluations show that these methods produce more topically coherent and personalized stories.
arXiv Detail & Related papers (2020-10-20T00:36:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of this information and is not responsible for any consequences of its use.