LongStory: Coherent, Complete and Length Controlled Long story Generation
- URL: http://arxiv.org/abs/2311.15208v1
- Date: Sun, 26 Nov 2023 06:24:25 GMT
- Title: LongStory: Coherent, Complete and Length Controlled Long story Generation
- Authors: Kyeongman Park, Nakyeong Yang, Kyomin Jung
- Abstract summary: We present the LongStory for coherent, complete, and length-controlled long story generation.
LongStory introduces two novel methodologies: (1) the long and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP).
- Score: 18.886499970698285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A human author can write a story of any length without losing coherence and can always bring it to a proper ending, an ability that current language models lack. In this work, we present LongStory, a model for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights given to the long-term context (Memory) and the short-term context (Cheating), acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural position of a passage within a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator Plotmachine, on coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model's ability to predict outcomes beyond its training data, and we validate our methodology by comparing its performance with variants of our model.
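To make the two methodologies above more concrete, the sketch below shows one possible way a CWC-style gate could mix a long-term "Memory" embedding with a short-term "Cheating" embedding, and how LSP-style discourse tokens could mark a chapter's structural position. This is a minimal illustration only: the module names, dimensions, token strings, and gating form are assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

# Hypothetical discourse tokens marking a chapter's structural position (LSP):
# beginning, middle, or end of the long story.
LSP_TOKENS = ["<front>", "<middle>", "<ending>"]

class ContextWeightCalibrator(nn.Module):
    """Toy CWC-style module: learns how much weight to give the long-term
    context ("Memory", e.g. a running summary of prior chapters) versus the
    short-term context ("Cheating", e.g. the immediately preceding passage)
    before handing a fused context vector to the decoder."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Scores the two context sources from the current decoder state.
        self.gate = nn.Linear(3 * hidden_size, 2)

    def forward(self, memory: torch.Tensor, cheating: torch.Tensor,
                decoder_state: torch.Tensor) -> torch.Tensor:
        # memory, cheating, decoder_state: (batch, hidden)
        features = torch.cat([memory, cheating, decoder_state], dim=-1)
        weights = torch.softmax(self.gate(features), dim=-1)  # (batch, 2)
        # Convex combination of the long-term and short-term context vectors.
        return weights[:, :1] * memory + weights[:, 1:] * cheating

if __name__ == "__main__":
    hidden = 768
    cwc = ContextWeightCalibrator(hidden)
    memory = torch.randn(1, hidden)     # long-term summary embedding
    cheating = torch.randn(1, hidden)   # short-term context embedding
    state = torch.randn(1, hidden)      # current decoder state
    fused_context = cwc(memory, cheating, state)
    # Prepend an LSP token so the generator knows it is mid-story.
    prompt = f"{LSP_TOKENS[1]} The knight pressed on through the storm."
    print(fused_context.shape, prompt)
```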
Related papers
- How to Train Long-Context Language Models (Effectively) [75.5418485597276]
We study continued training and supervised fine-tuning (SFT) of a language model (LM) to make effective use of long-context information.
ProLong-8B, initialized from Llama-3 and trained on 40B tokens, demonstrates state-of-the-art long-context performance among similarly sized models at a context length of 128K.
arXiv Detail & Related papers (2024-10-03T16:46:52Z)
- Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA [71.04146366608904]
Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows.
We propose a novel long-context benchmark, Loong, aligning with realistic scenarios through extended multi-document question answering (QA).
Loong introduces four types of tasks with a range of context lengths: Spotlight Locating, Comparison, Clustering, and Chain of Reasoning.
arXiv Detail & Related papers (2024-06-25T09:42:56Z)
- Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models [13.091271774417867]
Long-context modeling capabilities are important for large language models (LLMs) in various applications.
We propose a data mining framework, ProLong, that can assign each training sample a long-dependency score.
Comprehensive experiments on multiple benchmarks indicate that ProLong effectively identifies documents that carry long dependencies.
arXiv Detail & Related papers (2024-05-28T07:36:56Z)
- LongAlign: A Recipe for Long Context Alignment of Large Language Models [61.85923382850057]
LongAlign is a recipe for instruction data, training, and evaluation for long-context alignment.
We construct a long instruction-following dataset using Self-Instruct.
We adopt packing and sorted batching strategies to speed up supervised fine-tuning on data with varied length distributions (see the sketch below).
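As a rough illustration of the sorted-batching idea referenced above (not LongAlign's actual code), the sketch below groups samples of similar length into the same batch under a token budget so that padding, and therefore wasted computation, is minimized; the function name and the "length" field are assumptions.

```python
from typing import Dict, List

def sorted_batches(samples: List[Dict], max_tokens: int = 8192) -> List[List[Dict]]:
    """Sort samples by token length, then greedily fill each batch up to a
    token budget. Assumes each sample dict has a precomputed "length"."""
    batches, current, current_tokens = [], [], 0
    for sample in sorted(samples, key=lambda s: s["length"]):
        if current and current_tokens + sample["length"] > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(sample)
        current_tokens += sample["length"]
    if current:
        batches.append(current)
    return batches

# Example: the three short samples share one batch; the long one gets its own.
print(sorted_batches([{"length": 900}, {"length": 7000}, {"length": 1000}, {"length": 1100}]))
```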
arXiv Detail & Related papers (2024-01-31T18:29:39Z)
- Improving Pacing in Long-Form Story Planning [55.39443681232538]
We propose a CONCrete Outline ConTrol system to improve pacing when automatically generating story outlines.
We first train a concreteness evaluator to judge which of two events is more concrete.
We then use this evaluator in a vaguest-first expansion procedure that aims for uniform pacing.
arXiv Detail & Related papers (2023-11-08T04:58:29Z)
- LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding [58.20031627237889]
LongBench is the first bilingual, multi-task benchmark for long context understanding.
It comprises 21 datasets across 6 task categories in both English and Chinese, with an average length of 6,711 words (English) and 13,386 characters (Chinese).
arXiv Detail & Related papers (2023-08-28T11:53:40Z)
- Adapting Pretrained Text-to-Text Models for Long Text Sequences [39.62224414485055]
We adapt an existing pretrained text-to-text model for long-sequence inputs.
We build a long-context model that achieves competitive performance on long-text QA tasks.
arXiv Detail & Related papers (2022-09-21T00:41:07Z)
- LOT: A Benchmark for Evaluating Chinese Long Text Understanding and Generation [49.57366550980932]
Long text modeling requires many capabilities such as modeling long-range commonsense and discourse relations.
We propose LOT, a benchmark including two understanding and two generation tasks for Chinese long text modeling evaluation.
We release an encoder-decoder Chinese long text pretraining model named LongLM with up to 1 billion parameters.
arXiv Detail & Related papers (2021-08-30T02:38:32Z)
- Consistency and Coherency Enhanced Story Generation [35.08911595854691]
We propose a two-stage generation framework to enhance consistency and coherency of generated stories.
The first stage is to organize the story outline which depicts the story plots and events, and the second stage is to expand the outline into a complete story.
In addition, coreference supervision signals are incorporated to reduce coreference errors and improve the coreference consistency.
arXiv Detail & Related papers (2020-10-17T16:40:37Z)