A Temporal Variational Model for Story Generation
- URL: http://arxiv.org/abs/2109.06807v1
- Date: Tue, 14 Sep 2021 16:36:12 GMT
- Title: A Temporal Variational Model for Story Generation
- Authors: David Wilmot, Frank Keller
- Abstract summary: Recent language models can generate interesting and grammatically correct text in story generation but often lack plot development and long-term coherence.
This paper experiments with a latent vector planning approach based on a TD-VAE (Temporal Difference Variational Autoencoder), using the model for conditioning and reranking for text generation.
The results demonstrate strong performance in automatic cloze and swapping evaluations.
- Score: 21.99104738567138
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent language models can generate interesting and grammatically correct
text in story generation but often lack plot development and long-term
coherence. This paper experiments with a latent vector planning approach based
on a TD-VAE (Temporal Difference Variational Autoencoder), using the model for
conditioning and reranking for text generation. The results demonstrate strong
performance in automatic cloze and swapping evaluations. The human judgments
show stories generated with TD-VAE reranking improve on a GPT-2 medium baseline
and show comparable performance to a hierarchical LSTM reranking model.
Conditioning on the latent vectors proves disappointing and deteriorates
performance in human evaluation because it reduces the diversity of generation,
and the models don't learn to progress the narrative. This highlights an
important difference between technical task performance (e.g. cloze) and
generating interesting stories.
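As an illustration of the reranking setup the abstract describes, here is a minimal sketch in which several GPT-2 medium continuations are sampled and the one a learned temporal model scores as most coherent is kept. The `scorer` callable is a hypothetical stand-in for the TD-VAE component, not the paper's actual interface.
```python
# Hedged sketch: sample candidate continuations with GPT-2 medium and
# rerank them with a learned coherence scorer. `scorer` is assumed to
# map (context, candidate) to a scalar; in the paper this role is
# played by a TD-VAE over sentence representations.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
generator = GPT2LMHeadModel.from_pretrained("gpt2-medium")

def rerank_continuations(context, scorer, n_candidates=10):
    inputs = tokenizer(context, return_tensors="pt")
    outputs = generator.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        max_new_tokens=60,
        num_return_sequences=n_candidates,
        pad_token_id=tokenizer.eos_token_id,
    )
    candidates = [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]
    # Keep the candidate the temporal model judges most coherent with
    # the story so far (higher score = more plausible progression).
    return max(candidates, key=lambda c: scorer(context, c))
```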
Related papers
- Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens [53.99177152562075]
Scaling up autoregressive models in vision has not proven as beneficial as in large language models.
We focus on two critical factors: whether models use discrete or continuous tokens, and whether tokens are generated in a random or fixed order using BERT- or GPT-like transformer architectures.
Our results show that while all models scale effectively in terms of validation loss, their evaluation performance -- measured by FID, GenEval score, and visual quality -- follows different trends.
arXiv Detail & Related papers (2024-10-17T17:59:59Z)
- A Cross-Attention Augmented Model for Event-Triggered Context-Aware Story Generation [28.046803293933213]
We introduce a novel neural generation model, EtriCA, that enhances the relevance and coherence of generated stories.
We employ a post-training framework for knowledge enhancement (KeEtriCA) on a large-scale book corpus.
This results in approximately 5% improvement in automatic metrics and over 10% improvement in human evaluation.
arXiv Detail & Related papers (2023-11-19T08:54:47Z)
- Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction [6.78974856327994]
Adverse Event (ADE) extraction is one of the core tasks in digital pharmacovigilance.
We evaluate 19 Transformer-based models for ADE extraction on informal texts.
At the end of our analyses, we identify a list of take-home messages that can be derived from the experimental data.
arXiv Detail & Related papers (2023-06-08T15:25:24Z)
- DeltaScore: Fine-Grained Story Evaluation with Perturbations [69.33536214124878]
We introduce DELTASCORE, a novel methodology that employs perturbation techniques for the evaluation of nuanced story aspects.
Our central proposition is that the extent to which a story excels in a specific aspect (e.g., fluency) correlates with the magnitude of its susceptibility to particular perturbations.
We measure the quality of an aspect by calculating the likelihood difference between pre- and post-perturbation states using pre-trained language models.
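As a rough sketch of that likelihood-difference computation (the use of GPT-2 as the scoring model and the abstract `perturb` function are assumptions; the paper pairs each aspect with its own perturbation type):
```python
# Hedged sketch of the likelihood-difference idea behind DELTASCORE:
# score an aspect by how much an aspect-targeted perturbation lowers
# the language-model likelihood of the story.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def log_likelihood(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # `loss` is the mean token cross-entropy; scale by the number
        # of predicted tokens to recover the total log-likelihood.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.size(1) - 1)

def delta_score(story, perturb):
    # A larger likelihood drop under perturbation indicates the story
    # was stronger on the perturbed aspect.
    return log_likelihood(story) - log_likelihood(perturb(story))
```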
arXiv Detail & Related papers (2023-03-15T23:45:54Z)
- The Next Chapter: A Study of Large Language Models in Storytelling [51.338324023617034]
The application of prompt-based learning with large language models (LLMs) has exhibited remarkable performance in diverse natural language processing (NLP) tasks.
This paper conducts a comprehensive investigation, utilizing both automatic and human evaluation, to compare the story generation capacity of LLMs with that of recent models.
The results demonstrate that LLMs generate stories of significantly higher quality compared to other story generation models.
arXiv Detail & Related papers (2023-01-24T02:44:02Z)
- Few-shot Text Classification with Dual Contrastive Consistency [31.141350717029358]
In this paper, we explore how to utilize a pre-trained language model to perform few-shot text classification.
We adopt supervised contrastive learning on the few labeled examples and consistency regularization on vast unlabeled data.
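A minimal sketch of that two-part objective follows; the exact weighting, augmentations, and loss variants are assumptions rather than the authors' formulation.
```python
# Hedged sketch: supervised contrastive loss on labeled batches plus a
# consistency loss on two augmented views of unlabeled batches.
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss: pull together embeddings that share
    a label, push apart the rest."""
    feats = F.normalize(features, dim=1)
    sims = feats @ feats.T / temperature
    self_mask = torch.eye(len(feats), dtype=torch.bool)
    sims = sims.masked_fill(self_mask, float("-inf"))  # drop self-pairs
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    log_prob = sims - torch.logsumexp(sims, dim=1, keepdim=True)
    per_anchor = log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -per_anchor[pos_mask.any(1)].mean()  # anchors with positives

def consistency_loss(logits_weak, logits_strong):
    """Consistency regularization: match the prediction on a strongly
    augmented unlabeled view to the detached weak-view prediction."""
    target = F.softmax(logits_weak.detach(), dim=1)
    return F.kl_div(F.log_softmax(logits_strong, dim=1), target,
                    reduction="batchmean")

# total = supcon_loss(z_labeled, y) + lambda_u * consistency_loss(lw, ls)
# where lambda_u is an assumed trade-off weight.
```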
arXiv Detail & Related papers (2022-09-29T19:26:23Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) in average performance by a large margin in both few-shot and full-shot settings.
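As a hedged illustration of that reformulation (the serialization template below is an assumption; the paper's exact target format may differ):
```python
# Recast (aspect term, category, polarity) extraction as plain
# sequence generation: the model is trained to emit one flat string
# per input sentence. Template and separators are hypothetical.
example = {
    "text": "The pasta was great but the service was slow.",
    "tuples": [("pasta", "food", "positive"), ("service", "service", "negative")],
}

def serialize(tuples):
    # One target string a unidirectional generative LM can learn to emit.
    return " ; ".join(f"{term} | {category} | {polarity}"
                      for term, category, polarity in tuples)

source = f"extract sentiments: {example['text']}"
target = serialize(example["tuples"])
# target == "pasta | food | positive ; service | service | negative"
```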
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
However, they are hampered by two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
- Narrative Text Generation with a Latent Discrete Plan [39.71663365273463]
We propose a deep latent variable model that first samples a sequence of anchor words, one per sentence in the story, as part of its generative process.
During training, our model treats the sequence of anchor words as a latent variable and attempts to induce anchoring sequences that help guide generation in an unsupervised fashion.
We conduct human evaluations which demonstrate that stories produced by our model are rated more highly than those from baselines that do not consider story plans.
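The generative process described here can be sketched in a few lines; the sampling and realization functions below are placeholders standing in for the paper's learned components.
```python
# Hedged sketch of the anchor-word generative process: sample one
# latent anchor word per sentence, then realize each sentence
# conditioned on the story so far and its anchor.
def generate_story(n_sentences, sample_anchor, realize_sentence):
    story, anchors = [], []
    for _ in range(n_sentences):
        anchor = sample_anchor(story)              # latent plan step
        sentence = realize_sentence(story, anchor) # sentence around anchor
        anchors.append(anchor)
        story.append(sentence)
    return story, anchors
```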
arXiv Detail & Related papers (2020-10-07T08:45:37Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic-adaptive storyteller to model inter-topic generalization.
We also propose a prototype encoding structure to model intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding structure mutually bring benefit to the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)