Toward Better Storylines with Sentence-Level Language Models
- URL: http://arxiv.org/abs/2005.05255v1
- Date: Mon, 11 May 2020 16:54:19 GMT
- Title: Toward Better Storylines with Sentence-Level Language Models
- Authors: Daphne Ippolito, David Grangier, Douglas Eck, Chris Callison-Burch
- Abstract summary: We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
- Score: 54.91921545103256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a sentence-level language model which selects the next sentence in
a story from a finite set of fluent alternatives. Since it does not need to
model fluency, the sentence-level language model can focus on longer range
dependencies, which are crucial for multi-sentence coherence. Rather than
dealing with individual words, our method treats the story so far as a list of
pre-trained sentence embeddings and predicts an embedding for the next
sentence, which is more efficient than predicting word embeddings. Notably, this
allows us to consider a large number of candidates for the next sentence during
training. We demonstrate the effectiveness of our approach with
state-of-the-art accuracy on the unsupervised Story Cloze task and with
promising results on larger-scale next sentence prediction tasks.
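The abstract describes predicting an embedding for the next sentence from the embeddings of the story so far, then scoring a large pool of fluent candidate sentences against that prediction. Below is a minimal sketch of this idea, assuming PyTorch, a frozen off-the-shelf sentence encoder that supplies the fixed-size embeddings, a GRU context model, and a sampled softmax over candidates; the architecture, dimensions, and loss are illustrative placeholders, not the paper's exact setup.

```python
# Minimal sketch (not the authors' code): a sentence-level LM that maps the
# embeddings of the story so far to a predicted next-sentence embedding, then
# scores a pool of candidate sentence embeddings against that prediction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceLevelLM(nn.Module):
    """Predicts a next-sentence embedding from the story so far."""
    def __init__(self, emb_dim=512, hidden_dim=1024):
        super().__init__()
        # Context model over the sequence of (frozen) sentence embeddings.
        # A GRU is an assumption here; any sequence or pooling model would do.
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, emb_dim)

    def forward(self, context_embs):                 # (batch, n_sentences, emb_dim)
        _, h_n = self.rnn(context_embs)              # final hidden state
        return self.proj(h_n[-1])                    # (batch, emb_dim)

def next_sentence_loss(pred_emb, candidate_embs, target_idx):
    """Softmax over a large pool of candidate next-sentence embeddings.

    pred_emb:       (batch, emb_dim)   predicted next-sentence embedding
    candidate_embs: (n_cands, emb_dim) embeddings of fluent candidate sentences
    target_idx:     (batch,) index of the true next sentence among the candidates
    """
    logits = pred_emb @ candidate_embs.t()           # (batch, n_cands) similarity scores
    return F.cross_entropy(logits, target_idx)
```

At inference time (for example, on the Story Cloze task), the candidate whose embedding scores highest against the predicted embedding is selected as the next sentence.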
Related papers
- CompoundPiece: Evaluating and Improving Decompounding Performance of Language Models [77.45934004406283]
We systematically study decompounding, the task of splitting compound words into their constituents.
We introduce a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
We introduce a novel methodology to train dedicated models for decompounding.
arXiv Detail & Related papers (2023-05-23T16:32:27Z)
- Efficient and Flexible Topic Modeling using Pretrained Embeddings and Bag of Sentences [1.8592384822257952]
We propose a novel topic modeling and inference algorithm.
We leverage pre-trained sentence embeddings by combining generative process models and clustering.
The evaluation shows that our method yields state-of-the-art results with relatively little computational demands.
arXiv Detail & Related papers (2023-02-06T20:13:11Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Few-shot Subgoal Planning with Language Models [58.11102061150875]
We show that language priors encoded in pre-trained language models allow us to infer fine-grained subgoal sequences.
In contrast to recent methods which make strong assumptions about subgoal supervision, our experiments show that language models can infer detailed subgoal sequences without any fine-tuning.
arXiv Detail & Related papers (2022-05-28T01:03:30Z)
- A New Sentence Ordering Method Using BERT Pretrained Model [2.1793134762413433]
We propose a method for sentence ordering that does not need a training phase and, consequently, does not require a large corpus for learning.
Our proposed method outperformed other baselines on ROCStories, a corpus of 5-sentence human-made stories.
Other advantages of this method include its interpretability and that it requires no linguistic knowledge.
arXiv Detail & Related papers (2021-08-26T18:47:15Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether there are any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
- Narrative Text Generation with a Latent Discrete Plan [39.71663365273463]
We propose a deep latent variable model that first samples a sequence of anchor words, one per sentence in the story, as part of its generative process.
During training, our model treats the sequence of anchor words as a latent variable and attempts to induce anchoring sequences that help guide generation in an unsupervised fashion.
We conduct human evaluations which demonstrate that the stories produced by our model are rated higher than those from baselines that do not consider story plans.
arXiv Detail & Related papers (2020-10-07T08:45:37Z)
- Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models [29.40992909208733]
We propose CONPONO, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences (a hedged sketch of this kind of distance-conditioned objective follows the list below).
On the discourse representation benchmark DiscoEval, our model improves over the previous state-of-the-art by up to 13%.
We also show that CONPONO yields gains of 2%-6% absolute even for tasks that do not explicitly evaluate discourse.
arXiv Detail & Related papers (2020-05-20T23:21:43Z)
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
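As noted in the CONPONO entry above, that objective models discourse coherence together with the distance between sentences. The snippet below is a hedged sketch of a distance-conditioned contrastive objective in that spirit, not the paper's implementation; the per-distance bilinear heads, the distance range, and the assumption that candidates mix the true sentence with sampled negatives are all illustrative choices.

```python
# Hedged sketch of a CONPONO-style inter-sentence objective: given an anchor
# sentence, classify which candidate is the true sentence at discourse distance k,
# against sampled negatives. Details are assumptions, not the paper's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscourseDistanceObjective(nn.Module):
    """Distance-conditioned contrastive scoring of candidate sentences."""
    def __init__(self, enc_dim=768, max_dist=2):
        super().__init__()
        # One bilinear scoring head per discourse distance k = 1..max_dist.
        self.heads = nn.ModuleList(nn.Linear(enc_dim, enc_dim, bias=False)
                                   for _ in range(max_dist))

    def forward(self, anchor_emb, cand_embs, k, target_idx):
        """
        anchor_emb: (batch, enc_dim)          encoded anchor sentence
        cand_embs:  (batch, n_cands, enc_dim) true sentence at distance k plus negatives
        k:          int, discourse distance in [1, max_dist]
        target_idx: (batch,) position of the true sentence among the candidates
        """
        query = self.heads[k - 1](anchor_emb)                 # distance-specific projection
        logits = torch.einsum('bd,bnd->bn', query, cand_embs) # (batch, n_cands) scores
        return F.cross_entropy(logits, target_idx)            # contrastive softmax loss
```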