Improving Language Generation with Sentence Coherence Objective
- URL: http://arxiv.org/abs/2009.06358v1
- Date: Mon, 7 Sep 2020 06:10:03 GMT
- Title: Improving Language Generation with Sentence Coherence Objective
- Authors: Ruixiao Sun, Jie Yang, Mehrdad Yousefzadeh
- Abstract summary: Existing models are often prone to producing paragraphs of text that gradually diverge from the given prompt.
The goal of our project is to improve the coherence and consistency across sentences in a language-generation model.
- Score: 4.997730662279843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional story generation and contextual text continuation have become
increasingly popular topics in the NLP community. Existing models are often prone
to producing paragraphs of text that gradually diverge from the given prompt.
Although the generated text may have reasonable perplexity and diversity, it
could easily be identified by humans as gibberish. The goal of our project is to
improve the coherence and consistency across sentences in a language-generation
model. We aim to solve this issue by first training a sentence-pair coherence
classifier on top of a pretrained GPT-2 model, and then co-training the GPT-2
language model with this new coherence objective using a method analogous to the
REINFORCE algorithm. The fine-tuned language model is able to generate lengthy
paragraphs conditioned on a given topic without diverging too much. The
simplicity of this approach allows it to be applied to a variety of underlying
language model architectures, since it only modifies the final layer of the
pre-trained model.
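As a rough illustration of the two-stage recipe described in the abstract (train a sentence-pair coherence classifier on top of GPT-2, then fine-tune the generator against its score with a REINFORCE-style update), here is a minimal sketch using the Hugging Face transformers library. It is not the authors' implementation: the classifier head, last-token pooling, reward baseline, sampling length, and learning rate are assumptions for illustration. In the paper the classifier would first be trained on coherent/incoherent sentence pairs, and only the final layer of the pre-trained model is modified, whereas this sketch leaves the head untrained and updates all parameters for brevity.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")            # generator being fine-tuned
coherence_lm = GPT2LMHeadModel.from_pretrained("gpt2")  # backbone of the coherence classifier
coherence_head = torch.nn.Linear(coherence_lm.config.n_embd, 1)  # hypothetical binary head
                                                        # (untrained here; the paper trains it first)

def coherence_score(sent_a: str, sent_b: str) -> torch.Tensor:
    """Probability that sent_b coherently follows sent_a (assumed classifier design)."""
    ids = tokenizer(sent_a + " " + sent_b, return_tensors="pt").input_ids
    hidden = coherence_lm.transformer(ids).last_hidden_state   # (1, T, n_embd)
    return torch.sigmoid(coherence_head(hidden[:, -1]))        # pool at the last token

def reinforce_step(prompt: str, optimizer: torch.optim.Optimizer) -> float:
    """One REINFORCE-like update: reward the LM for sampling a coherent continuation."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    sampled = lm.generate(prompt_ids, do_sample=True, max_new_tokens=40,
                          pad_token_id=tokenizer.eos_token_id)
    continuation_ids = sampled[:, prompt_ids.size(1):]
    continuation = tokenizer.decode(continuation_ids[0], skip_special_tokens=True)

    with torch.no_grad():
        reward = coherence_score(prompt, continuation).item()  # scalar in (0, 1)

    # Log-probability of the sampled continuation under the current generator.
    logits = lm(sampled).logits[:, prompt_ids.size(1) - 1:-1]  # positions predicting continuation tokens
    log_probs = F.log_softmax(logits, dim=-1)
    token_logp = log_probs.gather(-1, continuation_ids.unsqueeze(-1)).squeeze(-1).sum()

    # Policy-gradient-style loss; the 0.5 baseline is an assumption, not taken from the paper.
    loss = -(reward - 0.5) * token_logp
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward

# Example usage (assumed hyperparameters):
# optimizer = torch.optim.Adam(lm.parameters(), lr=1e-5)
# reinforce_step("The old lighthouse keeper climbed the stairs.", optimizer)
```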
Related papers
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - TopNet: Learning from Neural Topic Model to Generate Long Stories [43.5564336855688]
Long story generation (LSG) is one of the coveted goals in natural language processing.
We propose TopNet to obtain high-quality skeleton words to complement the short input.
Our proposed framework is highly effective in skeleton word selection and significantly outperforms state-of-the-art models in both automatic evaluation and human evaluation.
arXiv Detail & Related papers (2021-12-14T09:47:53Z) - Long Text Generation by Modeling Sentence-Level and Discourse-Level
Coherence [59.51720326054546]
We propose a long text generation model, which can represent the prefix sentences at sentence level and discourse level in the decoding process.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z) - Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese
Pre-trained Language Models [62.41139712595334]
We propose a novel pre-training paradigm for Chinese -- Lattice-BERT.
We construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers.
We show that our model can bring an average increase of 1.5% under the 12-layer setting.
arXiv Detail & Related papers (2021-04-15T02:36:49Z) - SLM: Learning a Discourse Language Representation with Sentence
Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)