PALM: Pre-training an Autoencoding&Autoregressive Language Model for
Context-conditioned Generation
- URL: http://arxiv.org/abs/2004.07159v2
- Date: Sun, 20 Sep 2020 23:58:21 GMT
- Title: PALM: Pre-training an Autoencoding&Autoregressive Language Model for
Context-conditioned Generation
- Authors: Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei
Huang, Luo Si
- Abstract summary: Self-supervised pre-training has emerged as a powerful technique for natural language understanding and generation.
This work presents PALM with a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus.
An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks.
- Score: 92.7366819044397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised pre-training, such as BERT, MASS and BART, has emerged as a
powerful technique for natural language understanding and generation. Existing
pre-training techniques employ autoencoding and/or autoregressive objectives to
train Transformer-based models by recovering original word tokens from
corrupted text with some masked tokens. The training goals of existing
techniques are often inconsistent with the goals of many language generation
tasks, such as generative question answering and conversational response
generation, for producing new text given context.
This work presents PALM with a novel scheme that jointly pre-trains an
autoencoding and autoregressive language model on a large unlabeled corpus,
specifically designed for generating new text conditioned on context. The new
scheme alleviates the mismatch introduced by the existing denoising scheme
between pre-training and fine-tuning where generation is more than
reconstructing original text. An extensive set of experiments shows that PALM
achieves new state-of-the-art results on a variety of language generation
benchmarks covering generative question answering (Rank 1 on the official MARCO
leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword,
question generation on SQuAD, and conversational response generation on Cornell
Movie Dialogues.
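
To make the scheme concrete, below is a minimal sketch of a jointly trained autoencoding + autoregressive objective on a generic PyTorch encoder-decoder. The sizes, special token ids, and masking pattern are illustrative assumptions, not PALM's released implementation.

```python
# Minimal sketch: the encoder reconstructs masked context tokens (autoencoding),
# while the decoder autoregressively generates the continuation conditioned on the
# encoder output. All sizes and special ids below are toy choices.
import torch
import torch.nn as nn

VOCAB, D_MODEL, MASK_ID = 1000, 128, 1   # hypothetical vocabulary size and mask id

class ContextConditionedLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(d_model=D_MODEL, nhead=4,
                                          num_encoder_layers=2, num_decoder_layers=2,
                                          batch_first=True)
        self.mlm_head = nn.Linear(D_MODEL, VOCAB)  # autoencoding head (encoder side)
        self.lm_head = nn.Linear(D_MODEL, VOCAB)   # autoregressive head (decoder side)

    def forward(self, corrupted_ctx, continuation):
        enc = self.transformer.encoder(self.embed(corrupted_ctx))
        tgt_in = self.embed(continuation[:, :-1])  # teacher forcing: shift right
        causal = self.transformer.generate_square_subsequent_mask(tgt_in.size(1))
        dec = self.transformer.decoder(tgt_in, enc, tgt_mask=causal)
        return self.mlm_head(enc), self.lm_head(dec)

model = ContextConditionedLM()
ctx = torch.randint(2, VOCAB, (4, 16))            # "context" half of each document
masked_ctx = ctx.clone()
masked_ctx[:, ::5] = MASK_ID                      # corrupt some context positions
cont = torch.randint(2, VOCAB, (4, 12))           # "continuation" half to generate
mlm_logits, lm_logits = model(masked_ctx, cont)
# For simplicity, reconstruction is scored at every context position here.
loss = (nn.functional.cross_entropy(mlm_logits.reshape(-1, VOCAB), ctx.reshape(-1))
        + nn.functional.cross_entropy(lm_logits.reshape(-1, VOCAB),
                                      cont[:, 1:].reshape(-1)))
loss.backward()  # one backward pass optimizes both objectives jointly
```

The two losses share one encoder: the reconstruction term pushes the encoder toward strong comprehension of the context, while the autoregressive term trains the decoder to produce new text conditioned on that context.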
Related papers
- Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques [0.0]
This research paper developed a novel approach to improve text generation in the context of joint Natural Language Generation (NLG) and Natural Language Understanding (NLU) learning.
The data is prepared by gathering and preprocessing annotated datasets, including cleaning, tokenization, stemming, and stop-word removal.
Transformer-based encoders and decoders are used to capture long-range dependencies and improve source-target sequence modelling.
Reinforcement learning with policy gradient techniques, semi-supervised training, improved attention mechanisms, and differentiable approximations are employed to fine-tune the models and handle complex linguistic tasks effectively.
arXiv Detail & Related papers (2024-10-17T12:43:49Z)
- Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation [84.95981645040281]
Planning can be a useful intermediate step to render conditional generation less opaque and more grounded.
We present a web browser-based demonstration for query-focused summarization that uses a sequence of question-answer pairs as an intermediate plan to guide generation.
arXiv Detail & Related papers (2023-04-28T18:14:48Z)
- Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
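
A rough sketch of this setup under stated assumptions: a frozen, randomly initialized Transformer stands in for the pretrained model, a trainable pooled bottleneck produces the sentence vector, and a single trainable decoder layer reconstructs the sentence. The paper's denoising corruption of the input is omitted for brevity.

```python
# Toy sketch: a frozen encoder produces token states, a trainable bottleneck pools
# them into one sentence vector, and a single trainable decoder layer reconstructs
# the sentence from that vector alone (denoising corruption omitted).
import torch
import torch.nn as nn

VOCAB, D = 1000, 128
embed = nn.Embedding(VOCAB, D)
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(D, 4, batch_first=True),
                                num_layers=2)
for p in list(embed.parameters()) + list(encoder.parameters()):
    p.requires_grad = False                       # stand-in for the frozen pretrained LM

bottleneck = nn.Linear(D, D)                      # trainable sentence bottleneck
decoder = nn.TransformerDecoderLayer(D, 4, batch_first=True)  # single trainable layer
out_head = nn.Linear(D, VOCAB)

tokens = torch.randint(0, VOCAB, (8, 20))         # a batch of sentences
with torch.no_grad():
    states = encoder(embed(tokens))               # frozen encoder forward pass
sent_vec = bottleneck(states.mean(dim=1, keepdim=True))       # (8, 1, D) sentence vector
L = tokens.size(1) - 1
causal = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
dec_states = decoder(embed(tokens[:, :-1]), sent_vec, tgt_mask=causal)
loss = nn.functional.cross_entropy(out_head(dec_states).reshape(-1, VOCAB),
                                   tokens[:, 1:].reshape(-1))
loss.backward()  # only the bottleneck, decoder layer, and output head receive gradients
```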
arXiv Detail & Related papers (2021-08-31T19:39:55Z)
- DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models.
To efficiently capture the discourse-level coherence among utterances, we propose two training objectives: masked utterance regression and utterance order ranking.
Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
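
A hedged sketch of the general idea: a standard entropy-regularized Sinkhorn solver softly aligns generated-token embeddings with reference-token embeddings, and the resulting transport cost serves as an auxiliary sequence-matching loss. This is a generic OT loss under those assumptions, not the paper's implementation.

```python
# Generic Sinkhorn-based OT loss between a generated sequence and its reference
# (a sketch of the matching idea only). Embeddings would come from the model;
# random tensors stand in for them here.
import torch
import torch.nn.functional as F

def sinkhorn_ot_loss(gen_emb, ref_emb, eps=0.1, iters=50):
    """gen_emb: (m, d) generated-token embeddings; ref_emb: (n, d) reference."""
    cost = 1.0 - F.normalize(gen_emb, dim=-1) @ F.normalize(ref_emb, dim=-1).t()  # (m, n)
    m, n = cost.shape
    a, b = torch.full((m,), 1.0 / m), torch.full((n,), 1.0 / n)  # uniform marginals
    K = torch.exp(-cost / eps)                                   # Gibbs kernel
    u = torch.ones(m)
    for _ in range(iters):                                       # Sinkhorn iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)                   # approximate transport plan
    return (plan * cost).sum()                                   # expected matching cost

# Toy usage: align a sampled ("student-forced") sequence with the reference sequence.
gen = torch.randn(12, 64, requires_grad=True)
ref = torch.randn(15, 64)
ot = sinkhorn_ot_loss(gen, ref)
ot.backward()   # in training, this term would be added to the usual cross-entropy loss
```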
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
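
A toy illustration of the insertion mechanism only; in the actual method a pretrained Transformer scores candidate insertions, whereas the `propose` function below is a hypothetical stand-in with hard-coded choices.

```python
# Toy progressive insertion-based generation: start from the hard lexical constraints
# and, in rounds, insert tokens in parallel into the slots between adjacent tokens
# until no slot proposes anything new.
from typing import List, Optional

def propose(left: str, right: str) -> Optional[str]:
    """Stand-in for the model: token to insert between two adjacent tokens, or None."""
    demo = {("The", "sat"): "cat", ("sat", "mat"): "on", ("on", "mat"): "the"}
    return demo.get((left, right))

def progressive_generate(constraints: List[str], max_rounds: int = 5) -> List[str]:
    seq = list(constraints)                      # coarse sequence: the constraints only
    for _ in range(max_rounds):
        new_seq, inserted = [seq[0]], False
        for left, right in zip(seq, seq[1:]):    # visit every slot in this round
            token = propose(left, right)
            if token is not None:
                new_seq.append(token)
                inserted = True
            new_seq.append(right)
        seq = new_seq
        if not inserted:                         # fixed point reached: generation stops
            break
    return seq

print(progressive_generate(["The", "sat", "mat"]))
# ['The', 'cat', 'sat', 'on', 'the', 'mat']
```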
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives.
Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)
- ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation [44.21363470798758]
ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
arXiv Detail & Related papers (2020-01-26T02:54:49Z)