PALM: Pre-training an Autoencoding&Autoregressive Language Model for
Context-conditioned Generation
- URL: http://arxiv.org/abs/2004.07159v2
- Date: Sun, 20 Sep 2020 23:58:21 GMT
- Title: PALM: Pre-training an Autoencoding&Autoregressive Language Model for
Context-conditioned Generation
- Authors: Bin Bi, Chenliang Li, Chen Wu, Ming Yan, Wei Wang, Songfang Huang, Fei
Huang, Luo Si
- Abstract summary: Self-supervised pre-training has emerged as a powerful technique for natural language understanding and generation.
This work presents PALM with a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus.
An extensive set of experiments shows that PALM achieves new state-of-the-art results on a variety of language generation benchmarks.
- Score: 92.7366819044397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised pre-training, such as BERT, MASS and BART, has emerged as a
powerful technique for natural language understanding and generation. Existing
pre-training techniques employ autoencoding and/or autoregressive objectives to
train Transformer-based models by recovering original word tokens from
corrupted text with some masked tokens. The training goals of existing
techniques are often inconsistent with the goals of many language generation
tasks, such as generative question answering and conversational response
generation, for producing new text given context.
This work presents PALM with a novel scheme that jointly pre-trains an
autoencoding and autoregressive language model on a large unlabeled corpus,
specifically designed for generating new text conditioned on context. The new
scheme alleviates the mismatch introduced by the existing denoising scheme
between pre-training and fine-tuning where generation is more than
reconstructing original text. An extensive set of experiments shows that PALM
achieves new state-of-the-art results on a variety of language generation
benchmarks covering generative question answering (Rank 1 on the official MARCO
leaderboard), abstractive summarization on CNN/DailyMail as well as Gigaword,
question generation on SQuAD, and conversational response generation on Cornell
Movie Dialogues.
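
To make the scheme concrete, below is a minimal sketch of a jointly trained autoencoding + autoregressive objective on a generic PyTorch encoder-decoder. The sizes, special token ids, and masking pattern are illustrative assumptions, not PALM's released implementation.

```python
# Minimal sketch: the encoder reconstructs masked context tokens (autoencoding),
# while the decoder autoregressively generates the continuation conditioned on the
# encoder output. All sizes and special ids below are toy choices.
import torch
import torch.nn as nn

VOCAB, D_MODEL, MASK_ID = 1000, 128, 1   # hypothetical vocabulary size and mask id

class ContextConditionedLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL)
        self.transformer = nn.Transformer(d_model=D_MODEL, nhead=4,
                                          num_encoder_layers=2, num_decoder_layers=2,
                                          batch_first=True)
        self.mlm_head = nn.Linear(D_MODEL, VOCAB)  # autoencoding head (encoder side)
        self.lm_head = nn.Linear(D_MODEL, VOCAB)   # autoregressive head (decoder side)

    def forward(self, corrupted_ctx, continuation):
        enc = self.transformer.encoder(self.embed(corrupted_ctx))
        tgt_in = self.embed(continuation[:, :-1])  # teacher forcing: shift right
        causal = self.transformer.generate_square_subsequent_mask(tgt_in.size(1))
        dec = self.transformer.decoder(tgt_in, enc, tgt_mask=causal)
        return self.mlm_head(enc), self.lm_head(dec)

model = ContextConditionedLM()
ctx = torch.randint(2, VOCAB, (4, 16))            # "context" half of each document
masked_ctx = ctx.clone()
masked_ctx[:, ::5] = MASK_ID                      # corrupt some context positions
cont = torch.randint(2, VOCAB, (4, 12))           # "continuation" half to generate
mlm_logits, lm_logits = model(masked_ctx, cont)
# For simplicity, reconstruction is scored at every context position here.
loss = (nn.functional.cross_entropy(mlm_logits.reshape(-1, VOCAB), ctx.reshape(-1))
        + nn.functional.cross_entropy(lm_logits.reshape(-1, VOCAB),
                                      cont[:, 1:].reshape(-1)))
loss.backward()  # one backward pass optimizes both objectives jointly
```

The two losses share one encoder: the reconstruction term pushes the encoder toward strong comprehension of the context, while the autoregressive term trains the decoder to produce new text conditioned on that context.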
Related papers
- Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques [0.0]
This research paper developed a novel approach to improve text generation in the context of joint Natural Language Generation (NLG) and Natural Language Understanding (NLU) learning.
The data is prepared by gathering and preprocessing annotated datasets, including cleaning, tokenization, stemming, and stop-word removal.
Transformer-based encoders and decoders are used to capture long-range dependencies and improve source-target sequence modelling.
Reinforcement learning with policy gradient techniques, semi-supervised training, improved attention mechanisms, and differentiable approximations are employed to fine-tune the models and handle complex linguistic tasks effectively.
arXiv Detail & Related papers (2024-10-17T12:43:49Z)
- Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation [84.95981645040281]
Planning can be a useful intermediate step to render conditional generation less opaque and more grounded.
We present a web browser-based demonstration for query-focused summarization that uses a sequence of question-answer pairs as an intermediate plan to guide generation.
arXiv Detail & Related papers (2023-04-28T18:14:48Z)
- Sentence Bottleneck Autoencoders from Transformer Language Models [53.350633961266375]
We build a sentence-level autoencoder from a pretrained, frozen transformer language model.
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder.
We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer, and single-sentence classification tasks in the GLUE benchmark, while using fewer parameters than large pretrained models.
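
A rough sketch of this setup under stated assumptions: a frozen, randomly initialized Transformer stands in for the pretrained model, a trainable pooled bottleneck produces the sentence vector, and a single trainable decoder layer reconstructs the sentence. The paper's denoising corruption of the input is omitted for brevity.

```python
# Toy sketch: a frozen encoder produces token states, a trainable bottleneck pools
# them into one sentence vector, and a single trainable decoder layer reconstructs
# the sentence from that vector alone (denoising corruption omitted).
import torch
import torch.nn as nn

VOCAB, D = 1000, 128
embed = nn.Embedding(VOCAB, D)
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(D, 4, batch_first=True),
                                num_layers=2)
for p in list(embed.parameters()) + list(encoder.parameters()):
    p.requires_grad = False                       # stand-in for the frozen pretrained LM

bottleneck = nn.Linear(D, D)                      # trainable sentence bottleneck
decoder = nn.TransformerDecoderLayer(D, 4, batch_first=True)  # single trainable layer
out_head = nn.Linear(D, VOCAB)

tokens = torch.randint(0, VOCAB, (8, 20))         # a batch of sentences
with torch.no_grad():
    states = encoder(embed(tokens))               # frozen encoder forward pass
sent_vec = bottleneck(states.mean(dim=1, keepdim=True))       # (8, 1, D) sentence vector
L = tokens.size(1) - 1
causal = torch.triu(torch.full((L, L), float("-inf")), diagonal=1)
dec_states = decoder(embed(tokens[:, :-1]), sent_vec, tgt_mask=causal)
loss = nn.functional.cross_entropy(out_head(dec_states).reshape(-1, VOCAB),
                                   tokens[:, 1:].reshape(-1))
loss.backward()  # only the bottleneck, decoder layer, and output head receive gradients
```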
arXiv Detail & Related papers (2021-08-31T19:39:55Z)
- DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances [18.199473005335093]
This paper presents DialogBERT, a novel conversational response generation model that enhances previous PLM-based dialogue models.
To efficiently capture the discourse-level coherence among utterances, we propose two training objectives: masked utterance regression and utterance order ranking.
Experiments on three multi-turn conversation datasets show that our approach remarkably outperforms the baselines.
arXiv Detail & Related papers (2020-12-03T09:06:23Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
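
A hedged sketch of the general idea: a standard entropy-regularized Sinkhorn solver softly aligns generated-token embeddings with reference-token embeddings, and the resulting transport cost serves as an auxiliary sequence-matching loss. This is a generic OT loss under those assumptions, not the paper's implementation.

```python
# Generic Sinkhorn-based OT loss between a generated sequence and its reference
# (a sketch of the matching idea only). Embeddings would come from the model;
# random tensors stand in for them here.
import torch
import torch.nn.functional as F

def sinkhorn_ot_loss(gen_emb, ref_emb, eps=0.1, iters=50):
    """gen_emb: (m, d) generated-token embeddings; ref_emb: (n, d) reference."""
    cost = 1.0 - F.normalize(gen_emb, dim=-1) @ F.normalize(ref_emb, dim=-1).t()  # (m, n)
    m, n = cost.shape
    a, b = torch.full((m,), 1.0 / m), torch.full((n,), 1.0 / n)  # uniform marginals
    K = torch.exp(-cost / eps)                                   # Gibbs kernel
    u = torch.ones(m)
    for _ in range(iters):                                       # Sinkhorn iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)                   # approximate transport plan
    return (plan * cost).sum()                                   # expected matching cost

# Toy usage: align a sampled ("student-forced") sequence with the reference sequence.
gen = torch.randn(12, 64, requires_grad=True)
ref = torch.randn(15, 64)
ot = sinkhorn_ot_loss(gen, ref)
ot.backward()   # in training, this term would be added to the usual cross-entropy loss
```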
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
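
A toy illustration of the insertion mechanism only; in the actual method a pretrained Transformer scores candidate insertions, whereas the `propose` function below is a hypothetical stand-in with hard-coded choices.

```python
# Toy progressive insertion-based generation: start from the hard lexical constraints
# and, in rounds, insert tokens in parallel into the slots between adjacent tokens
# until no slot proposes anything new.
from typing import List, Optional

def propose(left: str, right: str) -> Optional[str]:
    """Stand-in for the model: token to insert between two adjacent tokens, or None."""
    demo = {("The", "sat"): "cat", ("sat", "mat"): "on", ("on", "mat"): "the"}
    return demo.get((left, right))

def progressive_generate(constraints: List[str], max_rounds: int = 5) -> List[str]:
    seq = list(constraints)                      # coarse sequence: the constraints only
    for _ in range(max_rounds):
        new_seq, inserted = [seq[0]], False
        for left, right in zip(seq, seq[1:]):    # visit every slot in this round
            token = propose(left, right)
            if token is not None:
                new_seq.append(token)
                inserted = True
            new_seq.append(right)
        seq = new_seq
        if not inserted:                         # fixed point reached: generation stops
            break
    return seq

print(progressive_generate(["The", "sat", "mat"]))
# ['The', 'cat', 'sat', 'on', 'the', 'mat']
```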
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives.
Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)
- ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation [44.21363470798758]
ERNIE-GEN is an enhanced multi-flow sequence to sequence pre-training and fine-tuning framework.
It bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method.
It trains the model to predict semantically-complete spans consecutively rather than predicting word by word.
arXiv Detail & Related papers (2020-01-26T02:54:49Z)