Related papers: MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective

MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective

URL: http://arxiv.org/abs/2210.14650v1
Date: Wed, 26 Oct 2022 11:55:41 GMT
Title: MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective
Authors: Zhe Hu, Hou Pong Chan, Lifu Huang
Abstract summary: We propose a novel multi-task training strategy for coherent text generation grounded on the cognitive theory of writing. We extensively evaluate our model on three open-ended generation tasks including story generation, news article writing and argument generation.
Score: 22.69509556890676
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Teaching neural models to generate narrative coherent texts is a critical problem. Recent pre-trained language models have achieved promising results, but there is still a gap between human written texts and machine-generated outputs. In this work, we propose a novel multi-task training strategy for coherent text generation grounded on the cognitive theory of writing, which empowers the model to learn essential subskills needed for writing including planning and reviewing besides end-to-end generation. We extensively evaluate our model on three open-ended generation tasks including story generation, news article writing and argument generation. Experiments show that our model achieves better results on both few-shot and fully-supervised settings than strong baselines, and human evaluations confirm that our model can generate more coherent outputs.

Related papers

Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents. Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
Beyond Turing: A Comparative Analysis of Approaches for Detecting Machine-Generated Text [1.919654267936118]
Traditional shallow learning, Language Model (LM) fine-tuning, and Multilingual Model fine-tuning are evaluated. Results reveal considerable differences in performance across methods. This study paves the way for future research aimed at creating robust and highly discriminative models.
arXiv Detail & Related papers (2023-11-21T06:23:38Z)
Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text. We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality. We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
arXiv Detail & Related papers (2022-10-16T04:35:58Z)
Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision. We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks. We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization [89.94078728495423]
We show that recent advances in each modality, CLIP image representations and scaling of language models, do not consistently improve multimodal self-rationalization of tasks with multimodal inputs. Our findings call for a backbone modelling approach that can be built on to advance text generation from images and text beyond image captioning.
arXiv Detail & Related papers (2022-05-24T00:52:40Z)
Twist Decoding: Diverse Generators Guide Each Other [116.20780037268801]
We introduce Twist decoding, a simple and general inference algorithm that generates text while benefiting from diverse models. Our method does not assume the vocabulary, tokenization or even generation order is shared.
arXiv Detail & Related papers (2022-05-19T01:27:53Z)
DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Text Generation [10.477090501569284]
We study the task of long-form opinion text generation, which faces at least two distinct challenges. Existing neural generation models fall short of coherence, thus requiring efficient content planning. We propose DYPLOC, a generation framework that conducts dynamic planning of content while generating the output based on a novel design of mixed language models.
arXiv Detail & Related papers (2021-06-01T20:56:10Z)
Facts2Story: Controlling Text Generation by Key Facts [0.0]
We propose a controlled generation task based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We show that while auto-regressive, unidirectional Language Models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) which produces competitive fluency while adhering to the requested content.
arXiv Detail & Related papers (2020-12-08T10:14:29Z)
Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking. We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
Unsupervised Text Generation by Learning from Search [86.51619839836331]
TGLS is a novel framework to unsupervised Text Generation by Learning. We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization.
arXiv Detail & Related papers (2020-07-09T04:34:48Z)
QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives. Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.