May the Force Be with Your Copy Mechanism: Enhanced Supervised-Copy
Method for Natural Language Generation
- URL: http://arxiv.org/abs/2112.10360v1
- Date: Mon, 20 Dec 2021 06:54:28 GMT
- Title: May the Force Be with Your Copy Mechanism: Enhanced Supervised-Copy
Method for Natural Language Generation
- Authors: Sanghyuk Choi, Jeong-in Hwang, Hyungjong Noh, Yeonsoo Lee
- Abstract summary: We propose a novel supervised approach to a copy network that helps the model decide which words need to be copied and which need to be generated.
Specifically, we re-define the objective function, which leverages source sequences and target vocabularies as guidance for copying.
The experimental results on data-to-text generation and abstractive summarization tasks verify that our approach enhances the copying quality and improves the degree of abstractness.
- Score: 1.2453219864236247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent neural sequence-to-sequence models with a copy mechanism have achieved
remarkable progress in various text generation tasks. These models addressed
out-of-vocabulary problems and facilitated the generation of rare words.
However, identifying which words need to be copied is difficult,
as observed in prior copy models, which suffer from incorrect generation and
a lack of abstractness. In this paper, we propose a novel supervised approach to
a copy network that helps the model decide which words need to be copied and
which need to be generated. Specifically, we re-define the objective function,
which leverages source sequences and target vocabularies as guidance for
copying. The experimental results on data-to-text generation and abstractive
summarization tasks verify that our approach enhances the copying quality and
improves the degree of abstractness.
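As a rough sketch of what such supervision could look like, the snippet below adds a binary cross-entropy term on the decoder's copy gate, assuming copy labels are derived by marking target tokens that appear in the source but fall outside the generation vocabulary; the paper's exact labeling rule and loss weighting may differ.

```python
import torch
import torch.nn.functional as F

def copy_labels(target_ids, source_ids, vocab_size):
    """Mark a target token as 'copy' (1.0) when it occurs in the source and
    lies outside the generation vocabulary -- a plausible heuristic, not
    necessarily the labeling rule used in the paper."""
    source_set = set(source_ids.tolist())
    return torch.tensor([
        1.0 if (t.item() in source_set and t.item() >= vocab_size) else 0.0
        for t in target_ids
    ])

def supervised_copy_loss(p_copy, target_ids, source_ids, vocab_size):
    """Binary cross-entropy between the model's per-step copy gate p_copy
    and the derived labels; added to the usual NLL, e.g.
    total = nll + lambda_copy * supervised_copy_loss(...)."""
    labels = copy_labels(target_ids, source_ids, vocab_size)
    return F.binary_cross_entropy(p_copy, labels)
```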
Related papers
- Mitigating Copy Bias in In-Context Learning through Neuron Pruning [74.91243772654519]
Large language models (LLMs) have demonstrated impressive few-shot in-context learning abilities.
They are sometimes prone to a 'copying bias', where they copy answers from provided examples instead of learning the underlying patterns.
We propose a novel and simple method to mitigate such copying bias.
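The abstract does not spell out the pruning criterion; a hypothetical sketch, assuming neurons are scored by how much more they activate when the model copies an in-context answer than when it generalizes:

```python
import numpy as np

def prune_copy_neurons(acts_copy, acts_general, out_weights, k=10):
    """acts_*: (num_examples, num_neurons) activations collected on copying
    vs. generalizing examples. Ablate the k most copy-associated neurons by
    zeroing their outgoing weights (illustrative, not the paper's method)."""
    score = acts_copy.mean(axis=0) - acts_general.mean(axis=0)
    top = np.argsort(score)[-k:]          # neurons firing mainly on copies
    pruned = out_weights.copy()
    pruned[top, :] = 0.0                  # ablate their outgoing weights
    return pruned
```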
arXiv Detail & Related papers (2024-10-02T07:18:16Z)
- Copy Is All You Need [66.00852205068327]
We formulate text generation as progressively copying text segments from an existing text collection.
Our approach achieves better generation quality according to both automatic and human evaluations.
Our approach attains additional performance gains by simply scaling up to larger text collections.
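A minimal sketch of the idea, assuming each candidate segment from the collection has a precomputed embedding and decoding greedily copies the segment closest to the current context vector (the paper's retrieval machinery is more elaborate):

```python
import numpy as np

def next_segment(context_vec, segment_vecs, segments):
    """Copy the text segment whose embedding is most similar (cosine) to
    the current decoding context."""
    sims = segment_vecs @ context_vec / (
        np.linalg.norm(segment_vecs, axis=1) * np.linalg.norm(context_vec))
    return segments[int(np.argmax(sims))]
```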
arXiv Detail & Related papers (2023-07-13T05:03:26Z)
- A Scalable and Efficient Iterative Method for Copying Machine Learning Classifiers [0.802904964931021]
This paper introduces a novel sequential approach that significantly reduces the amount of computational resources needed to train or maintain a copy of a machine learning model.
The effectiveness of the sequential approach is demonstrated through experiments with synthetic and real-world datasets, showing significant reductions in time and resources, while maintaining or improving accuracy.
arXiv Detail & Related papers (2023-02-06T10:07:41Z)
- Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models [103.61066310897928]
Recent text-to-image generative models have demonstrated an unparalleled ability to generate diverse and creative imagery guided by a target text prompt.
While revolutionary, current state-of-the-art diffusion models may still fail in generating images that fully convey the semantics in the given text prompt.
We analyze the publicly available Stable Diffusion model and assess the existence of catastrophic neglect, where the model fails to generate one or more of the subjects from the input prompt.
We introduce the concept of Generative Semantic Nursing (GSN), where we seek to intervene in the generative process on the fly during inference time to improve the faithfulness of the generated images.
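A simplified sketch of the intervention, assuming differentiable access to one cross-attention map per subject token at a denoising step; `attn_maps` and `step_size` are illustrative names, not part of the Stable Diffusion API:

```python
import torch

def nurse_latent(latent, attn_maps, step_size=0.1):
    """attn_maps: list of (H, W) cross-attention maps, one per subject token,
    each differentiable w.r.t. `latent`. Push the latent so the most
    neglected subject receives stronger peak attention."""
    neglect = torch.stack([1.0 - m.max() for m in attn_maps])
    grad = torch.autograd.grad(neglect.max(), latent)[0]
    return latent - step_size * grad
```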
arXiv Detail & Related papers (2023-01-31T18:10:38Z)
- MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective [22.69509556890676]
We propose a novel multi-task training strategy for coherent text generation grounded on the cognitive theory of writing.
We extensively evaluate our model on three open-ended generation tasks including story generation, news article writing and argument generation.
arXiv Detail & Related papers (2022-10-26T11:55:41Z)
- A Contrastive Framework for Neural Text Generation [46.845997620234265]
We show that an underlying reason for model degeneration is the anisotropic distribution of token representations.
We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method -- contrastive search -- to encourage diversity while maintaining coherence in the generated text.
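Contrastive search is easy to state: at each step, pick the candidate that balances model confidence against its maximum similarity to the tokens already generated. A compact sketch with illustrative variable names:

```python
import numpy as np

def contrastive_search_step(probs, cand_vecs, ctx_vecs, alpha=0.6, k=5):
    """probs: next-token distribution; cand_vecs[v]: representation of
    candidate v; ctx_vecs: (T, d) representations of the context so far.
    score(v) = (1 - alpha) * p(v) - alpha * max cosine sim to the context."""
    top_k = np.argsort(probs)[-k:]
    def degeneration_penalty(v):
        sims = ctx_vecs @ cand_vecs[v] / (
            np.linalg.norm(ctx_vecs, axis=1) * np.linalg.norm(cand_vecs[v]))
        return sims.max()
    return max(top_k,
               key=lambda v: (1 - alpha) * probs[v]
                             - alpha * degeneration_penalty(v))
```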
arXiv Detail & Related papers (2022-02-13T21:46:14Z)
- To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text [4.4044968357361745]
We characterize how one popular abstractive model, the pointer-generator model of See et al., uses its explicit copy/generation switch to control its level of abstraction.
When we modify the copy/generation switch and force the model to generate, only simple neural abilities are revealed alongside factual inaccuracies and hallucinations.
In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
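For reference, the copy/generation switch being probed is the standard pointer-generator mixture of See et al.: P(w) = p_gen * P_vocab(w) + (1 - p_gen) * (attention mass on source positions holding w). A direct numpy rendering:

```python
import numpy as np

def final_distribution(p_gen, p_vocab, attention, src_ids, ext_vocab_size):
    """p_vocab: distribution over the fixed vocabulary; attention: weights
    over source positions; src_ids: source tokens mapped into the extended
    vocabulary. Forcing p_gen -> 1.0, as in the probe above, disables copying."""
    p_final = np.zeros(ext_vocab_size)
    p_final[:len(p_vocab)] = p_gen * p_vocab              # generation path
    np.add.at(p_final, src_ids, (1 - p_gen) * attention)  # copy path
    return p_final
```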
arXiv Detail & Related papers (2021-06-03T04:03:15Z)
- Reinforced Generative Adversarial Network for Abstractive Text Summarization [7.507096634112164]
Sequence-to-sequence models provide a viable new approach to generative summarization.
However, these models have notable drawbacks: their grasp of the details of the original text is often inaccurate, and the text they generate often contains repetitions.
We propose a new architecture that combines reinforcement learning and adversarial generative networks to enhance the sequence-to-sequence attention model.
arXiv Detail & Related papers (2021-05-31T17:34:47Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
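A hedged sketch of the span-selection loop, assuming an extractive QA function `qa(question, context) -> answer_span`; the masking mirrors the iterative strategy described above, though Span-Fact's actual interface is not reproduced here:

```python
def correct_summary(summary_tokens, entity_positions, source_text, qa):
    """Replace each entity in the summary with a span the QA model selects
    from the source, one entity at a time (cloze-style queries)."""
    tokens = list(summary_tokens)
    for pos in entity_positions:
        masked = tokens[:pos] + ["[MASK]"] + tokens[pos + 1:]
        tokens[pos] = qa(" ".join(masked), source_text)
    return " ".join(tokens)
```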
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation [92.7366819044397]
Self-supervised pre-training has emerged as a powerful technique for natural language understanding and generation.
This work presents PALM with a novel scheme that jointly pre-trains an autoencoding and autoregressive language model on a large unlabeled corpus.
An extensive set of experiments show that PALM achieves new state-of-the-art results on a variety of language generation benchmarks.
arXiv Detail & Related papers (2020-04-14T06:25:36Z)
- Reverse Engineering Configurations of Neural Text Generation Models [86.9479386959155]
The study of artifacts that emerge in machine generated text as a result of modeling choices is a nascent research area.
We conduct an extensive suite of diagnostic tests to observe whether modeling choices leave detectable artifacts in the text they generate.
Our key finding, which is backed by a rigorous set of experiments, is that such artifacts are present and that different modeling choices can be inferred by observing the generated text alone.
arXiv Detail & Related papers (2020-04-13T21:02:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information it contains) and is not responsible for any consequences of its use.