CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence
Models
- URL: http://arxiv.org/abs/2010.15266v1
- Date: Wed, 28 Oct 2020 22:45:16 GMT
- Title: CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence
Models
- Authors: Abhinav Singh, Patrick Xia, Guanghui Qin, Mahsa Yarmohammadi, Benjamin
Van Durme
- Abstract summary: We present a model with an explicit token-level copy operation and extend it to copying entire spans.
Our model provides hard alignments between spans in the input and output, allowing for nontraditional applications of seq2seq, like information extraction.
- Score: 31.832217465573503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Copy mechanisms are employed in sequence to sequence models (seq2seq) to
generate reproductions of words from the input to the output. These frameworks,
operating at the lexical type level, fail to provide an explicit alignment that
records where each token was copied from. Further, they require contiguous
token sequences from the input (spans) to be copied individually. We present a
model with an explicit token-level copy operation and extend it to copying
entire spans. Our model provides hard alignments between spans in the input and
output, allowing for nontraditional applications of seq2seq, like information
extraction. We demonstrate the approach on Nested Named Entity Recognition,
achieving near state-of-the-art accuracy with an order of magnitude increase in
decoding speed.
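The mechanism described in the abstract (a token-level copy operation extended to copy whole spans, with hard alignments back to the source) can be pictured with a small decoding sketch. This is a minimal illustration under assumptions: the action names (GEN / COPY / NEXT), the bracketed linearization of an entity, and the bookkeeping are invented for clarity and are not the authors' implementation.

```python
# Sketch of a CopyNext-style action space: at each step the decoder either
# generates a vocabulary token, copies a specific source token (recording a
# hard alignment), or applies a "copy next" step that copies the token right
# after the last copied one, reproducing a contiguous span one aligned token
# at a time. Names and output format are illustrative assumptions.
from typing import List, Optional, Tuple

Action = Tuple[str, Optional[object]]  # ("GEN", word) | ("COPY", src_idx) | ("NEXT", None)

def decode_with_span_copy(source: List[str], actions: List[Action]):
    output: List[str] = []
    alignment: List[Optional[int]] = []   # source index each output token was copied from
    last: Optional[int] = None            # position of the most recently copied source token

    for op, arg in actions:
        if op == "GEN":                    # ordinary vocabulary generation, no alignment
            output.append(arg)
            alignment.append(None)
            last = None
        elif op == "COPY":                 # copy source[arg] and record the hard alignment
            output.append(source[arg])
            alignment.append(arg)
            last = arg
        elif op == "NEXT":                 # extend the copy to the next source token
            assert last is not None and last + 1 < len(source)
            last += 1
            output.append(source[last])
            alignment.append(last)
        else:
            raise ValueError(f"unknown action {op}")
    return output, alignment

# Example: emit the span "New York City" as a labeled entity.
src = "She moved to New York City last year".split()
acts = [("GEN", "(LOC"), ("COPY", 3), ("NEXT", None), ("NEXT", None), ("GEN", ")")]
print(decode_with_span_copy(src, acts))
# (['(LOC', 'New', 'York', 'City', ')'], [None, 3, 4, 5, None])
```

Because every copied token carries a concrete source index, a consumer such as a nested NER reader can recover entity spans directly from the alignment instead of re-matching surface strings against the input.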
Related papers
- Object Recognition as Next Token Prediction [99.40793702627396]
We present an approach to pose object recognition as next token prediction.
The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels.
arXiv Detail & Related papers (2023-12-04T18:58:40Z)
- Seq2seq is All You Need for Coreference Resolution [26.551602768015986]
We finetune a pretrained seq2seq transformer to map an input document to a tagged sequence encoding the coreference annotation.
Our model outperforms or closely matches the best coreference systems in the literature on an array of datasets.
arXiv Detail & Related papers (2023-10-20T19:17:22Z)
- TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition [51.565319173790314]
TokenSplit is a sequence-to-sequence encoder-decoder model that uses the Transformer architecture.
We show that our model achieves excellent separation performance, both with and without transcript conditioning.
We also measure the automatic speech recognition (ASR) performance and provide audio samples of speech synthesis to demonstrate the additional utility of our model.
arXiv Detail & Related papers (2023-08-21T01:52:01Z)
- Copy Is All You Need [66.00852205068327]
We formulate text generation as progressively copying text segments from an existing text collection.
Our approach achieves better generation quality according to both automatic and human evaluations.
Our approach attains additional performance gains by simply scaling up to larger text collections.
arXiv Detail & Related papers (2023-07-13T05:03:26Z)
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models [105.4590533269863]
We propose AutoSeq, a fully automatic prompting method.
We adopt natural language prompts for sequence-to-sequence models.
Our method reveals the potential of sequence-to-sequence models in few-shot learning.
arXiv Detail & Related papers (2022-09-20T01:35:04Z)
- May the Force Be with Your Copy Mechanism: Enhanced Supervised-Copy Method for Natural Language Generation [1.2453219864236247]
We propose a novel supervised approach of a copy network that helps the model decide which words need to be copied and which need to be generated.
Specifically, we re-define the objective function, which leverages source sequences and target vocabularies as guidance for copying.
The experimental results on data-to-text generation and abstractive summarization tasks verify that our approach enhances the copying quality and improves the degree of abstractness.
arXiv Detail & Related papers (2021-12-20T06:54:28Z)
- BioCopy: A Plug-And-Play Span Copy Mechanism in Seq2Seq Models [3.823919891699282]
We propose a plug-and-play architecture, namely BioCopy, to alleviate the problem of losing essential tokens while copying long spans.
Specifically, in the training stage, we construct a BIO tag for each token and train the original model with BIO tags jointly.
In the inference stage, the model first predicts the BIO tag at each time step and then applies a masking strategy based on the predicted BIO label (a minimal sketch of this masking step appears after the list below).
Experimental results on two separate generative tasks show that adding BioCopy to the original model structure outperforms the baseline models on both.
arXiv Detail & Related papers (2021-09-26T08:55:26Z)
- Lexicon-constrained Copying Network for Chinese Abstractive Summarization [0.0]
Copy mechanism allows sequence-to-sequence models to choose words from the input and put them directly into the output.
Most existing models for Chinese abstractive summarization can only copy at the character level.
We propose a lexicon-constrained copying network that models multi-granularity in both memory and decoder.
arXiv Detail & Related papers (2020-10-16T06:59:34Z)
- Copy that! Editing Sequences by Copying Spans [40.23377412674599]
We present an extension of seq2seq models capable of copying entire spans of the input to the output in one step.
In experiments on a range of editing tasks of natural language and source code, we show that our new model consistently outperforms simpler baselines.
arXiv Detail & Related papers (2020-06-08T17:42:18Z)
- Cascaded Text Generation with Markov Transformers [122.76100449018061]
Two dominant approaches to neural text generation are fully autoregressive models, using serial beam search decoding, and non-autoregressive models, using parallel decoding with no output dependencies.
This work proposes an autoregressive model with sub-linear parallel time generation. Noting that conditional random fields with bounded context can be decoded in parallel, we propose an efficient cascaded decoding approach for generating high-quality output.
This approach requires only a small modification from standard autoregressive training, while showing competitive accuracy/speed tradeoff compared to existing methods on five machine translation datasets.
arXiv Detail & Related papers (2020-06-01T17:52:15Z)
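For the BioCopy entry above, the summarized inference step is: predict a BIO tag for the current position, then restrict the token distribution accordingly. The sketch below shows one plausible form of that masking step; the tag semantics, the additive NEG_INF mask, and the function signature are assumptions made for illustration, not the paper's code.

```python
# Sketch of BIO-driven logit masking at inference time (assumed semantics):
#   O : no restriction -- generate freely from the vocabulary,
#   B : restrict to tokens that occur in the source (a copied span may begin),
#   I : restrict to the source token that continues the span begun so far.
import numpy as np
from typing import List, Optional

NEG_INF = -1e9

def mask_logits(logits: np.ndarray, tag: str, source_ids: List[int],
                last_src_pos: Optional[int]) -> np.ndarray:
    masked = logits.copy()
    if tag == "O":
        return masked                                  # free generation
    if tag == "B":
        allowed = set(source_ids)                      # any source token can start a span
    elif tag == "I" and last_src_pos is not None and last_src_pos + 1 < len(source_ids):
        allowed = {source_ids[last_src_pos + 1]}       # must continue the current span
    else:
        return masked                                  # nothing to constrain; fall back
    mask = np.full_like(masked, NEG_INF)
    for tok in allowed:
        mask[tok] = 0.0                                # leave allowed tokens untouched
    return masked + mask

# Toy example: 10-word vocabulary, source token ids [4, 7, 2], currently inside a
# span whose last copied source position is 0, so only id 7 should survive the mask.
logits = np.random.randn(10)
print(int(mask_logits(logits, "I", source_ids=[4, 7, 2], last_src_pos=0).argmax()))  # 7
```

In the actual model the allowed set for I would track the full span copied so far rather than a single position, and the tags come from the jointly trained tagger; the sketch only isolates the masking step.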
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.