Seq2seq is All You Need for Coreference Resolution
- URL: http://arxiv.org/abs/2310.13774v1
- Date: Fri, 20 Oct 2023 19:17:22 GMT
- Title: Seq2seq is All You Need for Coreference Resolution
- Authors: Wenzheng Zhang, Sam Wiseman, Karl Stratos
- Abstract summary: We finetune a pretrained seq2seq transformer to map an input document to a tagged sequence encoding the coreference annotation.
Our model outperforms or closely matches the best coreference systems in the literature on an array of datasets.
- Score: 26.551602768015986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing works on coreference resolution suggest that task-specific models
are necessary to achieve state-of-the-art performance. In this work, we present
compelling evidence that such models are not necessary. We finetune a
pretrained seq2seq transformer to map an input document to a tagged sequence
encoding the coreference annotation. Despite the extreme simplicity, our model
outperforms or closely matches the best coreference systems in the literature
on an array of datasets. We also propose an especially simple seq2seq approach
that generates only tagged spans rather than the spans interleaved with the
original text. Our analysis shows that the model size, the amount of
supervision, and the choice of sequence representations are key factors in
performance.
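As a rough illustration of what mapping a document to "a tagged sequence encoding the coreference annotation" can look like, the sketch below linearizes a toy document with gold clusters in two ways: a full sequence that interleaves mention tags with the original text, and the simpler spans-only variant mentioned in the abstract. The tag vocabulary (<m>, |, </m>) and the linearization details are illustrative assumptions, not necessarily the paper's exact format.

```python
# A minimal, hypothetical linearization for seq2seq coreference.
# The tag format (<m>, </m>, |) is an illustrative assumption, not the paper's.

from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def linearize_full(tokens: List[str], clusters: List[List[Span]]) -> str:
    """Interleave the document with mention tags: every mention is wrapped as
    '<m> ... | k </m>', where k is its cluster id."""
    cluster_of: Dict[Span, int] = {
        span: k for k, cluster in enumerate(clusters) for span in cluster
    }
    opens: Dict[int, int] = {}          # token index -> mentions starting there
    closes: Dict[int, List[Span]] = {}  # token index -> mentions ending there
    for (start, end) in cluster_of:
        opens[start] = opens.get(start, 0) + 1
        closes.setdefault(end, []).append((start, end))

    out: List[str] = []
    for i, tok in enumerate(tokens):
        out.extend(["<m>"] * opens.get(i, 0))
        out.append(tok)
        # close inner (later-starting) mentions first so nesting stays well formed
        for span in sorted(closes.get(i, []), key=lambda s: s[0], reverse=True):
            out.extend(["|", str(cluster_of[span]), "</m>"])
    return " ".join(out)


def linearize_spans_only(tokens: List[str], clusters: List[List[Span]]) -> str:
    """The simpler variant from the abstract: emit only the tagged mentions,
    not the surrounding text."""
    parts = []
    for k, cluster in enumerate(clusters):
        for start, end in cluster:
            parts.append(f"<m> {' '.join(tokens[start:end + 1])} | {k} </m>")
    return " ".join(parts)


if __name__ == "__main__":
    tokens = "John said he would call his sister".split()
    clusters = [[(0, 0), (2, 2), (5, 5)]]  # {John, he, his}
    print(linearize_full(tokens, clusters))
    # <m> John | 0 </m> said <m> he | 0 </m> would call <m> his | 0 </m> sister
    print(linearize_spans_only(tokens, clusters))
    # <m> John | 0 </m> <m> he | 0 </m> <m> his | 0 </m>
```

A finetuned seq2seq model would be trained to generate such target strings directly from the raw document, and predicted clusters can be recovered by parsing the tags back out of the output.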
Related papers
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2seq) models are a popular choice for modeling set generation.
We propose a novel algorithm for effectively sampling informative orders over the space of label orders.
arXiv Detail & Related papers (2022-05-25T04:17:50Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework that improves summarization models in two respects.
Our framework assumes a hierarchical latent structure of a document in which the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Tiny Neural Models for Seq2Seq [0.0]
We propose a projection-based encoder-decoder model referred to as pQRNN-MAtt.
The resulting quantized models are less than 3.5MB in size and are well suited for on-device, latency-critical applications.
We show that on MTOP, a challenging multilingual semantic parsing dataset, the average model performance surpasses that of an LSTM-based seq2seq model that uses pre-trained embeddings, despite being 85x smaller.
arXiv Detail & Related papers (2021-08-07T00:39:42Z)
- VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension [31.639069657951747]
Existing models for Machine Reading Comprehension require complex architectures to model long texts with paragraph representation and classification.
We propose VAULT: a lightweight and parallel-efficient paragraph representation for MRC based on contextualized representations of long document input.
arXiv Detail & Related papers (2021-05-07T13:03:43Z)
- Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting [54.03356526990088]
We propose Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective.
SSR provides more fine-grained learning signals for text representations by supervising the model to rewrite imperfect spans to ground truth.
Our experiments with T5 models on various seq2seq tasks show that SSR can substantially improve seq2seq pre-training. A small sketch of how such rewrite pairs can be constructed appears after this list.
arXiv Detail & Related papers (2021-01-02T10:27:11Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives that allow us to pre-train a Seq2Seq-based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
- Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models [11.420640383826656]
We investigate the effectiveness of combining saliency models that identify the important parts of the source text with pre-trained seq-to-seq models.
Most of the combination models outperformed a simple fine-tuned seq-to-seq model on both the CNN/DM and XSum datasets.
arXiv Detail & Related papers (2020-03-29T14:00:25Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
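For the "Document Ranking with a Pretrained Sequence-to-Sequence Model" entry above, here is a minimal sketch of ranking-as-generation: the model is prompted with a query-document pair and scores relevance by the probability it assigns to a "true" label token at the first decoding step. The prompt wording, the label tokens, and the use of an off-the-shelf t5-small checkpoint are assumptions for illustration; in practice the model would first be finetuned to emit these target words.

```python
# Hedged sketch of ranking-as-generation with a seq2seq model.
# Prompt format and label tokens are illustrative assumptions.

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model.eval()

TRUE_ID = tokenizer.encode("true", add_special_tokens=False)[0]
FALSE_ID = tokenizer.encode("false", add_special_tokens=False)[0]


@torch.no_grad()
def relevance_score(query: str, document: str) -> float:
    """Return P("true" | query, document) at the first decoder step."""
    enc = tokenizer(
        f"Query: {query} Document: {document} Relevant:",
        return_tensors="pt",
        truncation=True,
    )
    # Feed only the decoder start token and inspect the first-step logits.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    logits = model(**enc, decoder_input_ids=decoder_input_ids).logits[0, 0]
    probs = torch.softmax(logits[[TRUE_ID, FALSE_ID]], dim=-1)
    return probs[0].item()


if __name__ == "__main__":
    docs = [
        "Seq2seq models map input text to output text.",
        "The weather in Paris is mild in spring.",
    ]
    query = "what do seq2seq models do"
    ranked = sorted(docs, key=lambda d: relevance_score(query, d), reverse=True)
    print(ranked)
```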
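And for the "Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting" entry referenced earlier, here is a hedged sketch of constructing (imperfect span, ground-truth span) training pairs. In the paper the imperfect spans come from a weaker span-infilling model; the placeholder corruption and the <rw>/</rw> markers below are invented for this illustration.

```python
# Hedged sketch of building Sequence Span Rewriting (SSR) style training pairs.
# A trivial corruption stands in for the imperfect span-infilling model, and the
# <rw>/</rw> markers are invented for illustration.

import random
from typing import List, Tuple


def corrupt(span_tokens: List[str]) -> List[str]:
    """Placeholder for an imperfect infilling model: drop or duplicate a token."""
    if len(span_tokens) > 1 and random.random() < 0.5:
        return span_tokens[:-1]            # drop the last token
    return span_tokens + span_tokens[-1:]  # duplicate the last token


def make_ssr_pair(tokens: List[str], span_len: int = 3,
                  seed: int = 0) -> Tuple[str, str]:
    """Return (source, target): the source contains an imperfect rewrite of one
    span between <rw> ... </rw> markers; the target is the ground-truth span."""
    random.seed(seed)
    start = random.randrange(0, max(1, len(tokens) - span_len))
    gold = tokens[start:start + span_len]
    imperfect = corrupt(gold)
    source = tokens[:start] + ["<rw>"] + imperfect + ["</rw>"] + tokens[start + span_len:]
    return " ".join(source), " ".join(gold)


if __name__ == "__main__":
    text = "the seq2seq model learns to rewrite imperfect spans into ground truth".split()
    src, tgt = make_ssr_pair(text)
    print("source:", src)
    print("target:", tgt)
```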
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.