Seq2seq is All You Need for Coreference Resolution
- URL: http://arxiv.org/abs/2310.13774v1
- Date: Fri, 20 Oct 2023 19:17:22 GMT
- Title: Seq2seq is All You Need for Coreference Resolution
- Authors: Wenzheng Zhang, Sam Wiseman, Karl Stratos
- Abstract summary: We finetune a pretrained seq2seq transformer to map an input document to a tagged sequence encoding the coreference annotation.
Our model outperforms or closely matches the best coreference systems in the literature on an array of datasets.
- Score: 26.551602768015986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing works on coreference resolution suggest that task-specific models
are necessary to achieve state-of-the-art performance. In this work, we present
compelling evidence that such models are not necessary. We finetune a
pretrained seq2seq transformer to map an input document to a tagged sequence
encoding the coreference annotation. Despite the extreme simplicity, our model
outperforms or closely matches the best coreference systems in the literature
on an array of datasets. We also propose an especially simple seq2seq approach
that generates only tagged spans rather than the spans interleaved with the
original text. Our analysis shows that the model size, the amount of
supervision, and the choice of sequence representations are key factors in
performance.
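As a rough illustration of what mapping a document to "a tagged sequence encoding the coreference annotation" can look like, the sketch below linearizes a toy document with gold clusters in two ways: a full sequence that interleaves mention tags with the original text, and the simpler spans-only variant mentioned in the abstract. The tag vocabulary (<m>, |, </m>) and the linearization details are illustrative assumptions, not necessarily the paper's exact format.

```python
# A minimal, hypothetical linearization for seq2seq coreference.
# The tag format (<m>, </m>, |) is an illustrative assumption, not the paper's.

from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive token indices (start, end)


def linearize_full(tokens: List[str], clusters: List[List[Span]]) -> str:
    """Interleave the document with mention tags: every mention is wrapped as
    '<m> ... | k </m>', where k is its cluster id."""
    cluster_of: Dict[Span, int] = {
        span: k for k, cluster in enumerate(clusters) for span in cluster
    }
    opens: Dict[int, int] = {}          # token index -> mentions starting there
    closes: Dict[int, List[Span]] = {}  # token index -> mentions ending there
    for (start, end) in cluster_of:
        opens[start] = opens.get(start, 0) + 1
        closes.setdefault(end, []).append((start, end))

    out: List[str] = []
    for i, tok in enumerate(tokens):
        out.extend(["<m>"] * opens.get(i, 0))
        out.append(tok)
        # close inner (later-starting) mentions first so nesting stays well formed
        for span in sorted(closes.get(i, []), key=lambda s: s[0], reverse=True):
            out.extend(["|", str(cluster_of[span]), "</m>"])
    return " ".join(out)


def linearize_spans_only(tokens: List[str], clusters: List[List[Span]]) -> str:
    """The simpler variant from the abstract: emit only the tagged mentions,
    not the surrounding text."""
    parts = []
    for k, cluster in enumerate(clusters):
        for start, end in cluster:
            parts.append(f"<m> {' '.join(tokens[start:end + 1])} | {k} </m>")
    return " ".join(parts)


if __name__ == "__main__":
    tokens = "John said he would call his sister".split()
    clusters = [[(0, 0), (2, 2), (5, 5)]]  # {John, he, his}
    print(linearize_full(tokens, clusters))
    # <m> John | 0 </m> said <m> he | 0 </m> would call <m> his | 0 </m> sister
    print(linearize_spans_only(tokens, clusters))
    # <m> John | 0 </m> <m> he | 0 </m> <m> his | 0 </m>
```

A finetuned seq2seq model would be trained to generate such target strings directly from the raw document, and predicted clusters can be recovered by parsing the tags back out of the output.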
Related papers
- Hierarchical Phrase-based Sequence-to-Sequence Learning [94.10257313923478]
We describe a neural transducer that maintains the flexibility of standard sequence-to-sequence (seq2seq) models while incorporating hierarchical phrases as a source of inductive bias during training and as explicit constraints during inference.
Our approach trains two models: a discriminative parser based on a bracketing grammar whose derivation tree hierarchically aligns source and target phrases, and a neural seq2seq model that learns to translate the aligned phrases one-by-one.
arXiv Detail & Related papers (2022-11-15T05:22:40Z)
- Conditional set generation using Seq2seq models [52.516563721766445]
Conditional set generation learns a mapping from an input sequence of tokens to a set.
Sequence-to-sequence (Seq2seq) models are a popular choice for modeling set generation.
We propose a novel algorithm for effectively sampling informative orders over the space of label orders.
arXiv Detail & Related papers (2022-05-25T04:17:50Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework that improves summarization models in two respects.
Our framework assumes a hierarchical latent structure of a document in which the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- Tiny Neural Models for Seq2Seq [0.0]
We propose a projection-based encoder-decoder model referred to as pQRNN-MAtt.
The resulting quantized models are less than 3.5MB in size and are well suited for on-device, latency-critical applications.
We show that on MTOP, a challenging multilingual semantic parsing dataset, the average model performance surpasses that of an LSTM-based seq2seq model that uses pre-trained embeddings, despite being 85x smaller.
arXiv Detail & Related papers (2021-08-07T00:39:42Z)
- VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension [31.639069657951747]
Existing models for Machine Reading Comprehension require complex architectures to model long texts with paragraph representation and classification.
We propose VAULT: a lightweight and parallel-efficient paragraph representation for MRC based on contextualized representations of long document input.
arXiv Detail & Related papers (2021-05-07T13:03:43Z)
- Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting [54.03356526990088]
We propose Sequence Span Rewriting (SSR) as a self-supervised sequence-to-sequence (seq2seq) pre-training objective.
SSR provides more fine-grained learning signals for text representations by supervising the model to rewrite imperfect spans to ground truth.
Our experiments with T5 models on various seq2seq tasks show that SSR can substantially improve seq2seq pre-training. A small sketch of how such rewrite pairs can be constructed appears after this list.
arXiv Detail & Related papers (2021-01-02T10:27:11Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives that allow us to pre-train a Seq2Seq-based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
- Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models [11.420640383826656]
We investigate the effectiveness of combining saliency models that identify the important parts of the source text with pre-trained seq-to-seq models.
Most of the combination models outperformed a simple fine-tuned seq-to-seq model on both the CNN/DM and XSum datasets.
arXiv Detail & Related papers (2020-03-29T14:00:25Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
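For the "Document Ranking with a Pretrained Sequence-to-Sequence Model" entry above, here is a minimal sketch of ranking-as-generation: the model is prompted with a query-document pair and scores relevance by the probability it assigns to a "true" label token at the first decoding step. The prompt wording, the label tokens, and the use of an off-the-shelf t5-small checkpoint are assumptions for illustration; in practice the model would first be finetuned to emit these target words.

```python
# Hedged sketch of ranking-as-generation with a seq2seq model.
# Prompt format and label tokens are illustrative assumptions.

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model.eval()

TRUE_ID = tokenizer.encode("true", add_special_tokens=False)[0]
FALSE_ID = tokenizer.encode("false", add_special_tokens=False)[0]


@torch.no_grad()
def relevance_score(query: str, document: str) -> float:
    """Return P("true" | query, document) at the first decoder step."""
    enc = tokenizer(
        f"Query: {query} Document: {document} Relevant:",
        return_tensors="pt",
        truncation=True,
    )
    # Feed only the decoder start token and inspect the first-step logits.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    logits = model(**enc, decoder_input_ids=decoder_input_ids).logits[0, 0]
    probs = torch.softmax(logits[[TRUE_ID, FALSE_ID]], dim=-1)
    return probs[0].item()


if __name__ == "__main__":
    docs = [
        "Seq2seq models map input text to output text.",
        "The weather in Paris is mild in spring.",
    ]
    query = "what do seq2seq models do"
    ranked = sorted(docs, key=lambda d: relevance_score(query, d), reverse=True)
    print(ranked)
```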
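And for the "Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting" entry referenced earlier, here is a hedged sketch of constructing (imperfect span, ground-truth span) training pairs. In the paper the imperfect spans come from a weaker span-infilling model; the placeholder corruption and the <rw>/</rw> markers below are invented for this illustration.

```python
# Hedged sketch of building Sequence Span Rewriting (SSR) style training pairs.
# A trivial corruption stands in for the imperfect span-infilling model, and the
# <rw>/</rw> markers are invented for illustration.

import random
from typing import List, Tuple


def corrupt(span_tokens: List[str]) -> List[str]:
    """Placeholder for an imperfect infilling model: drop or duplicate a token."""
    if len(span_tokens) > 1 and random.random() < 0.5:
        return span_tokens[:-1]            # drop the last token
    return span_tokens + span_tokens[-1:]  # duplicate the last token


def make_ssr_pair(tokens: List[str], span_len: int = 3,
                  seed: int = 0) -> Tuple[str, str]:
    """Return (source, target): the source contains an imperfect rewrite of one
    span between <rw> ... </rw> markers; the target is the ground-truth span."""
    random.seed(seed)
    start = random.randrange(0, max(1, len(tokens) - span_len))
    gold = tokens[start:start + span_len]
    imperfect = corrupt(gold)
    source = tokens[:start] + ["<rw>"] + imperfect + ["</rw>"] + tokens[start + span_len:]
    return " ".join(source), " ".join(gold)


if __name__ == "__main__":
    text = "the seq2seq model learns to rewrite imperfect spans into ground truth".split()
    src, tgt = make_ssr_pair(text)
    print("source:", src)
    print("target:", tgt)
```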
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.