Sentence-Incremental Neural Coreference Resolution
- URL: http://arxiv.org/abs/2305.16947v1
- Date: Fri, 26 May 2023 14:00:25 GMT
- Title: Sentence-Incremental Neural Coreference Resolution
- Authors: Matt Grenander, Shay B. Cohen, Mark Steedman
- Abstract summary: We propose a sentence-incremental neural coreference resolution system which incrementally builds clusters after marking mention boundaries in a shift-reduce manner.
The system is aimed at bridging two recent approaches to coreference resolution: (1) state-of-the-art non-incremental models that incur quadratic complexity in document length with high computational cost, and (2) memory network-based models which operate incrementally but do not generalize beyond pronouns.
- Score: 32.13574453443377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a sentence-incremental neural coreference resolution system which
incrementally builds clusters after marking mention boundaries in a
shift-reduce manner. The system is aimed at bridging two recent approaches to
coreference resolution: (1) state-of-the-art non-incremental models that incur
quadratic complexity in document length with high computational cost, and (2)
memory network-based models which operate incrementally but do not generalize
beyond pronouns. For comparison, we simulate an incremental setting by
constraining non-incremental systems to form partial coreference chains before
observing new sentences. In this setting, our system outperforms comparable
state-of-the-art methods by 2 F1 on OntoNotes and 7 F1 on the CODI-CRAC 2021
corpus. In a conventional coreference setup, our system achieves 76.3 F1 on
OntoNotes and 45.8 F1 on CODI-CRAC 2021, which is comparable to
state-of-the-art baselines. We also analyze variations of our system and show
that the degree of incrementality in the encoder has a surprisingly large
effect on the resulting performance.
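To make the control flow concrete, here is a minimal, illustrative sketch of sentence-incremental clustering in Python. It is not the authors' model: the mention detector and the cluster-compatibility scorer below are trivial placeholders (a learned span scorer and encoder would replace them), but it shows the key property that each new sentence is resolved against compact cluster representations rather than against every previous token.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    mentions: list = field(default_factory=list)   # mention strings seen so far
    summary: set = field(default_factory=set)      # toy cluster representation

def detect_mentions(sentence):
    # Placeholder detector: capitalized tokens and pronouns.
    # A real system would mark mention boundaries with a learned model.
    pronouns = {"he", "she", "it", "they", "him", "her", "them"}
    return [t for t in sentence.split() if t[0].isupper() or t.lower() in pronouns]

def score(mention, cluster):
    # Placeholder compatibility score: character overlap with the cluster summary.
    return len(set(mention.lower()) & set("".join(cluster.summary)))

def resolve_incrementally(sentences, threshold=3):
    clusters = []
    for sentence in sentences:                        # one sentence at a time
        for mention in detect_mentions(sentence):
            scores = [score(mention, c) for c in clusters]
            if scores and max(scores) >= threshold:   # attach to the best cluster
                best = clusters[scores.index(max(scores))]
                best.mentions.append(mention)
                best.summary.add(mention.lower())
            else:                                     # or open a new cluster
                clusters.append(Cluster([mention], {mention.lower()}))
    return clusters

doc = ["Barack Obama visited Paris .", "Obama praised the city ."]
print([c.mentions for c in resolve_incrementally(doc)])
# [['Barack'], ['Obama', 'Obama'], ['Paris']]
```

Because only the cluster summaries are carried across sentences, the per-sentence work stays bounded by the number of clusters rather than by the full document length.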
Related papers
- Autoregressive Large Language Models are Computationally Universal [59.34397993748194]
We show that autoregressive decoding of a transformer-based language model can realize universal computation.
We first show that a universal Turing machine can be simulated by a Lag system with 2027 production rules.
We conclude that, by the Church-Turing thesis, prompted gemini-1.5-pro-001 with extended autoregressive (greedy) decoding is a general purpose computer.
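As a rough illustration of the simulation target mentioned above, the sketch below interprets a lag system, assuming the standard definition in which each step reads the first m symbols of the queue, appends the production associated with that m-tuple, and removes a single symbol. The toy rules here are arbitrary and unrelated to the paper's 2,027-rule construction.

```python
from collections import deque

def run_lag_system(rules, word, m=2, max_steps=20):
    """rules: dict mapping an m-tuple of symbols to the string to append."""
    queue = deque(word)
    for _ in range(max_steps):
        if len(queue) < m:
            break                      # halt when the word is too short
        key = tuple(list(queue)[:m])
        if key not in rules:
            break                      # halt on an undefined m-tuple
        queue.extend(rules[key])       # append the production
        queue.popleft()                # consume one symbol from the front
    return "".join(queue)

# Toy two-symbol example; the rules are invented purely for illustration.
rules = {("a", "b"): "ba", ("b", "a"): "ab", ("a", "a"): "b", ("b", "b"): "a"}
print(run_lag_system(rules, "ab"))
```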
arXiv Detail & Related papers (2024-10-04T06:05:17Z)
- Filling in the Gaps: Efficient Event Coreference Resolution using Graph Autoencoder Networks [0.0]
We introduce a novel and efficient method for Event Coreference Resolution (ECR) applied to a lower-resourced language domain.
By framing ECR as a graph reconstruction task, we are able to combine deep semantic embeddings with structural coreference chain knowledge.
Our method significantly outperforms classical mention-pair methods on a large Dutch event coreference corpus.
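A minimal sketch of the graph-reconstruction framing described above, with assumed shapes and a plain NumPy inner-product decoder standing in for the paper's graph autoencoder: mention embeddings are encoded into a latent space, pairwise coreference links are decoded from latent similarity, and the encoder is trained to reconstruct the gold coreference graph.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mentions, d_in, d_latent = 6, 16, 4

X = rng.normal(size=(n_mentions, d_in))           # event-mention embeddings (random stand-ins)
A = np.eye(n_mentions)                            # gold coreference graph (self-links)
A[0, 3] = A[3, 0] = 1.0                           # say mentions 0 and 3 corefer

W = rng.normal(scale=0.1, size=(d_in, d_latent))  # linear encoder weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(200):                              # tiny full-batch gradient loop
    Z = X @ W                                     # encode mentions into latent space
    A_hat = sigmoid(Z @ Z.T)                      # inner-product decoder: predicted links
    G = (A_hat - A) / A.size                      # gradient of mean BCE w.r.t. the logits
    W -= 0.5 * (X.T @ ((G + G.T) @ Z))            # backprop through Z @ Z.T and X @ W

print(np.round(sigmoid((X @ W) @ (X @ W).T)[0], 2))   # reconstructed link scores for mention 0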
arXiv Detail & Related papers (2023-10-18T13:44:58Z)
- SICNN: Soft Interference Cancellation Inspired Neural Network Equalizers [1.6451639748812472]
We propose a novel neural network (NN)-based approach, referred to as SICNN.
SICNN is designed by deep unfolding a model-based iterative soft interference cancellation (SIC) method.
We compare the bit error ratio performance of the proposed NN-based equalizers with state-of-the-art model-based and NN-based approaches.
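The deep-unfolding idea can be sketched as follows, with a simplified linear channel model and a simplified per-symbol update standing in for the paper's actual design: a fixed number of soft interference cancellation iterations is written out as a stack of layers, and quantities such as the per-layer damping factors become the parameters a network like SICNN would learn.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
H = rng.normal(size=(n, n))              # channel matrix (assumed known)
x = rng.choice([-1.0, 1.0], size=n)      # transmitted BPSK symbols
y = H @ x + 0.05 * rng.normal(size=n)    # received signal

def sic_layer(x_hat, damping):
    """One unfolded soft-interference-cancellation iteration."""
    x_new = x_hat.copy()
    for k in range(n):
        interference = H @ x_hat - H[:, k] * x_hat[k]           # other symbols' contribution
        matched = H[:, k] @ (y - interference) / (H[:, k] @ H[:, k])
        x_new[k] = (1 - damping) * x_hat[k] + damping * np.tanh(2.0 * matched)
    return x_new

dampings = [0.5, 0.7, 0.9]     # one parameter per layer; a trained network would learn these
x_hat = np.zeros(n)
for d in dampings:             # the "unfolded" network: a fixed stack of SIC layers
    x_hat = sic_layer(x_hat, d)

print("detected:", np.sign(x_hat), "transmitted:", x)
```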
arXiv Detail & Related papers (2023-08-24T06:40:54Z)
- SENSEi: Input-Sensitive Compilation for Accelerating GNNs [7.527596018706567]
We propose SENSEi, a system that exposes different sparse and dense matrix primitive compositions based on different matrix re-associations of GNN computations.
SENSEi executes in two stages: (1) an offline compilation stage that enumerates all valid re-associations leading to different sparse-dense matrix compositions and uses input-oblivious pruning techniques to prune away clearly unprofitable candidates.
On a wide range of configurations, SENSEi achieves speedups of up to $2.012\times$ and $1.85\times$ on graph convolutional networks and up to $6.294\times$ and $16.274\times$
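A toy illustration of the input-sensitive re-association idea. The naive FLOP-count cost model and the two-candidate choice below are assumptions made for this sketch, not SENSEi's actual mechanism: a GNN layer A @ X @ W can be evaluated as (A @ X) @ W or A @ (X @ W), and which composition is cheaper depends on the graph's sparsity and the feature widths of the input.

```python
import numpy as np
import scipy.sparse as sp

def pick_reassociation(nnz, n_nodes, f_in, f_out):
    cost_ax_first = 2 * nnz * f_in + 2 * n_nodes * f_in * f_out   # (A @ X) @ W
    cost_xw_first = 2 * n_nodes * f_in * f_out + 2 * nnz * f_out  # A @ (X @ W)
    return "(A @ X) @ W" if cost_ax_first < cost_xw_first else "A @ (X @ W)"

n, f_in, f_out = 1000, 256, 16
A = sp.random(n, n, density=0.01, format="csr")   # sparse adjacency matrix
X = np.random.rand(n, f_in)                       # node features
W = np.random.rand(f_in, f_out)                   # layer weights

choice = pick_reassociation(A.nnz, n, f_in, f_out)
print("cheaper composition:", choice)             # here f_out << f_in favors A @ (X @ W)
out = (A @ X) @ W if choice == "(A @ X) @ W" else A @ (X @ W)
print(out.shape)
```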
arXiv Detail & Related papers (2023-06-27T02:24:05Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
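A skeleton of that self-learning loop, with LOCCO's logical and cycle-consistency scoring of candidate annotations left out: the confidence test below is a crude stand-in, and the toy lexicon "parser" is purely illustrative. The parser being trained annotates unlabeled text, trusted annotations are kept, and they are fed back as additional training data; as the entry notes, the same annotations could also supervise a text generation model.

```python
class ToyParser:
    """Toy stand-in for a semantic parser: a word-to-predicate lexicon."""
    def __init__(self):
        self.lexicon = {}

    def parse(self, text):
        annotation = [self.lexicon.get(w, "UNK") for w in text.split()]
        confidence = sum(p != "UNK" for p in annotation) / len(annotation)
        return annotation, confidence

    def train(self, pairs):
        for text, annotation in pairs:
            for word, pred in zip(text.split(), annotation):
                if pred != "UNK":
                    self.lexicon[word] = pred

def self_training_round(parser, labeled, unlabeled, threshold=0.5):
    parser.train(labeled)                              # warm-start on gold annotations
    pseudo = []
    for text in unlabeled:
        annotation, confidence = parser.parse(text)    # parser labels raw text
        if confidence >= threshold:                    # keep only trusted annotations
            pseudo.append((text, annotation))
    parser.train(pseudo)                               # reuse them as extra training data
    return pseudo

labeled = [("dog barks", ["DOG", "BARK"])]
unlabeled = ["dog runs", "cats sleep"]
print(self_training_round(ToyParser(), labeled, unlabeled))
```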
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Hybrid Rule-Neural Coreference Resolution System based on Actor-Critic Learning [53.73316523766183]
Coreference resolution systems need to tackle two main tasks.
One task is to detect all of the potential mentions; the other is to learn, for each mention, the link to its antecedent.
We propose a hybrid rule-neural coreference resolution system based on actor-critic learning.
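The second of the two tasks, antecedent linking, can be sketched as antecedent ranking: each mention scores every earlier mention plus a dummy "no antecedent" option and links to the highest-scoring choice. The string-overlap scorer below is only a placeholder for the learned scorer the paper trains with actor-critic methods, so its links are illustrative rather than reliable.

```python
def link_antecedents(mentions, score):
    links = {}
    for i, mention in enumerate(mentions):
        candidates = [None] + mentions[:i]                    # dummy + earlier mentions
        scores = [0.0] + [score(mention, a) for a in mentions[:i]]
        links[mention] = candidates[scores.index(max(scores))]
    return links

def toy_score(mention, antecedent):
    # Placeholder: word-overlap score standing in for a learned pairwise scorer.
    return len(set(mention.lower().split()) & set(antecedent.lower().split()))

mentions = ["Barack Obama", "the president", "Obama", "Michelle Obama"]
print(link_antecedents(mentions, toy_score))
```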
arXiv Detail & Related papers (2022-12-20T08:55:47Z)
- UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units [64.61596752343837]
We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and predicts discrete acoustic units.
We enhance the model performance by subword prediction in the first-pass decoder.
We show that the proposed methods boost performance even when predicting a spectrogram in the second pass.
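A data-flow sketch of the two-pass idea described above, with lookup tables standing in for the neural first-pass (subword) and second-pass (discrete acoustic unit) decoders; the token and unit inventories are invented for the example.

```python
# First pass: source tokens to target-language subwords (mocked).
FIRST_PASS = {"hola": ["_he", "llo"], "mundo": ["_wor", "ld"]}
# Second pass: subwords to discrete acoustic units (mocked).
SECOND_PASS = {"_he": [17, 42], "llo": [3], "_wor": [88, 5], "ld": [9]}

def translate_two_pass(source_tokens):
    subwords = [sw for tok in source_tokens for sw in FIRST_PASS.get(tok, [])]
    units = [u for sw in subwords for u in SECOND_PASS.get(sw, [])]
    return subwords, units        # the units would then be vocoded to a waveform

print(translate_two_pass(["hola", "mundo"]))
```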
arXiv Detail & Related papers (2022-12-15T18:58:28Z)
- DisCoDisCo at the DISRPT2021 Shared Task: A System for Discourse Segmentation, Classification, and Connective Detection [4.371388370559826]
Our system, called DisCoDisCo, enhances contextualized word embeddings with hand-crafted features.
Results on relation classification suggest strong performance on the new 2021 benchmark.
A partial evaluation of multiple pre-trained Transformer-based language models indicates that models pre-trained on the Next Sentence Prediction task are optimal for relation classification.
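A minimal sketch of enhancing contextualized embeddings with hand-crafted features: random vectors stand in for the contextual embeddings, and the three features shown (sentence-initial position, token length, capitalization) are assumptions for illustration rather than DisCoDisCo's actual feature set. The concatenated vectors would feed a segmentation or relation classifier.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens = ["However", ",", "the", "results", "improved", "."]
contextual = rng.normal(size=(len(tokens), 8))        # stand-in for BERT-style vectors

def hand_crafted(tokens):
    feats = []
    for i, tok in enumerate(tokens):
        feats.append([
            1.0 if i == 0 else 0.0,                   # sentence-initial position
            float(len(tok)),                          # token length
            1.0 if tok[0].isupper() else 0.0,         # capitalization
        ])
    return np.array(feats)

enhanced = np.concatenate([contextual, hand_crafted(tokens)], axis=1)
print(enhanced.shape)   # (6, 11): contextual dimensions plus three extra features
```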
arXiv Detail & Related papers (2021-09-20T18:11:05Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
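To make the object being marginalized over concrete, the brute-force sketch below enumerates separable permutations by recursively splitting a sequence and emitting the two blocks in either order, then computes a simple marginal under a uniform weighting. The paper's contribution is performing such marginal inference exactly and efficiently with dynamic programming over learned scores; the exhaustive enumeration here is only illustrative.

```python
from itertools import product

def separable_perms(seq):
    """All separable permutations of seq, via recursive splits (exponential)."""
    if len(seq) <= 1:
        return {tuple(seq)}
    perms = set()
    for k in range(1, len(seq)):
        for left, right in product(separable_perms(seq[:k]), separable_perms(seq[k:])):
            perms.add(left + right)    # keep the two blocks in order
            perms.add(right + left)    # or swap the blocks
    return perms

seq = ["a", "b", "c", "d"]
perms = separable_perms(seq)
print(len(perms))    # 22: all orderings of 4 items except the two non-separable ones

# A marginal under a uniform distribution over separable permutations:
# the probability that each item lands in the first output position.
first = {}
for p in perms:
    first[p[0]] = first.get(p[0], 0) + 1
print({item: count / len(perms) for item, count in sorted(first.items())})
```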
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences of its use.