Efficient Seq2seq Coreference Resolution Using Entity Representations
- URL: http://arxiv.org/abs/2510.14504v1
- Date: Thu, 16 Oct 2025 09:50:03 GMT
- Title: Efficient Seq2seq Coreference Resolution Using Entity Representations
- Authors: Matt Grenander, Shay B. Cohen, Mark Steedman,
- Abstract summary: We show that discarding a wide portion of tokens in seq2seq resolvers is a feasible strategy for incremental coreference resolution.<n>Our results indicate that discarding a wide portion of tokens in seq2seq resolvers is a feasible strategy for incremental coreference resolution.
- Score: 28.577787866461758
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Seq2seq coreference models have introduced a new paradigm for coreference resolution by learning to generate text corresponding to coreference labels, without requiring task-specific parameters. While these models achieve new state-of-the-art performance, they do so at the cost of flexibility and efficiency. In particular, they do not efficiently handle incremental settings such as dialogue, where text must processed sequentially. We propose a compressed representation in order to improve the efficiency of these methods in incremental settings. Our method works by extracting and re-organizing entity-level tokens, and discarding the majority of other input tokens. On OntoNotes, our best model achieves just 0.6 CoNLL F1 points below a full-prefix, incremental baseline while achieving a compression ratio of 1.8. On LitBank, where singleton mentions are annotated, it passes state-of-the-art performance. Our results indicate that discarding a wide portion of tokens in seq2seq resolvers is a feasible strategy for incremental coreference resolution.
Related papers
- SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodel LLMs [59.415473779171315]
We propose a novel visual token pruning strategy called textbfSaliency-textbfCoverage textbfOriented token textbfPruning for textbfEfficient MLLMs.
arXiv Detail & Related papers (2025-10-28T09:29:37Z) - Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations [83.93566096400723]
We find that instruction-tuned models retain up to 93.4% of their original performance when given a randomly sampled tokenization.<n>Character-level segmentation improves string manipulation and code understanding tasks by up to +14%.<n>Right-aligned digit grouping enhances large-number arithmetic by +33%.
arXiv Detail & Related papers (2025-06-23T18:02:26Z) - Training-Free Tokenizer Transplantation via Orthogonal Matching Pursuit [45.18582668677648]
We present a training-free method to transplant tokenizers in large language models.<n>We approximate each out-of-vocabulary token as a sparse linear combination of shared tokens.<n>We show that OMP achieves best zero-shot preservation of the base model's performance.
arXiv Detail & Related papers (2025-06-07T00:51:27Z) - Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning [0.0]
Pretrained language models (LLMs) are often constrained by their fixed tokenization schemes.<n> Tokenadapt is a model-agnostic tokenizer transplantation method.<n>Our framework introduces two innovations: first, Tokenadapt, a model-agnostic tokenizer transplantation method, and second, novel pre-tokenization for multi-word Supertokens.
arXiv Detail & Related papers (2025-05-14T19:00:27Z) - COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement [80.18490952057125]
Iterative refinement has emerged as an effective paradigm for enhancing the capabilities of large language models (LLMs) on complex tasks.
We propose Context-Wise Order-Agnostic Language Modeling (COrAL) to overcome these challenges.
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally.
arXiv Detail & Related papers (2024-10-12T23:56:19Z) - Object Recognition as Next Token Prediction [99.40793702627396]
We present an approach to pose object recognition as next token prediction.
The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels.
arXiv Detail & Related papers (2023-12-04T18:58:40Z) - Seq2seq is All You Need for Coreference Resolution [26.551602768015986]
We finetune a pretrained seq2seq transformer to map an input document to a tagged sequence encoding the coreference annotation.
Our model outperforms or closely matches the best coreference systems in the literature on an array of datasets.
arXiv Detail & Related papers (2023-10-20T19:17:22Z) - AWTE-BERT:Attending to Wordpiece Tokenization Explicitly on BERT for
Joint Intent Classification and SlotFilling [5.684659127683238]
BERT (Bidirectional Representations from Transformers) achieves the joint optimization of the two tasks.
We propose a novel joint model based on BERT, which explicitly models the multiple sub-tokens features after wordpiece tokenization.
Experimental results demonstrate that our proposed model achieves significant improvement on intent classification accuracy, slot filling F1, and sentence-level semantic frame accuracy.
arXiv Detail & Related papers (2022-11-27T13:49:19Z) - DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for
Controllable Text Generation [6.844825905212349]
We propose a new CTG approach, namely DisCup, which incorporates the attribute knowledge of discriminator to optimize the control-prompts.
DisCup can achieve a new state-of-the-art control performance while maintaining an efficient and high-quality text generation, only relying on around 10 virtual tokens.
arXiv Detail & Related papers (2022-10-18T02:59:06Z) - Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z) - Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
arXiv Detail & Related papers (2020-04-28T17:17:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.