Cross-document Coreference Resolution over Predicted Mentions
- URL: http://arxiv.org/abs/2106.01210v1
- Date: Wed, 2 Jun 2021 14:56:28 GMT
- Title: Cross-document Coreference Resolution over Predicted Mentions
- Authors: Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan
- Abstract summary: We introduce the first end-to-end model for CD coreference resolution from raw text.
Our model achieves competitive results for event and entity coreference resolution on gold mentions.
- Score: 19.95214898312209
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coreference resolution has been mostly investigated within a single document
scope, showing impressive progress in recent years based on end-to-end models.
However, the more challenging task of cross-document (CD) coreference
resolution has remained relatively under-explored, with the few recent models
applied only to gold mentions. Here, we introduce the first end-to-end model
for CD coreference resolution from raw text, which extends the prominent model
for within-document coreference to the CD setting. Our model achieves
competitive results for event and entity coreference resolution on gold
mentions. More importantly, we set the first baseline results on the standard ECB+
dataset for CD coreference resolution over predicted mentions. Further, our
model is simpler and more efficient than recent CD coreference resolution
systems, while not using any external resources.
Related papers
- Efficient Document Ranking with Learnable Late Interactions [73.41976017860006]
Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval.
To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings.
Recently, late-interaction models have been proposed to realize more favorable latency-quality tradeoffs, by using a DE structure followed by a lightweight scorer.
arXiv Detail & Related papers (2024-06-25T22:50:48Z)
- A New Learning Paradigm for Foundation Model-based Remote Sensing Change Detection [54.01158175996638]
Change detection (CD) is a critical task to observe and analyze dynamic processes of land cover.
We propose a Bi-Temporal Adapter Network (BAN), which is a universal foundation model-based CD adaptation framework.
arXiv Detail & Related papers (2023-12-02T15:57:17Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- Document-Level Relation Extraction with Sentences Importance Estimation and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- On Generalization in Coreference Resolution [66.05112218880907]
We consolidate a set of 8 coreference resolution datasets targeting different domains to evaluate the off-the-shelf performance of models.
We then mix three datasets for training; even though their domain, annotation guidelines, and metadata differ, we propose a method for jointly training a single model.
We find that in a zero-shot setting, models trained on a single dataset transfer poorly while joint training yields improved overall performance.
arXiv Detail & Related papers (2021-09-20T16:33:22Z)
- Sequential Cross-Document Coreference Resolution [14.099694053823765]
Cross-document coreference resolution is important for the growing interest in multi-document analysis tasks.
We propose a new model that extends the efficient sequential prediction paradigm for coreference resolution to cross-document settings.
Our model incrementally composes mentions into cluster representations and predicts links between a mention and the already constructed clusters.
arXiv Detail & Related papers (2021-04-17T00:46:57Z)
- WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia [14.324743524196874]
We present Wikipedia Event Coreference (WEC), an efficient methodology for gathering a large-scale dataset for cross-document event coreference from Wikipedia.
We apply this methodology to the English Wikipedia and extract our large-scale WEC-Eng dataset.
We develop an algorithm that adapts components of state-of-the-art models for within-document coreference resolution to the cross-document setting.
arXiv Detail & Related papers (2021-04-11T14:54:35Z)
- CD2CR: Co-reference Resolution Across Documents and Domains [20.30046972135548]
Cross-document co-reference resolution (CDCR) is the task of identifying and linking mentions to entities and concepts across many text documents.
We propose a new task and English language dataset for cross-document cross-domain co-reference resolution (CD$^2$CR).
We show that in this cross-domain, cross-document setting, existing CDCR models do not perform well and we provide a baseline model that outperforms current state-of-the-art CDCR models on CD$^2$CR.
arXiv Detail & Related papers (2021-01-29T15:18:30Z)
- Streamlining Cross-Document Coreference Resolution: Evaluation and Modeling [25.94435242086499]
Recent evaluation protocols for Cross-document (CD) coreference resolution have often been inconsistent or lenient.
Our primary contribution is proposing a pragmatic evaluation methodology which assumes access to only raw text.
Our model adapts and extends recent neural models for within-document coreference resolution to address the CD coreference setting.
arXiv Detail & Related papers (2020-09-23T10:02:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.