Incorporating Singletons and Mention-based Features in Coreference Resolution via Multi-task Learning for Better Generalization
- URL: http://arxiv.org/abs/2309.11582v1
- Date: Wed, 20 Sep 2023 18:44:24 GMT
- Title: Incorporating Singletons and Mention-based Features in Coreference Resolution via Multi-task Learning for Better Generalization
- Authors: Yilun Zhu, Siyao Peng, Sameer Pradhan, Amir Zeldes
- Abstract summary: This paper presents a coreference model that learns singletons as well as features such as entity type and information status.
This approach achieves new state-of-the-art scores on the OntoGUM benchmark.
- Score: 12.084539012992412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous attempts to incorporate a mention detection step into end-to-end
neural coreference resolution for English have been hampered by the lack of
singleton mention span data as well as other entity information. This paper
presents a coreference model that learns singletons as well as features such as
entity type and information status via a multi-task learning-based approach.
This approach achieves new state-of-the-art scores on the OntoGUM benchmark
(+2.7 points) and increases robustness on multiple out-of-domain datasets (+2.3
points on average), likely due to greater generalizability for mention
detection and utilization of more data from singletons when compared to only
coreferent mention pair matching.
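The multi-task objective described in the abstract can be illustrated with a minimal sketch: a coreference loss is combined with weighted auxiliary losses for the mention-based tasks (singleton detection, entity type, information status). The function name, weights, and values below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a multi-task objective: the total loss is the primary
# coreference loss plus a weighted sum of auxiliary mention-feature losses.
# All names and weights here are hypothetical, for illustration only.

def multitask_loss(coref_loss, aux_losses, weights):
    """Combine a primary loss with weighted auxiliary task losses.

    coref_loss: float, loss of the coreference (antecedent scoring) head
    aux_losses: dict mapping auxiliary task name -> float loss
    weights:    dict mapping auxiliary task name -> float weight
    """
    total = coref_loss
    for task, loss in aux_losses.items():
        total += weights.get(task, 1.0) * loss
    return total

total = multitask_loss(
    coref_loss=2.0,
    aux_losses={"entity_type": 0.8, "info_status": 0.5, "singleton": 0.3},
    weights={"entity_type": 0.5, "info_status": 0.5, "singleton": 1.0},
)
print(total)  # 2.0 + 0.4 + 0.25 + 0.3 = 2.95
```

In practice each auxiliary head would share the mention encoder with the coreference head, so gradients from the singleton and feature tasks also shape the mention representations, which is the intuition behind the improved mention detection.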
Related papers
- SPLICE: A Singleton-Enhanced PipeLIne for Coreference REsolution [11.062090350704617]
Singleton mentions, i.e., entities mentioned only once in a text, are important from a theoretical perspective to how humans understand discourse.
Previous attempts to incorporate their detection in end-to-end neural coreference resolution for English have been hampered by the lack of singleton mention spans in the OntoNotes benchmark.
This paper addresses this limitation by combining predicted mentions from existing nested NER systems and features derived from OntoNotes syntax trees.
arXiv Detail & Related papers (2024-03-25T22:46:16Z) - Anchor Points: Benchmarking Models with Much Fewer Examples [88.02417913161356]
In six popular language classification benchmarks, model confidence in the correct class on many pairs of points is strongly correlated across models.
We propose Anchor Point Selection, a technique to select small subsets of datasets that capture model behavior across the entire dataset.
Just a few anchor points can be used to estimate model per-class predictions on all other points in a dataset with low mean absolute error.
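One way to realize the anchor-point idea sketched above is to pick a small subset of examples whose per-model confidence profiles cover the dataset. The greedy k-center selection below is a hedged sketch of that intuition; the function names, the specific selection criterion, and the toy data are our assumptions, not the paper's exact algorithm.

```python
# Hedged sketch: select k "anchor" examples whose confidence vectors
# (one entry per model) spread out over the dataset, via greedy k-center.

def dist(a, b):
    """Euclidean distance between two confidence vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_anchors(confidences, k):
    """Greedy k-center: repeatedly add the example farthest from all
    anchors chosen so far, starting from the first example."""
    anchors = [0]
    while len(anchors) < k:
        best_i, best_d = None, -1.0
        for i in range(len(confidences)):
            if i in anchors:
                continue
            d = min(dist(confidences[i], confidences[a]) for a in anchors)
            if d > best_d:
                best_i, best_d = i, d
        anchors.append(best_i)
    return anchors

points = [[0.9, 0.8], [0.1, 0.2], [0.85, 0.75], [0.15, 0.25], [0.5, 0.5]]
print(select_anchors(points, 2))  # [0, 1]
```

Because confidences on the correlated examples are redundant, predictions on the remaining points can then be approximated from the nearest anchor's behavior.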
arXiv Detail & Related papers (2023-09-14T17:45:51Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - Mention Annotations Alone Enable Efficient Domain Adaptation for Coreference Resolution [8.08448832546021]
We show that annotating mentions alone is nearly twice as fast as annotating full coreference chains.
Our approach facilitates annotation-efficient transfer and results in a 7-14% improvement in average F1 without increasing annotator time.
arXiv Detail & Related papers (2022-10-14T07:57:27Z) - Instance-Level Relative Saliency Ranking with Graph Reasoning [126.09138829920627]
We present a novel unified model to segment salient instances and infer relative saliency rank order.
A novel loss function is also proposed to effectively train the saliency ranking branch.
Experimental results demonstrate that our proposed model is more effective than previous methods.
arXiv Detail & Related papers (2021-07-08T13:10:42Z) - Adaptive Prototypical Networks with Label Words and Joint Representation Learning for Few-Shot Relation Classification [17.237331828747006]
This work focuses on few-shot relation classification (FSRC).
We propose an adaptive mixture mechanism to add label words to the representation of the class prototype.
Experiments have been conducted on FewRel under different few-shot (FS) settings.
arXiv Detail & Related papers (2021-01-10T11:25:42Z) - Coarse-to-Fine Memory Matching for Joint Retrieval and Classification [0.7081604594416339]
We present a novel end-to-end language model for joint retrieval and classification.
We evaluate it on the standard blind test set of the FEVER fact verification dataset.
We extend exemplar auditing to this setting for analyzing and constraining the model.
arXiv Detail & Related papers (2020-11-29T05:06:03Z) - Active Learning for Coreference Resolution using Discrete Annotation [76.36423696634584]
We improve upon pairwise annotation for active learning in coreference resolution.
We ask annotators to identify mention antecedents if a presented mention pair is deemed not coreferent.
In experiments with existing benchmark coreference datasets, we show that the signal from this additional question leads to significant performance gains per human-annotation hour.
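The discrete-annotation query described above can be sketched as follows: when an annotator rejects a presented mention pair, the follow-up question ("what is the true antecedent?") yields an extra positive link from the same annotation step. The function names and the oracle dictionary are illustrative assumptions, not the paper's interface.

```python
# Sketch of one discrete-annotation step: a rejected pair also yields
# the mention's true antecedent, doubling the signal per question.
# `oracle_links` stands in for the human annotator's judgment.

def annotate(pair, oracle_links):
    """Return (mention, candidate, is_coreferent) labels from one step."""
    mention, candidate = pair
    antecedent = oracle_links.get(mention)
    if antecedent == candidate:
        return [(mention, candidate, True)]
    labels = [(mention, candidate, False)]
    # Pair rejected: additionally record the antecedent the annotator names.
    if antecedent is not None:
        labels.append((mention, antecedent, True))
    return labels

links = {"she": "Mary"}
print(annotate(("she", "John"), links))
# [('she', 'John', False), ('she', 'Mary', True)]
```

The per-hour gains reported in the summary come from this extra positive link: a single rejected pair produces both a negative and a positive training signal.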
arXiv Detail & Related papers (2020-04-28T17:17:11Z) - Selecting Relevant Features from a Multi-domain Representation for Few-shot Classification [91.67977602992657]
We propose a new strategy based on feature selection, which is both simpler and more effective than previous feature adaptation approaches.
We show that a simple non-parametric classifier built on top of such features produces high accuracy and generalizes to domains never seen during training.
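The "simple non-parametric classifier on selected features" mentioned above can be illustrated with a nearest-class-centroid rule applied after masking out feature dimensions. The selection mask, data, and function names below are toy assumptions for illustration, not the paper's method.

```python
# Hedged sketch: select a subset of feature dimensions with a boolean mask,
# then classify a query by its nearest class centroid in the reduced space.

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def select(v, mask):
    """Keep only the feature dimensions where mask is True."""
    return [x for x, keep in zip(v, mask) if keep]

def nearest_centroid(query, class_examples, mask):
    q = select(query, mask)
    best, best_d = None, float("inf")
    for label, examples in class_examples.items():
        c = centroid([select(v, mask) for v in examples])
        d = sum((a - b) ** 2 for a, b in zip(q, c))
        if d < best_d:
            best, best_d = label, d
    return best

data = {"cat": [[1.0, 0.0, 9.0], [0.9, 0.1, -9.0]],
        "dog": [[0.0, 1.0, 9.0], [0.1, 0.9, -9.0]]}
mask = [True, True, False]  # drop a noisy dimension
print(nearest_centroid([0.95, 0.05, 8.0], data, mask))  # cat
```

Dropping the noisy third dimension is what makes the toy query separable here, mirroring the summary's claim that feature selection alone can be simpler and more effective than feature adaptation.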
arXiv Detail & Related papers (2020-03-20T15:44:17Z) - Pairwise Similarity Knowledge Transfer for Weakly Supervised Object Localization [53.99850033746663]
We study the problem of learning localization model on target classes with weakly supervised image labels.
In this work, we argue that learning only an objectness function is a weak form of knowledge transfer.
Experiments on the COCO and ILSVRC 2013 detection datasets show that the performance of the localization model improves significantly with the inclusion of pairwise similarity function.
arXiv Detail & Related papers (2020-03-18T17:53:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.