Enhancing Label Consistency on Document-level Named Entity Recognition
- URL: http://arxiv.org/abs/2210.12949v1
- Date: Mon, 24 Oct 2022 04:45:17 GMT
- Title: Enhancing Label Consistency on Document-level Named Entity Recognition
- Authors: Minbyul Jeong, Jaewoo Kang
- Abstract summary: Named entity recognition (NER) is a fundamental part of extracting information from documents in biomedical applications.
We present our method, ConNER, which enhances the label dependency of modifiers (e.g., adjectives and prepositions) to achieve higher label agreement.
The effectiveness of our method is demonstrated on four popular biomedical NER datasets.
- Score: 19.249781091058605
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Named entity recognition (NER) is a fundamental part of extracting
information from documents in biomedical applications. A notable advantage of
NER is its consistency in extracting biomedical entities in a document context.
Although existing document NER models show consistent predictions, they still
do not meet our expectations. We investigated whether the adjectives and
prepositions within an entity cause a low label consistency, which results in
inconsistent predictions. In this paper, we present our method, ConNER, which
enhances the label dependency of modifiers (e.g., adjectives and prepositions)
to achieve higher label agreement. ConNER refines the draft labels of the
modifiers to improve the output representations of biomedical entities. The
effectiveness of our method is demonstrated on four popular biomedical NER
datasets; in particular, its efficacy is most evident on two datasets, with
7.5-8.6% absolute improvements in the F1 score. We interpret that our ConNER method is
effective on datasets that have intrinsically low label consistency. In the
qualitative analysis, we demonstrate how our approach makes the NER model
generate consistent predictions. Our code and resources are available at
https://github.com/dmis-lab/ConNER/.
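As a rough illustration of the core idea only (not the paper's actual model), the toy sketch below shows how draft BIO labels of modifier tokens could be refined to agree with an adjacent entity span. The function name, tag scheme, and POS filter here are illustrative assumptions, not part of ConNER's implementation.

```python
def refine_modifier_labels(draft_tags, pos_tags, modifier_pos=("ADJ", "ADP")):
    """Toy refinement of draft BIO labels: if a modifier token (adjective or
    preposition) is labeled O but the very next token opens an entity (B-X),
    pull the modifier into that entity span so the labels agree across the
    whole mention."""
    tags = list(draft_tags)
    for i in range(len(tags) - 1):
        nxt = tags[i + 1]
        if tags[i] == "O" and pos_tags[i] in modifier_pos and nxt.startswith("B-"):
            etype = nxt[2:]
            tags[i] = "B-" + etype      # the modifier now opens the span
            tags[i + 1] = "I-" + etype  # the former start becomes a continuation
    return tags

# "chronic" is an adjective that belongs to the disease mention:
draft = ["O", "B-Disease", "I-Disease"]
pos = ["ADJ", "NOUN", "NOUN"]
print(refine_modifier_labels(draft, pos))
# -> ['B-Disease', 'I-Disease', 'I-Disease']
```

In this sketch, refining the adjective's label makes the predicted span cover the full mention "chronic kidney disease", which is the kind of label agreement the abstract describes.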
Related papers
- A Unified Label-Aware Contrastive Learning Framework for Few-Shot Named Entity Recognition [6.468625143772815]
We propose a unified label-aware token-level contrastive learning framework.
Our approach enriches the context by utilizing label semantics as suffix prompts.
It simultaneously optimizes context-native and context-label contrastive learning objectives.
arXiv Detail & Related papers (2024-04-26T06:19:21Z)
- Uncertainty Estimation on Sequential Labeling via Uncertainty Transmission [21.426225910784364]
NER tasks aim to extract entities and predict their labels given a text.
This work focuses on UE-NER, which aims to estimate uncertainty scores for the NER predictions.
We propose a Sequential Labeling Posterior Network (SLPN) to estimate uncertainty scores for the extracted entities.
arXiv Detail & Related papers (2023-11-15T06:36:29Z)
- Injecting Categorical Labels and Syntactic Information into Biomedical NER [28.91836510067532]
We present a simple approach to improve biomedical named entity recognition (NER) by injecting categorical labels and Part-of-speech (POS) information into the model.
Experiments on three benchmark datasets show that incorporating categorical label information with syntactic context is quite useful and outperforms baseline BERT-based models.
arXiv Detail & Related papers (2023-11-06T14:03:59Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition [74.79785063365289]
Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets.
We propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.
arXiv Detail & Related papers (2023-05-21T15:31:23Z)
- How to tackle an emerging topic? Combining strong and weak labels for Covid news NER [90.90053968189156]
We introduce a novel COVID-19 news NER dataset (COVIDNEWS-NER).
We release 3,000 hand-annotated, strongly labelled sentences and 13,000 auto-generated, weakly labelled sentences.
We show the effectiveness of CONTROSTER on COVIDNEWS-NER while providing analysis on combining weak and strong labels for training.
arXiv Detail & Related papers (2022-09-29T21:33:02Z)
- Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
- A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning [111.05365744744437]
Unsupervised contrastive learning labels crops of the same image as positives, and other image crops as negatives.
In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination.
Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning.
arXiv Detail & Related papers (2021-06-28T14:24:52Z)
- Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data [37.980010197914105]
Weak supervision has shown promising results in many natural language processing tasks, such as Named Entity Recognition (NER).
We propose a new multi-stage computational framework -- NEEDLE with three essential ingredients: weak label completion, noise-aware loss function, and final fine-tuning over the strongly labeled data.
We demonstrate that NEEDLE can effectively suppress the noise of the weak labels and outperforms existing methods.
arXiv Detail & Related papers (2021-06-16T17:18:14Z)
- Exploiting Global Contextual Information for Document-level Named Entity Recognition [46.99922251839363]
We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, to appropriately model context beyond a single sentence, we employ a cross-sentence module.
Our model reaches F1 score of 92.22 (93.40 with BERT) on CoNLL 2003 dataset and 88.32 (90.49 with BERT) on Ontonotes 5.0 dataset.
arXiv Detail & Related papers (2021-06-02T01:52:07Z)
- Validating Label Consistency in NER Data Annotation [34.24378200299595]
In this work, we present an empirical method to explore the relationship between label (in-)consistency and NER model performance.
In experiments, our method identified the label inconsistency of test data in SCIERC and CoNLL03 datasets.
arXiv Detail & Related papers (2021-01-21T16:19:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.