Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning
- URL: http://arxiv.org/abs/2208.14565v1
- Date: Tue, 30 Aug 2022 23:19:04 GMT
- Authors: Sheng Zhang, Hao Cheng, Jianfeng Gao, Hoifung Poon
- Abstract summary: We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
- Score: 80.36076044023581
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an efficient bi-encoder framework for named entity recognition
(NER), which applies contrastive learning to map candidate text spans and
entity types into the same vector representation space. Prior work
predominantly approaches NER as sequence labeling or span classification. We
instead frame NER as a metric learning problem that maximizes the similarity
between the vector representations of an entity mention and its type. This
makes it easy to handle nested and flat NER alike, and can better leverage
noisy self-supervision signals. A major challenge to this bi-encoder
formulation for NER lies in separating non-entity spans from entity mentions.
Instead of explicitly labeling all non-entity spans as the same class Outside
(O) as in most prior methods, we introduce a novel dynamic thresholding loss,
which is learned in conjunction with the standard contrastive loss. Experiments
show that our method performs well in both supervised and distantly supervised
settings, for nested and flat NER alike, establishing new state of the art
across standard datasets in the general domain (e.g., ACE2004, ACE2005) and
high-value verticals such as biomedicine (e.g., GENIA, NCBI, BC5CDR, JNLPBA).
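The decision rule described in the abstract can be illustrated concretely: each candidate span is embedded, scored against every entity-type embedding, and labeled as a non-entity when no similarity clears a threshold (in the paper, the threshold is learned jointly with the contrastive loss via the dynamic thresholding objective). A minimal pure-Python sketch of the inference-time rule, with illustrative names and a fixed threshold standing in for the learned, dynamic one:

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors given as plain lists.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-9)

def classify_span(span_vec, type_vecs, threshold):
    """Return the index of the best-matching entity type for a span
    embedding, or None (non-entity) if no similarity beats the threshold."""
    sims = [cosine(span_vec, t) for t in type_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best if sims[best] > threshold else None

# Toy example: 2 entity types in a 4-dim embedding space.
types = [[1.0, 0.0, 0.0, 0.0],   # e.g. type 0
         [0.0, 1.0, 0.0, 0.0]]   # e.g. type 1
entity_span = [0.9, 0.1, 0.0, 0.0]      # close to type 0
nonentity_span = [0.0, 0.0, 1.0, 1.0]   # far from both types

print(classify_span(entity_span, types, threshold=0.5))     # prints 0
print(classify_span(nonentity_span, types, threshold=0.5))  # prints None
```

In the actual model, span and type encoders are trained so that mention embeddings cluster near their type embeddings, and the rejection threshold is learned rather than a hand-set constant.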
Related papers
- Uncovering Prototypical Knowledge for Weakly Open-Vocabulary Semantic Segmentation [59.37587762543934]
This paper studies the problem of weakly open-vocabulary semantic segmentation (WOVSS).
Existing methods suffer from a granularity inconsistency regarding the usage of group tokens.
We propose the prototypical guidance network (PGSeg) that incorporates multi-modal regularization.
arXiv Detail & Related papers (2023-10-29T13:18:00Z)
- MProto: Multi-Prototype Network with Denoised Optimal Transport for Distantly Supervised Named Entity Recognition [75.87566793111066]
We propose a noise-robust prototype network named MProto for the DS-NER task.
MProto represents each entity type with multiple prototypes to characterize the intra-class variance.
To mitigate the noise from incomplete labeling, we propose a novel denoised optimal transport (DOT) algorithm.
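The multi-prototype idea can be sketched in a few lines: each type keeps several prototype vectors, and a span is assigned to the type whose nearest prototype is closest. A hypothetical pure-Python illustration (the names and toy vectors are invented for this sketch, not taken from MProto):

```python
import math

def euclidean(a, b):
    # Euclidean distance between two vectors given as plain lists.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_prototype_type(span_vec, prototypes):
    """prototypes: dict mapping type name -> list of prototype vectors.
    Returns the type whose closest prototype is nearest to the span."""
    best_type, best_dist = None, float("inf")
    for type_name, protos in prototypes.items():
        dist = min(euclidean(span_vec, p) for p in protos)
        if dist < best_dist:
            best_type, best_dist = type_name, dist
    return best_type

# Toy example: one type with two prototypes (intra-class variance),
# one type with a single prototype.
protos = {
    "GENE": [[1.0, 0.0], [0.8, 0.2]],
    "DISEASE": [[0.0, 1.0]],
}
print(nearest_prototype_type([0.9, 0.1], protos))  # prints GENE
```

Multiple prototypes per type let spans of the same type form several clusters instead of being forced toward a single centroid.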
arXiv Detail & Related papers (2023-10-12T13:02:34Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract entity mentions from text and classify them into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- Gaussian Prior Reinforcement Learning for Nested Named Entity Recognition [52.46740830977898]
We propose a novel seq2seq model named GPRL, which formulates the nested NER task as an entity triplet sequence generation process.
Experiments on three nested NER datasets demonstrate that GPRL outperforms previous nested NER models.
arXiv Detail & Related papers (2023-05-12T05:55:34Z)
- Nested Named Entity Recognition as Latent Lexicalized Constituency Parsing [29.705133932275892]
Recently, Fu et al. (2021) adapted a span-based constituency parser to tackle nested NER.
In this work, we resort to more expressive structures, lexicalized constituency trees in which constituents are annotated by headwords.
We leverage the Eisner-Satta algorithm to perform partial marginalization and inference efficiently.
arXiv Detail & Related papers (2022-03-09T12:02:59Z)
- Bottom-Up Constituency Parsing and Nested Named Entity Recognition with Pointer Networks [24.337440797369702]
Constituency parsing and nested named entity recognition (NER) are typical nested structured prediction tasks.
We propose a novel global pointing mechanism for bottom-up parsing with pointer networks that handles both tasks and requires only a linear number of steps to parse.
Our method obtains state-of-the-art performance among all BERT-based models on PTB (96.01 F1) and competitive performance on CTB7 in constituency parsing.
arXiv Detail & Related papers (2021-10-11T17:01:43Z)
- BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via Prior Boundary Knowledge [1.5149438988761574]
We propose a joint entity mention detection and typing model via prior boundary knowledge (BoningKnife) to better handle nested NER extraction and recognition tasks.
BoningKnife consists of two modules, MentionTagger and TypeClassifier.
Experiments over different datasets show that our approach outperforms previous state-of-the-art methods, achieving 86.41, 85.46, and 94.2 F1 on ACE2004, ACE2005, and NNE, respectively.
arXiv Detail & Related papers (2021-07-20T11:44:36Z)
- A Sequence-to-Set Network for Nested Named Entity Recognition [38.05786148160635]
We propose a novel sequence-to-set neural network for nested NER.
We use a non-autoregressive decoder to predict the final set of entities in one pass.
Experimental results show that our proposed model achieves state-of-the-art results on three nested NER corpora.
arXiv Detail & Related papers (2021-05-19T03:10:04Z)
- Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition [9.809157050048375]
We propose a two-stage entity identifier for named entity recognition.
First, we generate span proposals by filtering and boundary regression on the seed spans to locate the entities, and then label the boundary-adjusted span proposals with the corresponding categories.
Our method effectively utilizes the boundary information of entities and partially matched spans during training.
arXiv Detail & Related papers (2021-05-14T12:52:34Z)
- Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reductions on cross-device and cross-environment ASR, respectively.
arXiv Detail & Related papers (2021-04-15T14:36:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.