NERDA-Con: Extending NER models for Continual Learning -- Integrating
Distinct Tasks and Updating Distribution Shifts
- URL: http://arxiv.org/abs/2206.14607v1
- Date: Tue, 28 Jun 2022 03:22:55 GMT
- Title: NERDA-Con: Extending NER models for Continual Learning -- Integrating
Distinct Tasks and Updating Distribution Shifts
- Authors: Supriti Vijay and Aman Priyanshu
- Abstract summary: We propose NERDA-Con, a pipeline for training NER models with Large Language Model (LLM) bases.
Because we believe our work can be applied in continual learning and NER pipelines, we open-source our code and provide a fine-tuning library of the same name, NERDA-Con.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With increasing applications in areas such as biomedical
information extraction pipelines and social media analytics, Named Entity
Recognition (NER) has become an indispensable tool for knowledge extraction.
However, with the gradual shift in language structure and vocabulary, NER
models suffer from distribution shift, rendering them obsolete or markedly
less effective unless they are re-trained. Re-training LLM-based NER models
from scratch on newly acquired data is economically disadvantageous, while
re-training only on the newly acquired data causes catastrophic forgetting
of previously acquired knowledge. We therefore propose NERDA-Con, a pipeline
for training NER models with LLM bases that incorporates Elastic Weight
Consolidation (EWC) into the NERDA fine-tuning pipeline. Because we believe
our work can be applied in continual learning and NER pipelines, we
open-source our code and provide a fine-tuning library of the same name,
NERDA-Con, at https://github.com/SupritiVijay/NERDA-Con and
https://pypi.org/project/NERDA-Con/.
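To make the penalty concrete, here is a minimal PyTorch-style sketch of EWC-regularized fine-tuning. It is illustrative only and not the actual NERDA-Con API: the helper names (fisher_diag, ewc_penalty) are hypothetical, and the diagonal Fisher estimate follows the standard EWC formulation rather than this paper's code.

    import torch

    def fisher_diag(model, old_task_loader, loss_fn):
        # Diagonal Fisher information estimated on the old task's data:
        # the average squared gradient of the loss for each parameter.
        fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
        for inputs, labels in old_task_loader:
            model.zero_grad()
            loss_fn(model(inputs), labels).backward()
            for n, p in model.named_parameters():
                if p.grad is not None:
                    fisher[n] += p.grad.detach() ** 2
        return {n: f / max(len(old_task_loader), 1) for n, f in fisher.items()}

    def ewc_penalty(model, old_params, fisher, lam=1000.0):
        # Quadratic penalty anchoring parameters that mattered for the
        # previously learned task; lam trades plasticity for stability.
        penalty = 0.0
        for n, p in model.named_parameters():
            penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
        return (lam / 2.0) * penalty

    # When fine-tuning on the newly acquired data, the objective becomes:
    #   total_loss = ner_loss + ewc_penalty(model, old_params, fisher)

The penalty keeps weights with high Fisher information close to their old-task values, which is what counteracts the catastrophic forgetting described above.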
Related papers
- WhisperNER: Unified Open Named Entity and Speech Recognition [15.535663273628147]
We introduce WhisperNER, a novel model that allows joint speech transcription and entity recognition.
WhisperNER supports open-type NER, enabling recognition of diverse and evolving entities at inference.
Our experiments demonstrate that WhisperNER outperforms natural baselines on both out-of-domain open-type NER and supervised fine-tuning.
arXiv Detail & Related papers (2024-09-12T15:00:56Z)
- NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data [41.94295877935867]
We show how to create NuNER, a compact language representation model specialized in the Named Entity Recognition task.
NuNER can be fine-tuned to solve downstream NER problems in a data-efficient way.
We find that the size and entity-type diversity of the pre-training dataset are key to achieving good performance.
arXiv Detail & Related papers (2024-02-23T14:23:51Z)
- In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
- Gaussian Prior Reinforcement Learning for Nested Named Entity Recognition [52.46740830977898]
We propose a novel seq2seq model named GPRL, which formulates the nested NER task as an entity triplet sequence generation process.
Experiments on three nested NER datasets demonstrate that GPRL outperforms previous nested NER models.
arXiv Detail & Related papers (2023-05-12T05:55:34Z)
- T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition [9.928025283928282]
T-NER is a Python library for fine-tuning language models on NER.
We show the potential of the library by compiling nine public NER datasets into a unified format.
To facilitate future research, we also release all our LM checkpoints via the Hugging Face model hub.
arXiv Detail & Related papers (2022-09-09T15:00:38Z)
- Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge in this bi-encoder formulation of NER lies in separating non-entity spans from entity mentions; a sketch of the scoring step follows this entry.
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
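A minimal sketch of the scoring step described above, assuming the span and type vectors come from the two encoders; the function name and temperature value are illustrative, not the paper's code:

    import torch
    import torch.nn.functional as F

    def score_spans(span_vecs, type_vecs, temperature=0.07):
        # Cosine similarity between candidate-span and entity-type embeddings.
        span_vecs = F.normalize(span_vecs, dim=-1)    # (num_spans, d)
        type_vecs = F.normalize(type_vecs, dim=-1)    # (num_types, d)
        return span_vecs @ type_vecs.T / temperature  # (num_spans, num_types)

    # Toy usage: 8 candidate spans, 4 entity types plus an "outside" type
    # for non-entity spans (the hard case the paper highlights).
    spans, types = torch.randn(8, 128), torch.randn(5, 128)
    loss = F.cross_entropy(score_spans(spans, types),
                           torch.randint(0, 5, (8,)))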
- Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models all the nested NEs in a sentence as a holistic structure, then proposes a holistic structure parsing algorithm to disclose all of them at once.
Experiments show that our model yields promising results on widely used benchmarks, approaching or even achieving the state of the art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z)
- $k$NN-NER: Named Entity Recognition with Nearest Neighbor Search [47.901071142524906]
The $k$ nearest neighbor NER ($k$NN-NER) framework augments the distribution of entity labels with the labels of $k$ nearest neighbors retrieved from the training set.
$k$NN-NER requires no additional operation during the training phase, and by interpolating the $k$ nearest neighbors search into the vanilla NER model, it consistently outperforms its vanilla counterparts; a sketch of the interpolation step follows this entry.
arXiv Detail & Related papers (2022-03-31T15:21:43Z)
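A minimal NumPy sketch of the interpolation step referenced above; the function and parameter names are illustrative, not the authors' code:

    import numpy as np

    def knn_interpolate(token_vec, store_vecs, store_labels, model_probs,
                        num_labels, k=32, lam=0.3):
        # Retrieve the k training tokens closest to this token's vector.
        dists = np.linalg.norm(store_vecs - token_vec, axis=1)
        nearest = np.argsort(dists)[:k]
        # Turn neighbor distances into a soft distribution over labels.
        weights = np.exp(-dists[nearest])
        knn_probs = np.zeros(num_labels)
        for idx, w in zip(nearest, weights):
            knn_probs[store_labels[idx]] += w
        knn_probs /= knn_probs.sum()
        # Interpolation happens at inference only; training is untouched.
        return lam * knn_probs + (1.0 - lam) * model_probs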
- ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition [16.43239147870092]
We propose an Adversarial Trained LSTM-CNN (ASTRAL) system that improves current NER methods in both model structure and training process; the adversarial-training step is sketched after this entry.
Our system is evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving state-of-the-art results.
arXiv Detail & Related papers (2020-09-02T13:15:25Z)
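The adversarial-training step referenced above, sketched in the generic FGM style of perturbing embeddings along the loss gradient; this is an assumption about the general technique, not ASTRAL's exact scheme, and inputs_embeds is an assumed model keyword:

    import torch

    def adversarial_loss(model, embeds, labels, loss_fn, epsilon=1.0):
        # Clean pass; gradients are taken w.r.t. the input embeddings.
        embeds = embeds.detach().requires_grad_(True)
        clean_loss = loss_fn(model(inputs_embeds=embeds), labels)
        grad = torch.autograd.grad(clean_loss, embeds, retain_graph=True)[0]
        # Perturb embeddings along the gradient direction, then train on
        # the clean and adversarial batches jointly.
        delta = epsilon * grad / (grad.norm() + 1e-12)
        return clean_loss + loss_fn(model(inputs_embeds=embeds + delta), labels)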
- BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision [49.42215511723874]
We propose a new computational framework -- BOND -- to improve the prediction performance of NER models.
Specifically, we propose a two-stage training algorithm: In the first stage, we adapt the pre-trained language model to the NER tasks using the distant labels.
In the second stage, we drop the distant labels and propose a self-training approach to further improve the model performance; a schematic of the two stages follows this entry.
arXiv Detail & Related papers (2020-06-28T04:55:39Z)
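A schematic of BOND's two stages; finetune and pseudo_label are hypothetical callables standing in for the training and high-confidence-prediction routines, and the threshold and round count are illustrative:

    from typing import Callable, Sequence

    def bond_train(model, texts: Sequence[str], distant_labels,
                   finetune: Callable, pseudo_label: Callable,
                   rounds: int = 5):
        # Stage 1: adapt the pre-trained LM to NER with distant (noisy)
        # labels; early stopping limits how much noise is absorbed.
        model = finetune(model, texts, distant_labels, early_stop=True)

        # Stage 2: drop the distant labels and self-train. The teacher's
        # high-confidence predictions become pseudo-labels for a student,
        # which is promoted to teacher in the next round.
        teacher = model
        for _ in range(rounds):
            pseudo = pseudo_label(teacher, texts, threshold=0.9)
            teacher = finetune(model, texts, pseudo)
        return teacher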