NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data
- URL: http://arxiv.org/abs/2402.15343v1
- Date: Fri, 23 Feb 2024 14:23:51 GMT
- Title: NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data
- Authors: Sergei Bogdanov, Alexandre Constantin, Timothée Bernard, Benoît Crabbé, Etienne Bernard
- Abstract summary: We show how to create NuNER, a compact language representation model specialized in the Named Entity Recognition task.
NuNER can be fine-tuned to solve downstream NER problems in a data-efficient way.
We find that the size and entity-type diversity of the pre-training dataset are key to achieving good performance.
- Score: 41.94295877935867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown impressive abilities in data
annotation, opening the way for new approaches to solve classic NLP problems.
In this paper, we show how to use LLMs to create NuNER, a compact language
representation model specialized in the Named Entity Recognition (NER) task.
NuNER can be fine-tuned to solve downstream NER problems in a data-efficient
way, outperforming similar-sized foundation models in the few-shot regime and
competing with much larger LLMs. We find that the size and entity-type
diversity of the pre-training dataset are key to achieving good performance. We
view NuNER as a member of the broader family of task-specific foundation
models, recently unlocked by LLMs.
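As a concrete illustration of the fine-tuning workflow the abstract describes, here is a minimal sketch using Hugging Face transformers to fine-tune a compact NER-specialized encoder for token classification; the checkpoint name and label set are illustrative assumptions, not necessarily the paper's released artifacts.

```python
# Minimal sketch: fine-tuning a compact NER-specialized encoder as a
# token classifier. The checkpoint name below is an assumption; substitute
# whatever NuNER-style weights you actually have.
from transformers import AutoModelForTokenClassification, AutoTokenizer

MODEL_NAME = "numind/NuNER-v1.0"  # hypothetical checkpoint name
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG"]  # illustrative label set

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
)

# In the few-shot regime the training set may contain only a handful of
# annotated sentences; the pre-trained entity representations do the rest.
enc = tokenizer("NuMind is based in Boston", return_tensors="pt")
logits = model(**enc).logits  # shape: (1, seq_len, num_labels)
```

From here, a standard Trainer loop over the few annotated examples completes the fine-tuning.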
Related papers
- GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models [7.714969840571947]
We introduce the task of generation-based extraction and in-context classification (GEIC).
We then propose CascadeNER, a universal and multilingual GEIC framework for few-shot and zero-shot NER.
We also introduce AnythingNER, the first NER dataset specifically designed for Large Language Models (LLMs).
arXiv Detail & Related papers (2024-09-17T09:32:12Z)
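To make the GEIC decomposition above concrete, here is a minimal two-stage sketch: one LLM call extracts candidate spans, a second classifies each span in context. The `llm` function is a hypothetical placeholder, not CascadeNER's actual interface.

```python
# Minimal sketch of a two-stage GEIC-style pipeline: an LLM first *extracts*
# candidate entity spans, then *classifies* each span in context.
# `llm` is a placeholder for any text-completion call (an assumption).

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def geic(text: str, types: list[str]) -> list[tuple[str, str]]:
    # Stage 1: generation-based extraction of candidate spans.
    spans = llm(
        f"List every named entity in the text, one per line.\nText: {text}"
    ).splitlines()
    # Stage 2: in-context classification of each extracted span.
    results = []
    for span in spans:
        label = llm(
            f"Text: {text}\nEntity: {span}\n"
            f"Choose one type from {types}. Answer with the type only."
        ).strip()
        results.append((span, label))
    return results
```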
- GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks [0.0]
We introduce a new kind of GLiNER model that can be used for various information extraction tasks while being a small encoder model.
Our model achieved SoTA performance on zero-shot NER benchmarks and leading performance on question-answering, summarization and relation extraction tasks.
arXiv Detail & Related papers (2024-06-14T13:54:29Z)
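A minimal usage sketch of the zero-shot pattern this entry describes, assuming the open-source `gliner` package; the checkpoint name is an assumption and may differ from the paper's release.

```python
# Zero-shot NER with a small encoder model: entity types are supplied at
# inference time, not at training time. Checkpoint name is an assumption.
from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_multi-v2.1")  # assumed checkpoint
text = "Sergei Bogdanov and Etienne Bernard released NuNER in 2024."
labels = ["person", "model", "date"]

for ent in model.predict_entities(text, labels, threshold=0.5):
    print(ent["text"], "->", ent["label"])
```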
- VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition [3.4923338594757674]
Large language models (LLMs) can be used to train a model capable of extracting various types of entities.
In this paper, we utilize the open-source LLM LLaMA2 as the backbone model, and design specific instructions to distinguish between different types of entities and datasets.
Our model, VANER, trained with a small partition of parameters, significantly outperforms previous LLM-based models and is the first LLM-based model to surpass the majority of conventional state-of-the-art BioNER systems.
arXiv Detail & Related papers (2024-04-27T09:00:39Z)
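A minimal sketch of the parameter-efficient recipe this entry suggests: a LLaMA-2 backbone with LoRA adapters (so only a small partition of parameters trains), prompted with dataset- and entity-type-specific instructions. The hyperparameters, prompt wording, and model access are assumptions, not VANER's published configuration.

```python
# LoRA adapters on a causal-LM backbone: only the adapter weights train,
# a small fraction of the full model. Hyperparameters are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # a fraction of a percent of the backbone

# Instructions name the dataset and entity types so one model can serve
# many BioNER corpora (prompt wording is a hypothetical example).
instruction = (
    "Dataset: BC5CDR. Extract all entities of type Chemical and Disease "
    "from the sentence below.\nSentence: Aspirin reduced the patient's fever."
)
```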
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
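A minimal sketch of the step-wise, self-reflective generation strategy described above: generate domain sentences, annotate them, then have the LLM check its own annotations. `llm` is again a hypothetical placeholder, not ProgGen's actual interface.

```python
# Step-wise NER dataset generation with a self-reflection pass.
# `llm` is a placeholder for any text-completion call (an assumption).

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def generate_ner_examples(domain: str, types: list[str], n: int = 5) -> list[str]:
    # Step 1: generate plausible in-domain sentences.
    sentences = llm(
        f"Write {n} realistic sentences about {domain}, one per line."
    ).splitlines()
    examples = []
    for s in sentences:
        # Step 2: annotate the sentence with the target entity types.
        annotated = llm(f"Annotate entities of types {types} in: {s}")
        # Step 3: self-reflection, ask the model to verify its own labels.
        verified = llm(f"Check these annotations and fix any errors:\n{annotated}")
        examples.append(verified)
    return examples
```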
- LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty [12.32180790849948]
Named Entity Recognition serves as a fundamental task in natural language understanding.
Fine-tuned NER models exhibit satisfactory performance on standard NER benchmarks.
However, due to limited fine-tuning data and a lack of knowledge, they perform poorly on unseen entity recognition.
arXiv Detail & Related papers (2024-02-16T11:02:29Z)
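The linking idea above reduces to a confidence-based router: keep the local model's confident predictions and defer uncertain spans to an LLM. A minimal sketch, with both models left as placeholder callables (assumptions):

```python
# Uncertainty-based routing between a local NER model and an LLM fallback.
from typing import Callable

def link_ner(
    spans: list[tuple[str, str, float]],          # (span, label, confidence)
    ask_llm: Callable[[str], str],                # LLM fallback classifier
    threshold: float = 0.7,                       # illustrative cutoff
) -> list[tuple[str, str]]:
    final = []
    for span, label, conf in spans:
        if conf >= threshold:
            final.append((span, label))           # trust the local model
        else:
            final.append((span, ask_llm(span)))   # defer to the LLM
    return final
```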
- In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In the EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
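The retrieval step of the approach above can be sketched as embedding-based demonstration selection. The EnDe retriever trains its embeddings contrastively for semantic, boundary, and label similarity; the sketch below substitutes a generic sentence encoder, so it illustrates only the retrieval mechanics, not the paper's trained model.

```python
# Select ICL demonstrations by embedding similarity to the query.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in encoder

def select_demonstrations(query: str, pool: list[str], k: int = 4) -> list[str]:
    q = encoder.encode([query], normalize_embeddings=True)
    d = encoder.encode(pool, normalize_embeddings=True)
    scores = (d @ q.T).ravel()                     # cosine similarity
    return [pool[i] for i in np.argsort(-scores)[:k]]
```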
- LLM Augmented LLMs: Expanding Capabilities through Composition [56.40953749310957]
CALM -- Composition to Augment Language Models -- introduces cross-attention between models to compose their representations and enable new capabilities.
We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English.
When PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks.
arXiv Detail & Related papers (2024-01-04T18:53:01Z)
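A minimal PyTorch sketch of the composition idea above: a learned cross-attention block lets a frozen anchor model attend to a frozen augmenting model's hidden states. The dimensions and residual wiring are illustrative assumptions, not PaLM2-S specifics.

```python
# Cross-attention composition between two frozen backbones: only this
# block's parameters would be trained. Dimensions are illustrative.
import torch
import torch.nn as nn

class CompositionBlock(nn.Module):
    def __init__(self, d_anchor: int = 1024, d_aug: int = 512, n_heads: int = 8):
        super().__init__()
        self.proj = nn.Linear(d_aug, d_anchor)    # map augmenting dim to anchor dim
        self.xattn = nn.MultiheadAttention(d_anchor, n_heads, batch_first=True)

    def forward(self, h_anchor: torch.Tensor, h_aug: torch.Tensor) -> torch.Tensor:
        kv = self.proj(h_aug)                     # (B, T_aug, d_anchor)
        out, _ = self.xattn(h_anchor, kv, kv)     # anchor queries, augmenting keys/values
        return h_anchor + out                     # residual composition

h_a = torch.randn(2, 16, 1024)                    # anchor hidden states
h_b = torch.randn(2, 16, 512)                     # augmenting hidden states
fused = CompositionBlock()(h_a, h_b)
```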
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
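The self-play mechanism above can be written as a DPO-style objective: prefer human-written responses over the previous iterate's own generations. A minimal sketch of such a loss, assuming summed token log-probabilities are already computed; the paper's exact formulation may differ in details.

```python
# SPIN-style self-play loss: the current model is rewarded for assigning
# higher (reference-relative) likelihood to real data than to responses
# generated by its own previous iterate. beta is illustrative.
import torch
import torch.nn.functional as F

def spin_loss(
    logp_real: torch.Tensor,      # log p_theta(y_real | x)
    logp_real_ref: torch.Tensor,  # log p_prev(y_real | x)
    logp_gen: torch.Tensor,       # log p_theta(y_gen | x), y_gen from previous iterate
    logp_gen_ref: torch.Tensor,   # log p_prev(y_gen | x)
    beta: float = 0.1,
) -> torch.Tensor:
    margin = beta * ((logp_real - logp_real_ref) - (logp_gen - logp_gen_ref))
    return -F.logsigmoid(margin).mean()
```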
- Learning In-context Learning for Named Entity Recognition [54.022036267886214]
Named entity recognition in real-world applications suffers from the diversity of entity types, the emergence of new entity types, and the lack of high-quality annotations.
This paper proposes an in-context learning-based NER approach that effectively injects in-context NER ability into PLMs.
We show that our method significantly outperforms the PLMs+fine-tuning counterparts.
arXiv Detail & Related papers (2023-05-18T15:31:34Z)
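A minimal sketch of the generation reformulation above: ask the model to rewrite the sentence with entities of the queried type wrapped in special markers (the paper uses @@ and ##), then recover spans with a regex. `llm` is a hypothetical placeholder.

```python
# GPT-NER-style reformulation: sequence labeling as marked-text generation.
import re

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def gpt_ner(text: str, entity_type: str) -> list[str]:
    prompt = (
        f"Mark every {entity_type} entity in the sentence by wrapping it "
        f"with @@ and ##, and output the rewritten sentence.\n"
        f"Sentence: {text}"
    )
    marked = llm(prompt)
    return re.findall(r"@@(.+?)##", marked)  # recover the marked spans
```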
- GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in low-resource and few-shot setups, where the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.