Retrieval-Enhanced Named Entity Recognition
- URL: http://arxiv.org/abs/2410.13118v1
- Date: Thu, 17 Oct 2024 01:12:48 GMT
- Title: Retrieval-Enhanced Named Entity Recognition
- Authors: Enzo Shiraishi, Raphael Y. de Camargo, Henrique L. P. Silva, Ronaldo C. Prati
- Abstract summary: RENER is a technique for named entity recognition with autoregressive language models that combines In-Context Learning with information retrieval.
Experimental results show that RENER achieves state-of-the-art performance on the CrossNER collection.
- Score: 1.2187048691454239
- License:
- Abstract: When combined with In-Context Learning, a technique that enables models to adapt to new tasks by incorporating task-specific examples or demonstrations directly within the input prompt, autoregressive language models have achieved good performance in a wide range of tasks and applications. However, this combination has not been properly explored in the context of named entity recognition, where the structure of this task poses unique challenges. We propose RENER (Retrieval-Enhanced Named Entity Recognition), a technique for named entity recognition using autoregressive language models based on In-Context Learning and information retrieval techniques. When presented with an input text, RENER fetches similar examples from a dataset of training examples that are used to enhance a language model to recognize named entities from this input text. RENER is modular and independent of the underlying language model and information retrieval algorithms. Experimental results show that in the CrossNER collection we achieve state-of-the-art performance with the proposed technique and that information retrieval can increase the F-score by up to 11 percentage points.
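The abstract describes a retrieve-then-prompt loop without fixing a particular retriever or language model. Below is a minimal sketch of that loop, assuming a sentence-transformers encoder for retrieval and leaving the final generation call abstract; the model name, example data, and prompt format are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of RENER's retrieve-then-prompt loop (illustrative, not the
# authors' code). Assumes a sentence-transformers encoder for retrieval; the
# final generate() call to the autoregressive LM is left abstract.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical retriever choice

train_texts = [
    "Bob Dylan released Blonde on Blonde in 1966.",
    "CERN operates the Large Hadron Collider near Geneva.",
]
train_labels = [
    "Bob Dylan -> musicalartist; Blonde on Blonde -> album",
    "CERN -> organisation; Large Hadron Collider -> facility; Geneva -> location",
]
train_emb = encoder.encode(train_texts, normalize_embeddings=True)

def build_prompt(input_text: str, k: int = 2) -> str:
    """Fetch the k most similar training examples and prepend them as demonstrations."""
    q = encoder.encode([input_text], normalize_embeddings=True)
    scores = (train_emb @ q.T).ravel()       # cosine similarity of unit vectors
    demos = "\n".join(
        f"Text: {train_texts[i]}\nEntities: {train_labels[i]}"
        for i in np.argsort(-scores)[:k]
    )
    return f"{demos}\nText: {input_text}\nEntities:"

# The returned prompt would be passed to any autoregressive LM's generate API.
print(build_prompt("Miles Davis recorded Kind of Blue."))
```

Because the retriever and the language model interact only through the prompt string, either component can be swapped independently, which is the modularity the abstract claims.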
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
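A hedged sketch of the summary's idea follows: candidate prompts are ranked by the log-likelihood the model assigns to the question under each prompt. GPT-2 is used purely as a small stand-in model, and the exact scoring and selection procedure are assumptions based on the summary, not the paper's method.

```python
# Sketch: rank candidate prompts by the log-likelihood the LM assigns to the
# question under each prompt (an assumed reading of the summary, not the
# paper's exact procedure). GPT-2 is used only as a small stand-in model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def question_loglik(prompt: str, question: str) -> float:
    """Sum of log-probs of the question tokens given the prompt.

    Assumes the prompt's tokenization is a prefix of the joint tokenization,
    which holds for typical BPE tokenizers when the prompt ends in whitespace.
    """
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + question, return_tensors="pt").input_ids
    with torch.no_grad():
        logprobs = lm(full_ids).logits.log_softmax(-1)
    # The log-prob of token t is read from the model's output at position t-1.
    targets = full_ids[0, 1:]
    preds = logprobs[0, :-1].gather(1, targets[:, None]).squeeze(1)
    return preds[prompt_len - 1 :].sum().item()

prompts = ["Answer using the passage.\n", "You are a careful annotator.\n"]
question = "Which named entities appear in the sentence?"
best = max(prompts, key=lambda p: question_loglik(p, question))
```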
arXiv Detail & Related papers (2024-11-12T13:14:09Z) - Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models [0.0]
Few-Shot Prompting or in-context learning enables models to recognize entities with minimal examples.
We assess state-of-the-art models like GPT-4 in NER tasks, comparing their few-shot performance to fully supervised benchmarks.
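Comparisons like this are conventionally scored with entity-level precision, recall, and F1 over exact (span, type) matches; the following is a generic scorer for that standard metric, not code from the paper.

```python
# Entity-level F1 over (span, type) tuples -- the standard NER score such
# comparisons rely on (generic metric, not code from the paper).
def entity_f1(gold: set, pred: set) -> float:
    """gold/pred: sets of (start, end, type) tuples; exact-match scoring."""
    if not gold or not pred:
        return 0.0
    tp = len(gold & pred)
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 0.0 if tp == 0 else 2 * precision * recall / (precision + recall)

gold = {(0, 2, "PER"), (5, 7, "ORG")}
pred = {(0, 2, "PER"), (5, 6, "ORG")}   # boundary error on the second entity
print(entity_f1(gold, pred))             # 0.5
```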
arXiv Detail & Related papers (2024-08-28T13:42:28Z) - Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
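The Approximate Entity Set OverlaP metric rewards partial matches between predicted and gold entity sets rather than requiring exact equality. The sketch below illustrates that general idea with a greedy one-to-one matching under a simple string-similarity function; the paper's actual matching and similarity definitions differ.

```python
# Illustrative approximate entity-set overlap: greedy one-to-one matching of
# predicted to gold entities by string similarity. The paper's AESOP metric
# uses its own matching/similarity definitions; this only shows the idea.
from difflib import SequenceMatcher

def approx_overlap(gold: list[str], pred: list[str]) -> float:
    sims = sorted(
        ((SequenceMatcher(None, g, p).ratio(), g, p) for g in gold for p in pred),
        reverse=True,
    )
    used_g, used_p, total = set(), set(), 0.0
    for s, g, p in sims:                 # greedily pair the most similar entities
        if g not in used_g and p not in used_p:
            used_g.add(g); used_p.add(p); total += s
    return total / max(len(gold), len(pred))  # unmatched entities score 0

print(approx_overlap(["Barack Obama", "Hawaii"], ["Obama", "Hawaii", "USA"]))
```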
arXiv Detail & Related papers (2024-02-06T22:15:09Z) - In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
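A hedged sketch of demonstration selection in this style: each candidate is scored by a weighted combination of semantic, boundary, and label similarity to the query. The contrastive training that would produce the three representation spaces is omitted, and the weights and random vectors are placeholders.

```python
# Sketch of EnDe-style demonstration selection: score candidates by a weighted
# mix of semantic, boundary, and label similarity. The three encoders would be
# trained with contrastive learning (omitted); weights here are hypothetical.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score(query_views, cand_views, weights=(0.5, 0.3, 0.2)):
    """Each *_views: dict with 'semantic', 'boundary', 'label' vectors."""
    keys = ("semantic", "boundary", "label")
    return sum(w * cosine(query_views[k], cand_views[k]) for w, k in zip(weights, keys))

rng = np.random.default_rng(0)
query = {k: rng.normal(size=8) for k in ("semantic", "boundary", "label")}
cands = [{k: rng.normal(size=8) for k in ("semantic", "boundary", "label")} for _ in range(5)]
best = max(cands, key=lambda c: score(query, c))   # pick the top demonstration
```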
arXiv Detail & Related papers (2024-02-02T06:57:53Z) - Informed Named Entity Recognition Decoding for Generative Language Models [3.5323691899538128]
We propose Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process.
We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results.
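The summary does not spell out the decoding scheme. The sketch below shows the generic constrained-decoding pattern that "informed" generation typically relies on, masking the logits at each step so only an allowed token set can be produced; this illustrates the pattern, not iNERD's exact procedure.

```python
# Generic constrained-decoding pattern (not iNERD's exact scheme): at each
# step, mask the LM's logits so only tokens from an allowed set -- e.g. tokens
# copied from the input plus entity-tag markup -- can be generated.
import torch

def constrained_step(logits: torch.Tensor, allowed_ids: list) -> int:
    """Pick the highest-scoring token among the allowed ids."""
    mask = torch.full_like(logits, float("-inf"))
    mask[allowed_ids] = 0.0
    return int((logits + mask).argmax())

vocab_size = 50
logits = torch.randn(vocab_size)           # stand-in for one decoding step
allowed = [3, 17, 42]                      # e.g. input tokens + tag tokens
print(constrained_step(logits, allowed))   # always one of 3, 17, 42
```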
arXiv Detail & Related papers (2023-08-15T14:16:29Z) - Disambiguation of Company names via Deep Recurrent Networks [101.90357454833845]
We propose a Siamese LSTM Network approach to extract -- via supervised learning -- an embedding of company name strings.
We analyse how an Active Learning approach to prioritise the samples to be labelled leads to a more efficient overall learning pipeline.
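A minimal PyTorch sketch of the Siamese setup: one shared LSTM encodes both name strings, and cosine similarity between the two embeddings scores whether they refer to the same company. Dimensions, the character vocabulary, and the pooling choice are placeholders rather than the paper's configuration.

```python
# Minimal Siamese LSTM sketch (PyTorch): a single shared encoder embeds both
# company-name strings; cosine similarity scores the pair. Sizes and the
# character vocabulary are placeholders, not the paper's configuration.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, vocab_size=128, emb_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)

    def encode(self, char_ids):            # char_ids: (batch, seq_len)
        _, (h, _) = self.lstm(self.emb(char_ids))
        return h[-1]                        # final hidden state as the embedding

    def forward(self, a, b):
        return nn.functional.cosine_similarity(self.encode(a), self.encode(b))

def to_ids(name, max_len=24):
    ids = [min(ord(c), 127) for c in name[:max_len].lower()]
    return torch.tensor(ids + [0] * (max_len - len(ids))).unsqueeze(0)

model = SiameseLSTM()
print(model(to_ids("Int. Business Machines"), to_ids("IBM Corp.")))
```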
arXiv Detail & Related papers (2023-03-07T15:07:57Z) - Dynamic Named Entity Recognition [5.9401550252715865]
We introduce a new task: Dynamic Named Entity Recognition (DNER).
DNER provides a framework to better evaluate the ability of algorithms to extract entities by exploiting the context.
We evaluate baseline models and present experiments reflecting issues and research axes related to this novel task.
arXiv Detail & Related papers (2023-02-16T15:50:02Z) - Nested Named Entity Recognition from Medical Texts: An Adaptive Shared Network Architecture with Attentive CRF [53.55504611255664]
We propose a novel method, referred to as ASAC, to solve the dilemma caused by the nested phenomenon.
The proposed method contains two key modules: the adaptive shared (AS) part and the attentive conditional random field (ACRF) module.
Our model could learn better entity representations by capturing the implicit distinctions and relationships between different categories of entities.
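The summary names the two modules without detailing their internals. One plausible reading of the adaptive shared part is a learned gate that mixes shared and task-specific token features, sketched below; the attentive CRF decoding layer is deliberately omitted.

```python
# Hedged sketch of an adaptive-shared module: a learned gate mixes shared and
# task-specific token features. One plausible reading of the summary; the
# paper's AS/ACRF internals (including CRF decoding) are not reproduced here.
import torch
import torch.nn as nn

class AdaptiveShared(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, shared, private):     # both: (batch, seq_len, dim)
        g = torch.sigmoid(self.gate(torch.cat([shared, private], dim=-1)))
        return g * shared + (1 - g) * private  # per-token adaptive mixture

x_shared = torch.randn(2, 10, 64)           # features from the shared encoder
x_private = torch.randn(2, 10, 64)          # features from a task-specific encoder
fused = AdaptiveShared()(x_shared, x_private)
# In the full model, `fused` would feed a CRF layer for tag decoding.
```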
arXiv Detail & Related papers (2022-11-09T09:23:56Z) - Improving Named Entity Recognition by External Context Retrieving and Cooperative Learning [40.39647963185329]
We find external contexts of a sentence by retrieving and selecting a set of semantically relevant texts through a search engine.
We find empirically that the contextual representations computed on the retrieval-based input view can achieve significantly improved performance.
Experiments show that our approach can achieve new state-of-the-art performance on 8 NER data sets across 5 domains.
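A sketch of the two-view idea in the summary: tag distributions are computed from the sentence alone and from the sentence plus retrieved context, then pulled toward agreement by a divergence term. The KL formulation and the stubbed-out retrieval and tagger are assumptions for illustration, not the paper's code.

```python
# Sketch of the two-view setup: tag distributions from the sentence alone and
# from the sentence plus retrieved context, pulled together by a KL term.
# The KL choice and the stubbed retrieval/tagger are illustrative assumptions.
import torch
import torch.nn.functional as F

def retrieve_context(sentence: str) -> str:
    return "(top search-engine snippets would go here)"   # stub for retrieval

def tag_logits(text: str, seq_len=8, num_tags=5) -> torch.Tensor:
    torch.manual_seed(len(text))            # stand-in for an encoder + tagger
    return torch.randn(seq_len, num_tags)

sent = "Apple opened a new office in Berlin."
plain = tag_logits(sent)
augmented = tag_logits(sent + " " + retrieve_context(sent))

# Cooperative term: make the context-free view agree with the retrieval view.
coop_loss = F.kl_div(
    F.log_softmax(plain, dim=-1), F.softmax(augmented, dim=-1),
    reduction="batchmean",
)
print(coop_loss)
```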
arXiv Detail & Related papers (2021-05-08T09:45:21Z) - Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 combinations of encoder architectures and linguistic features, trained on two datasets.
We find that biases induced by the architecture and by the inclusion of linguistic features are clearly reflected in probing task performance.
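Probing classifiers of this kind are typically simple linear models trained on frozen representations; the generic scikit-learn example below shows that setup, with synthetic data standing in for the paper's 14 tasks and 40+ encoder configurations.

```python
# Generic probing-classifier setup: a simple linear model trained on frozen
# representations to test whether a linguistic property is linearly decodable.
# (Illustrative; the paper's tasks and encoders are not reproduced.)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
reps = rng.normal(size=(200, 64))           # frozen sentence representations
labels = (reps[:, 0] > 0).astype(int)       # stand-in linguistic property

X_tr, X_te, y_tr, y_te = train_test_split(reps, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))
```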
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.