LinkNER: Linking Local Named Entity Recognition Models to Large Language
Models using Uncertainty
- URL: http://arxiv.org/abs/2402.10573v2
- Date: Tue, 27 Feb 2024 15:08:02 GMT
- Title: LinkNER: Linking Local Named Entity Recognition Models to Large Language
Models using Uncertainty
- Authors: Zhen Zhang, Yuhua Zhao, Hang Gao, and Mengting Hu
- Abstract summary: Named Entity Recognition serves as a fundamental task in natural language understanding.
Fine-tuned NER models exhibit satisfactory performance on standard NER benchmarks.
However, due to limited fine-tuning data and a lack of knowledge, they perform poorly on unseen entity recognition.
- Score: 12.32180790849948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Named Entity Recognition (NER) serves as a fundamental task in natural
language understanding, bearing direct implications for web content analysis,
search engines, and information retrieval systems. Fine-tuned NER models
exhibit satisfactory performance on standard NER benchmarks. However, due to
limited fine-tuning data and a lack of knowledge, they perform poorly on unseen
entity recognition. As a result, the usability and reliability of NER models in
web-related applications are compromised. Instead, Large Language Models (LLMs)
like GPT-4 possess extensive external knowledge, but research indicates that
they lack specialty for NER tasks. Furthermore, non-public and large-scale
weights make tuning LLMs difficult. To address these challenges, we propose
LinkNER, a framework that combines small fine-tuned models with LLMs through an
uncertainty-based linking strategy called RDC, which enables fine-tuned models
to complement black-box LLMs and achieve better performance. We experiment with
both standard NER test sets and noisy social media datasets. LinkNER enhances
NER task performance, notably surpassing SOTA models in robustness tests. We
also quantitatively analyze the influence of key components like uncertainty
estimation methods, LLMs, and in-context learning on diverse NER tasks,
offering specific web-related recommendations.
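The linking idea described in the abstract can be sketched as an uncertainty-gated router: a local fine-tuned model labels each span, and only spans with high predictive uncertainty are deferred to the LLM. The function names, data, and threshold below are illustrative assumptions, not the paper's actual RDC implementation:

```python
# Illustrative sketch of uncertainty-based linking (not the authors' RDC
# code): a local fine-tuned NER model labels each span and reports an
# uncertainty score; only uncertain spans are deferred to a black-box LLM.

def link_ner(spans, local_predict, llm_classify, threshold=0.5):
    """Return (span, label) pairs, trusting the local model when its
    uncertainty is below `threshold` and falling back to the LLM otherwise."""
    results = []
    for span in spans:
        label, uncertainty = local_predict(span)
        if uncertainty > threshold:
            label = llm_classify(span)  # expensive black-box fallback
        results.append((span, label))
    return results

# Toy stand-ins: the local model is confident about "Paris" but unsure
# about "Apple", so only "Apple" is routed to the LLM.
local = {"Paris": ("LOC", 0.1), "Apple": ("MISC", 0.9)}
routed = link_ner(["Paris", "Apple"],
                  local_predict=lambda s: local[s],
                  llm_classify=lambda s: "ORG")
```

In practice the threshold would be tuned on a validation set to trade off LLM calls against accuracy.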
Related papers
- DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models [1.747623282473278]
Deep learning models operate as opaque 'black boxes' with limited transparency in their decision-making processes.
This study addresses the pressing need for interpretability in AI systems, emphasizing its role in fostering trust, ensuring accountability, and promoting responsible deployment in mission-critical fields.
We introduce DLBacktrace, an innovative technique developed by the AryaXAI team to illuminate model decisions across a wide array of domains.
arXiv Detail & Related papers (2024-11-19T16:54:30Z)
- Neurosymbolic AI approach to Attribution in Large Language Models [5.3454230926797734]
Neurosymbolic AI (NesyAI) combines the strengths of neural networks with structured symbolic reasoning.
This paper explores how NesyAI frameworks can enhance existing attribution models, offering more reliable, interpretable, and adaptable systems.
arXiv Detail & Related papers (2024-09-30T02:20:36Z)
- GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models [7.714969840571947]
We introduce the task of generation-based extraction and in-context classification (GEIC).
We then propose CascadeNER, a universal and multilingual GEIC framework for few-shot and zero-shot NER.
We also introduce AnythingNER, the first NER dataset specifically designed for Large Language Models (LLMs).
arXiv Detail & Related papers (2024-09-17T09:32:12Z)
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data [41.94295877935867]
We show how to create NuNER, a compact language representation model specialized in the Named Entity Recognition task.
NuNER can be fine-tuned to solve downstream NER problems in a data-efficient way.
We find that the size and entity-type diversity of the pre-training dataset are key to achieving good performance.
arXiv Detail & Related papers (2024-02-23T14:23:51Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- E-NER: Evidential Deep Learning for Trustworthy Named Entity Recognition [69.87816981427858]
Most named entity recognition (NER) systems focus on improving model performance, ignoring the need to quantify model uncertainty.
Evidential deep learning (EDL) has recently been proposed as a promising solution to explicitly model predictive uncertainty for classification tasks.
We propose a trustworthy NER framework named E-NER by introducing two uncertainty-guided loss terms to the conventional EDL, along with a series of uncertainty-guided training strategies.
arXiv Detail & Related papers (2023-05-29T02:36:16Z)
- A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition [74.79785063365289]
Existing models for named entity recognition (NER) are mainly based on large-scale labeled datasets.
We propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.
arXiv Detail & Related papers (2023-05-21T15:31:23Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset and the FashionMNIST vs MNIST dataset.
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
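Several of the entries above (E-NER in particular) quantify uncertainty with evidential deep learning. As background, the standard EDL formulation for a K-way classifier maps non-negative evidence to a Dirichlet distribution and reads off a vacuity-style uncertainty; the sketch below shows that textbook computation, not E-NER's specific loss terms:

```python
# Standard evidential deep learning (EDL) uncertainty for K classes:
# evidence e_k >= 0 parameterises a Dirichlet with alpha_k = e_k + 1,
# belief masses are b_k = e_k / S, and vacuity is u = K / S, where
# S = sum(alpha). Uncertainty is high exactly when total evidence is low.

def edl_uncertainty(evidence):
    K = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    S = sum(alpha)
    beliefs = [e / S for e in evidence]
    u = K / S  # vacuity: 1 with no evidence, -> 0 as evidence grows
    return beliefs, u

# No evidence at all -> maximal uncertainty.
_, u_none = edl_uncertainty([0.0, 0.0, 0.0])
# Strong evidence for one class -> low uncertainty.
_, u_strong = edl_uncertainty([20.0, 0.0, 0.0])
```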
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.