Large Language Models for Few-Shot Named Entity Recognition
- URL: http://arxiv.org/abs/1810.06818v3
- Date: Wed, 29 Oct 2025 14:50:15 GMT
- Title: Large Language Models for Few-Shot Named Entity Recognition
- Authors: Yufei Zhao, Xiaoshi Zhong, Erik Cambria, Jagath C. Rajapakse,
- Abstract summary: GPT4NER constructs effective prompts using three key components: entity definition, few-shot examples, and chain-of-thought.<n>We conduct experiments on two benchmark datasets, CoNLL2003 and OntoNotes5.0, and compare the performance of GPT4NER to representative state-of-the-art models.
- Score: 42.753496136556286
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition (NER) is a fundamental task in numerous downstream applications. Recently, researchers have employed pre-trained language models (PLMs) and large language models (LLMs) to address this task. However, fully leveraging the capabilities of PLMs and LLMs with minimal human effort remains challenging. In this paper, we propose GPT4NER, a method that prompts LLMs to resolve the few-shot NER task. GPT4NER constructs effective prompts using three key components: entity definition, few-shot examples, and chain-of-thought. By prompting LLMs with these effective prompts, GPT4NER transforms few-shot NER, which is traditionally considered as a sequence-labeling problem, into a sequence-generation problem. We conduct experiments on two benchmark datasets, CoNLL2003 and OntoNotes5.0, and compare the performance of GPT4NER to representative state-of-the-art models in both few-shot and fully supervised settings. Experimental results demonstrate that GPT4NER achieves the $F_1$ of 83.15\% on CoNLL2003 and 70.37\% on OntoNotes5.0, significantly outperforming few-shot baselines by an average margin of 7 points. Compared to fully-supervised baselines, GPT4NER achieves 87.9\% of their best performance on CoNLL2003 and 76.4\% of their best performance on OntoNotes5.0. We also utilize a relaxed-match metric for evaluation and report performance in the sub-task of named entity extraction (NEE), and experiments demonstrate their usefulness to help better understand model behaviors in the NER task.
Related papers
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark [62.58869921806019]
We propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset.
We design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6.
Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-base baseline.
arXiv Detail & Related papers (2024-11-23T08:06:06Z) - Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study [4.680391123850371]
We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth.
We then performed NER tasks for the 6 entity types using ChatGPT (GPT-3.5 and GPT-4) and 4 state-of-the-art BERT-based question-answering (QA) models.
arXiv Detail & Related papers (2024-08-24T06:59:55Z) - FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios [0.5106912532044251]
We introduce FsPONER, a novel approach for optimizing few-shot prompts, and evaluate its performance on domain-specific NER datasets.<n>FsPONER consists of three few-shot selection methods based on random sampling, TF-IDF, and a combination of both.<n>In the considered real-world scenarios with data scarcity, FsPONER with TF-IDF surpasses fine-tuned models by approximately 10% in F1 score.
arXiv Detail & Related papers (2024-07-10T20:32:50Z) - ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition [18.884124657093405]
We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules.
ELLEN achieves very strong performance on the CoNLL-2003 dataset.
In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data.
arXiv Detail & Related papers (2024-03-26T05:11:51Z) - Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance [11.595274304409937]
Large language models (LLMs) have revolutionized zero-shot task performance.
Current methods using trigger phrases such as "Let's think step by step" remain limited.
This study introduces PRomPTed, an approach that optimize the zero-shot prompts for individual task instances.
arXiv Detail & Related papers (2023-10-03T14:51:34Z) - UniversalNER: Targeted Distillation from Large Language Models for Open
Named Entity Recognition [48.977866466971655]
We show how ChatGPT can be distilled into much smaller UniversalNER models for open NER.
We assemble the largest NER benchmark to date, comprising 43 datasets across 9 diverse domains.
With a tiny fraction of parameters, UniversalNER not only acquires ChatGPT's capability in recognizing arbitrary entity types, but also outperforms its NER accuracy by 7-9 absolute F1 points in average.
arXiv Detail & Related papers (2023-08-07T03:39:52Z) - PromptNER: A Prompting Method for Few-shot Named Entity Recognition via
k Nearest Neighbor Search [56.81939214465558]
We propose PromptNER: a novel prompting method for few-shot NER via k nearest neighbor search.
We use prompts that contains entity category information to construct label prototypes, which enables our model to fine-tune with only the support set.
Our approach achieves excellent transfer learning ability, and extensive experiments on the Few-NERD and CrossNER datasets demonstrate that our model achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2023-05-20T15:47:59Z) - GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups, when the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z) - Reframing Instructional Prompts to GPTk's Language [72.69833640335519]
We propose reframing techniques for model designers to create effective prompts for language models.
Our results show that reframing improves few-shot learning performance by 14% while reducing sample complexity.
The performance gains are particularly important on large language models, such as GPT3 where tuning models or prompts on large datasets is not feasible.
arXiv Detail & Related papers (2021-09-16T09:44:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.