Label-Guided In-Context Learning for Named Entity Recognition
- URL: http://arxiv.org/abs/2505.23722v1
- Date: Thu, 29 May 2025 17:54:32 GMT
- Title: Label-Guided In-Context Learning for Named Entity Recognition
- Authors: Fan Bai, Hamid Hassanzadeh, Ardavan Saeedi, Mark Dredze
- Abstract summary: In-context learning (ICL) enables large language models to perform new tasks using only a few demonstrations. We introduce DEER, a new method that leverages training labels through token-level statistics to improve ICL performance.
- Score: 14.63059248497416
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) enables large language models (LLMs) to perform new tasks using only a few demonstrations. In Named Entity Recognition (NER), demonstrations are typically selected based on semantic similarity to the test instance, ignoring training labels and resulting in suboptimal performance. We introduce DEER, a new method that leverages training labels through token-level statistics to improve ICL performance. DEER first enhances example selection with a label-guided, token-based retriever that prioritizes tokens most informative for entity recognition. It then prompts the LLM to revisit error-prone tokens, which are also identified using label statistics, and make targeted corrections. Evaluated on five NER datasets using four different LLMs, DEER consistently outperforms existing ICL methods and approaches the performance of supervised fine-tuning. Further analysis shows its effectiveness on both seen and unseen entities and its robustness in low-resource settings.
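As a rough, hedged illustration of the retrieval step the abstract describes, the sketch below weights tokens by a label statistic estimated from training data (here, the smoothed probability that a token occurs inside an entity span) and scores candidate demonstrations by weighted token overlap with the test instance. The function names and the particular statistic are our assumptions, not the paper's exact formulation.

```python
from collections import Counter

def token_label_weights(train_sentences, train_tags, smoothing=1.0):
    """Estimate how informative each token is for entity recognition from
    token-level label statistics: the smoothed probability that the token
    occurs inside an entity span (tag != "O") in the training data."""
    entity_counts, total_counts = Counter(), Counter()
    for tokens, tags in zip(train_sentences, train_tags):
        for tok, tag in zip(tokens, tags):
            total_counts[tok] += 1
            if tag != "O":
                entity_counts[tok] += 1
    return {tok: (entity_counts[tok] + smoothing) / (total_counts[tok] + 2 * smoothing)
            for tok in total_counts}

def select_demonstrations(test_tokens, candidates, weights, k=4, default=0.05):
    """Rank (tokens, tags) training pairs by label-guided token overlap
    with the test instance and keep the top k as ICL demonstrations."""
    def score(cand_tokens):
        return sum(weights.get(t, default) for t in set(test_tokens) & set(cand_tokens))
    return sorted(candidates, key=lambda c: score(c[0]), reverse=True)[:k]
```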
Related papers
- MAPLE: Many-Shot Adaptive Pseudo-Labeling for In-Context Learning [53.02571749383208]
In-Context Learning (ICL) empowers Large Language Models (LLMs) to tackle diverse tasks by incorporating multiple input-output examples. Many-Shot Adaptive Pseudo-LabEling (MAPLE) is a novel influence-based many-shot ICL framework that utilizes pseudo-labeled samples to compensate for the lack of label information.
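A minimal sketch of the pseudo-labeling step, assuming a hypothetical LLM wrapper `llm_label(text, pool)` that annotates an unlabeled text conditioned on the labeled pool; MAPLE's influence-based sample selection is omitted here.

```python
def expand_demo_pool(unlabeled_texts, labeled_pool, llm_label, budget=100):
    """Pseudo-label unlabeled texts with the LLM and merge them into the
    demonstration pool to compensate for the lack of label information."""
    pseudo = [(text, llm_label(text, labeled_pool))
              for text in unlabeled_texts[:budget]]
    return labeled_pool + pseudo
```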
arXiv Detail & Related papers (2025-05-22T04:54:27Z)
- CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition [3.695767900907561]
CLLMFS is a Contrastive Learning enhanced Large Language Model framework for Few-Shot Named Entity Recognition.
It integrates Low-Rank Adaptation (LoRA) and contrastive learning mechanisms specifically tailored for few-shot NER.
Our method achieves state-of-the-art results, with F1-score improvements ranging from 2.58% to 97.74% over existing best-performing methods.
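As a hedged sketch of the contrastive component, the PyTorch snippet below implements a supervised token-level contrastive loss that pulls together tokens sharing an entity label; the LoRA adaptation is an orthogonal fine-tuning choice and is omitted, and the temperature and exact loss form are assumptions rather than the paper's definition.

```python
import torch
import torch.nn.functional as F

def token_contrastive_loss(reps, labels, temperature=0.1):
    """Supervised contrastive loss over token representations.
    reps: (N, D) token embeddings; labels: (N,) entity-label ids.
    Tokens with the same label are positives; assumes each anchor
    has at least one positive somewhere in the batch."""
    reps = F.normalize(reps, dim=-1)
    sim = reps @ reps.T / temperature
    self_mask = torch.eye(len(reps), dtype=torch.bool, device=reps.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))          # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob))
    per_anchor = -pos_log_prob.sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_anchor[pos_mask.any(1)].mean()
```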
arXiv Detail & Related papers (2024-08-23T04:44:05Z)
- Logit Separability-Driven Samples and Multiple Class-Related Words Selection for Advancing In-Context Learning [0.0]
We introduce logit separability, a criterion to assess the clarity of both samples and class-related words at the logit level.
We find that incorporating multiple class-related words for each sample, rather than relying on a single class name, improves performance by offering a broader range of label information.
We propose LICL, a logit separability-based method that jointly organizes samples and integrates multiple class-related words into each sample-label pair.
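One simplified, plausible reading of that criterion, not the paper's exact definition: score a sample by the softmax margin between its two strongest class-word logits, so clearly separable samples stand out. `logits` is assumed to hold one value per candidate class-related word.

```python
import numpy as np

def logit_separability(logits):
    """Margin between the best and second-best class-word probabilities
    after a softmax; higher means the sample separates its label options
    more clearly at the logit level."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    top2 = np.sort(probs)[-2:]
    return float(top2[1] - top2[0])
```

Candidate samples and class-word sets could then be ranked by this score before assembling the prompt.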
arXiv Detail & Related papers (2024-06-16T12:11:46Z)
- Rectifying Demonstration Shortcut in In-Context Learning [15.08431909212102]
Large language models (LLMs) can solve various tasks with only a few demonstrations by utilizing their in-context learning (ICL) abilities.
However, LLMs often rely on pre-trained semantic priors about the demonstrations rather than on the input-label relationships when making ICL predictions.
arXiv Detail & Related papers (2024-03-14T15:30:14Z)
- In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel demonstration selection mechanism, the EnDe retriever.
In the EnDe retriever, contrastive learning is employed to learn three types of representations, capturing semantic similarity, boundary similarity, and label similarity.
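A minimal sketch of how the three similarities might be combined at retrieval time, assuming each example already carries three unit-normalised vectors produced by the contrastive training (which is omitted here); the mixing weights are illustrative.

```python
import numpy as np

def ende_style_score(test_repr, cand_repr, weights=(0.4, 0.3, 0.3)):
    """Blend semantic, boundary, and label similarities into one retrieval
    score. Each repr is a dict mapping the three view names to unit vectors."""
    views = ("semantic", "boundary", "label")
    return sum(w * float(np.dot(test_repr[v], cand_repr[v]))
               for w, v in zip(weights, views))
```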
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
- Identifying and Analyzing Performance-Critical Tokens in Large Language Models [52.404072802235234]
We study how large language models learn to perform tasks from demonstrations, deepening our understanding of the roles different types of tokens play in large language models.
arXiv Detail & Related papers (2024-01-20T20:55:21Z)
- Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models [16.16724411695959]
This work pushes the performance boundary of zero-shot NER with powerful large language models (LLMs).
We propose a training-free self-improving framework, which utilizes an unlabeled corpus to stimulate the self-learning ability of LLMs.
Experiments on four benchmarks show substantial performance improvements achieved by our framework.
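A minimal sketch of such a training-free loop, assuming a hypothetical wrapper `llm_ner(text)` that returns zero-shot entity predictions together with a confidence score; the paper's actual reliability criteria for self-annotations may differ.

```python
def self_improve(unlabeled_texts, llm_ner, threshold=0.9, k=8):
    """Zero-shot-label an unlabeled corpus, keep only confident
    self-annotations, and reuse them as ICL demonstrations."""
    demos = []
    for text in unlabeled_texts:
        entities, confidence = llm_ner(text)
        if confidence >= threshold:
            demos.append((text, entities))
        if len(demos) == k:
            break
    return demos
```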
arXiv Detail & Related papers (2023-11-15T12:47:52Z)
- Channel-Wise Contrastive Learning for Learning with Noisy Labels [60.46434734808148]
We introduce channel-wise contrastive learning (CWCL) to distinguish authentic label information from noise.
Unlike conventional instance-wise contrastive learning (IWCL), CWCL tends to yield more nuanced and resilient features aligned with the authentic labels.
Our strategy is twofold: first, use CWCL to extract pertinent features and identify cleanly labeled samples; second, progressively fine-tune on these samples.
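One plausible reading of "channel-wise", sketched below as an illustration rather than the paper's exact loss: treat each feature channel, instead of each instance, as the contrastive unit, so corresponding channels of two augmented views are positives.

```python
import torch
import torch.nn.functional as F

def channel_contrastive_loss(feats_a, feats_b, temperature=0.5):
    """feats_a, feats_b: (batch, dim) features of two views of a batch.
    Each channel (column) becomes one contrastive example; channel i of
    view A should match channel i of view B and no other channel."""
    a = F.normalize(feats_a.T, dim=-1)   # (dim, batch): one row per channel
    b = F.normalize(feats_b.T, dim=-1)
    logits = a @ b.T / temperature       # channel-to-channel similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```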
arXiv Detail & Related papers (2023-08-14T06:04:50Z)
- Disambiguation of Company names via Deep Recurrent Networks [101.90357454833845]
We propose a Siamese LSTM network approach to extract, via supervised learning, an embedding of company name strings.
We also analyse how an Active Learning approach that prioritises which samples to label leads to a more efficient overall learning pipeline.
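A minimal PyTorch sketch of a Siamese character-level LSTM encoder for name strings, in the spirit of the description above; the vocabulary size, dimensions, and cosine scoring head are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNameEncoder(nn.Module):
    """The same char-level LSTM embeds both company-name strings;
    variants of one company should land close in the embedding space."""
    def __init__(self, vocab_size=128, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, char_ids):                   # (batch, seq_len) int64
        _, (h, _) = self.lstm(self.embed(char_ids))
        return F.normalize(h[-1], dim=-1)         # (batch, hidden_dim)

    def forward(self, ids_a, ids_b):              # cosine similarity of pairs
        return (self.encode(ids_a) * self.encode(ids_b)).sum(-1)
```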
arXiv Detail & Related papers (2023-03-07T15:07:57Z)
- Focusing on Potential Named Entities During Active Label Acquisition [0.0]
Named entity recognition (NER) aims to identify mentions of named entities in unstructured text.
Many domain-specific NER applications still call for a substantial amount of labeled data.
We propose a data-driven normalization approach that penalizes sentences that are too long or too short.
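As an illustration of length normalization in this spirit, the sketch below divides a sentence's summed token uncertainty by a penalty that grows as its length drifts from a typical value; the constant and functional form are ours, not the paper's data-driven estimate.

```python
import math

def normalized_informativeness(token_uncertainties, typical_len=25.0):
    """Length-normalized sentence score for active label acquisition:
    penalize sentences that are much longer or shorter than typical so
    neither dominates the acquisition ranking."""
    if not token_uncertainties:
        return 0.0
    n = len(token_uncertainties)
    penalty = 1.0 + abs(math.log(n / typical_len))   # >1 when too long/short
    return sum(token_uncertainties) / (n * penalty)
```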
arXiv Detail & Related papers (2021-11-06T09:04:16Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in its selection of unlabeled data: only examples whose loss falls below a dynamically adjusted threshold are used.
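A minimal sketch of dynamic thresholding, assuming per-example losses are recomputed each training step and the threshold decays geometrically; the constants are illustrative, not the schedule derived in the paper.

```python
def dash_select(losses, step, rho0=1.0, gamma=1.27):
    """Keep the indices of unlabeled examples whose current loss falls
    below a threshold that shrinks as training progresses."""
    threshold = rho0 * gamma ** (-step)
    return [i for i, loss in enumerate(losses) if loss < threshold]
```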
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.