A Cooperative Multi-Agent Framework for Zero-Shot Named Entity Recognition
- URL: http://arxiv.org/abs/2502.18702v1
- Date: Tue, 25 Feb 2025 23:30:43 GMT
- Title: A Cooperative Multi-Agent Framework for Zero-Shot Named Entity Recognition
- Authors: Zihan Wang, Ziqi Zhao, Yougang Lyu, Zhumin Chen, Maarten de Rijke, Zhaochun Ren
- Abstract summary: Zero-shot named entity recognition (NER) aims to develop entity recognition systems from unannotated text corpora. Recent work has adapted large language models (LLMs) for zero-shot NER by crafting specialized prompt templates. We introduce the cooperative multi-agent system (CMAS), a novel framework for zero-shot NER.
- Score: 71.61103962200666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot named entity recognition (NER) aims to develop entity recognition systems from unannotated text corpora. This task presents substantial challenges due to minimal human intervention. Recent work has adapted large language models (LLMs) for zero-shot NER by crafting specialized prompt templates. It advances model self-learning abilities by incorporating self-annotated demonstrations. However, two important challenges persist: (i) Correlations between contexts surrounding entities are overlooked, leading to wrong type predictions or entity omissions. (ii) The indiscriminate use of task demonstrations, retrieved through shallow similarity-based strategies, severely misleads LLMs during inference. In this paper, we introduce the cooperative multi-agent system (CMAS), a novel framework for zero-shot NER that uses the collective intelligence of multiple agents to address the challenges outlined above. CMAS has four main agents: (i) a self-annotator, (ii) a type-related feature (TRF) extractor, (iii) a demonstration discriminator, and (iv) an overall predictor. To explicitly capture correlations between contexts surrounding entities, CMAS reformulates NER into two subtasks: recognizing named entities and identifying entity type-related features within the target sentence. To enable controllable utilization of demonstrations, a demonstration discriminator is established to incorporate the self-reflection mechanism, automatically evaluating helpfulness scores for the target sentence. Experimental results show that CMAS significantly improves zero-shot NER performance across six benchmarks, including both domain-specific and general-domain scenarios. Furthermore, CMAS demonstrates its effectiveness in few-shot settings and with various LLM backbones.
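The four-agent pipeline described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction only: the prompt wording, the `call_llm` stub, the 0-10 helpfulness scale, and the score threshold are assumptions for the sketch, not the paper's actual design.

```python
# Hedged sketch of the CMAS zero-shot NER pipeline from the abstract.
# The prompts, the 0-10 helpfulness scale, and `call_llm` are illustrative
# assumptions; replace `call_llm` with a real LLM backbone in practice.

def call_llm(prompt: str) -> str:
    """Stub standing in for any LLM backbone, with canned replies so the
    sketch runs end to end without an API key."""
    if "Extract type-related features" in prompt:
        return "capital city; country name"
    if "Rate the helpfulness" in prompt:
        return "8"
    return "Paris|LOC; France|LOC"

def self_annotate(corpus):
    """Agent (i): the self-annotator builds demonstrations from unlabeled text."""
    return [{"sentence": s,
             "annotation": call_llm(f"Annotate the entities in: {s}")}
            for s in corpus]

def extract_trf(sentence):
    """Agent (ii): the TRF extractor handles the second subtask, identifying
    entity type-related features within the target sentence."""
    return call_llm(f"Extract type-related features from: {sentence}")

def score_demo(demo, target):
    """Agent (iii): the demonstration discriminator uses self-reflection to
    assign each demonstration a helpfulness score for the target sentence."""
    reply = call_llm(
        f"Rate the helpfulness (0-10) of demo '{demo['sentence']}' "
        f"for annotating: {target}")
    return int(reply)

def predict(target, corpus, threshold=5):
    """Agent (iv): the overall predictor combines the TRFs with only those
    demonstrations the discriminator judged helpful enough."""
    demos = self_annotate(corpus)
    kept = [d for d in demos if score_demo(d, target) >= threshold]
    trf = extract_trf(target)
    prompt = (f"Type-related features: {trf}\n"
              + "\n".join(f"Demo: {d['sentence']} -> {d['annotation']}"
                          for d in kept)
              + f"\nAnnotate the entities in: {target}")
    return call_llm(prompt)

print(predict("Paris is the capital of France.", ["Berlin lies in Germany."]))
```

The key design point the abstract emphasizes is that demonstrations are not used indiscriminately: the discriminator's score gates which self-annotated examples reach the final prompt.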
Related papers
- Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z)
- Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks.
The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals.
We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
arXiv Detail & Related papers (2024-08-12T21:38:40Z)
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning [55.265138447400744]
Statement-Tuning is a technique that models discriminative tasks as a set of finite statements and trains an encoder model to discriminate between the potential statements to determine the label.
Experimental results demonstrate that Statement-Tuning achieves competitive performance compared to state-of-the-art LLMs with significantly fewer parameters.
The study investigates the impact of several design choices on few-shot and zero-shot generalization, revealing that Statement-Tuning can achieve strong performance with modest training data.
arXiv Detail & Related papers (2024-04-19T14:05:03Z)
- In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
- Robust Few-Shot Named Entity Recognition with Boundary Discrimination and Correlation Purification [14.998158107063848]
Few-shot named entity recognition (NER) aims to recognize novel named entities in low-resource domains utilizing existing knowledge.
We propose a robust two-stage few-shot NER method with Boundary Discrimination and Correlation Purification (BDCP).
In the span detection stage, the entity boundary discriminative module is introduced to provide a highly distinguishing boundary representation space to detect entity spans.
In the entity typing stage, the correlations between entities and contexts are purified by minimizing the interference information.
arXiv Detail & Related papers (2023-12-13T08:17:00Z)
- Self-Improving for Zero-Shot Named Entity Recognition with Large Language Models [16.16724411695959]
This work pushes the performance boundary of zero-shot NER with powerful large language models (LLMs).
We propose a training-free self-improving framework, which utilizes an unlabeled corpus to stimulate the self-learning ability of LLMs.
Experiments on four benchmarks show substantial performance improvements achieved by our framework.
arXiv Detail & Related papers (2023-11-15T12:47:52Z)
- Empirical Study of Zero-Shot NER with ChatGPT [19.534329209433626]
Large language models (LLMs) have exhibited powerful capabilities in various natural language processing tasks.
This work focuses on exploring LLM performance on zero-shot information extraction.
Inspired by the remarkable capability of LLMs in symbolic and arithmetic reasoning, we adapt prevalent reasoning methods to NER.
arXiv Detail & Related papers (2023-10-16T03:40:03Z)
- A Multi-Task Semantic Decomposition Framework with Task-specific Pre-training for Few-Shot NER [26.008350261239617]
We propose a Multi-Task Semantic Decomposition Framework via Joint Task-specific Pre-training for few-shot NER.
We introduce two novel pre-training tasks: Demonstration-based Masked Language Modeling (MLM) and Class Contrastive Discrimination.
In the downstream main task, we introduce a multi-task joint optimization framework with the semantic decomposing method, which facilitates the model to integrate two different semantic information for entity classification.
arXiv Detail & Related papers (2023-08-28T12:46:21Z)
- Multi-task Transformer with Relation-attention and Type-attention for Named Entity Recognition [35.44123819012004]
Named entity recognition (NER) is an important research problem in natural language processing.
This paper proposes a multi-task Transformer, which incorporates an entity boundary detection task into the named entity recognition task.
arXiv Detail & Related papers (2023-03-20T05:11:22Z)
- Word Sense Induction with Hierarchical Clustering and Mutual Information Maximization [14.997937028599255]
Word sense induction is a difficult problem in natural language processing.
We propose a novel unsupervised method based on hierarchical clustering and invariant information clustering.
We empirically demonstrate that, in certain cases, our approach outperforms prior WSI state-of-the-art methods.
arXiv Detail & Related papers (2022-10-11T13:04:06Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.