SpanProto: A Two-stage Span-based Prototypical Network for Few-shot
Named Entity Recognition
- URL: http://arxiv.org/abs/2210.09049v1
- Date: Mon, 17 Oct 2022 12:59:33 GMT
- Title: SpanProto: A Two-stage Span-based Prototypical Network for Few-shot
Named Entity Recognition
- Authors: Jianing Wang, Chengyu Wang, Chuanqi Tan, Minghui Qiu, Songfang Huang,
Jun Huang, Ming Gao
- Abstract summary: Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data.
We propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach.
In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information.
For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities.
- Score: 45.012327072558975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot Named Entity Recognition (NER) aims to identify named entities with
very little annotated data. Previous methods solve this problem based on
token-wise classification, which ignores the information of entity boundaries,
and inevitably the performance is affected by the massive non-entity tokens. To
this end, we propose a seminal span-based prototypical network (SpanProto) that
tackles few-shot NER via a two-stage approach, including span extraction and
mention classification. In the span extraction stage, we transform the
sequential tags into a global boundary matrix, enabling the model to focus on
the explicit boundary information. For mention classification, we leverage
prototypical learning to capture the semantic representations for each labeled
span and make the model better adapt to novel-class entities. To further
improve the model performance, we split out the false positives generated by
the span extractor but not labeled in the current episode set, and then present
a margin-based loss to separate them from each prototype region. Experiments
over multiple benchmarks demonstrate that our model outperforms strong
baselines by a large margin.
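As a rough illustration of the two-stage recipe in the abstract, the sketch below converts BIO tags into the global boundary matrix used for span extraction, builds class prototypes from support-span embeddings, classifies query spans by their nearest prototype, and applies a margin-based penalty to extracted spans that are unlabeled in the current episode. It assumes span embeddings are already produced by some encoder; the function names, tensor shapes, distance metric, and margin value are illustrative choices and are not taken from the paper or its released code.

```python
import torch
import torch.nn.functional as F


def boundary_matrix(bio_tags):
    """Convert BIO tags into a global boundary matrix M, where M[i][j] = 1
    iff tokens i..j (inclusive) form one gold entity mention."""
    n = len(bio_tags)
    m = torch.zeros(n, n)
    start = None
    for i, tag in enumerate(bio_tags + ["O"]):   # sentinel closes a trailing span
        if tag == "O" or tag.startswith("B"):
            if start is not None:                # close the span that just ended
                m[start, i - 1] = 1.0
                start = None
            if tag.startswith("B"):
                start = i
    return m


def prototypes(support_emb, support_labels, num_classes):
    """Class prototype = mean embedding of the support spans of that class."""
    return torch.stack([support_emb[support_labels == c].mean(dim=0)
                        for c in range(num_classes)])


def classify_spans(query_emb, protos):
    """Assign each query span to its nearest prototype (Euclidean distance)."""
    return torch.cdist(query_emb, protos).argmin(dim=-1)


def false_positive_margin_loss(fp_emb, protos, margin=5.0):
    """Keep spans proposed by the extractor but unlabeled in this episode
    at least `margin` away from every prototype region."""
    return F.relu(margin - torch.cdist(fp_emb, protos)).mean()


# Toy episode: one 6-token sentence, 3-dim span embeddings, 2 entity classes.
tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "O"]
print(boundary_matrix(tags))                     # 1.0 at (0, 1) and (3, 3)
support = torch.randn(4, 3)
labels = torch.tensor([0, 0, 1, 1])
protos = prototypes(support, labels, num_classes=2)
print(classify_spans(torch.randn(2, 3), protos))
print(false_positive_margin_loss(torch.randn(2, 3), protos))
```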
Related papers
- Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning [32.62763647036567]
Few-shot named entity recognition can identify new types of named entities based on a few labeled examples.
We propose the Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER).
MsFNER splits the general NER into two stages: entity-span detection and entity classification.
arXiv Detail & Related papers (2024-04-10T12:31:09Z) - PromptNER: A Prompting Method for Few-shot Named Entity Recognition via
k Nearest Neighbor Search [56.81939214465558]
We propose PromptNER: a novel prompting method for few-shot NER via k nearest neighbor search.
We use prompts that contain entity category information to construct label prototypes, which enables our model to fine-tune with only the support set.
Our approach achieves excellent transfer learning ability, and extensive experiments on the Few-NERD and CrossNER datasets demonstrate that our model achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2023-05-20T15:47:59Z) - Decomposed Meta-Learning for Few-Shot Named Entity Recognition [32.515795881027074]
Few-shot named entity recognition (NER) systems aim at recognizing novel-class named entities based on only a few labeled examples.
We present a decomposed meta-learning approach which sequentially tackles few-shot span detection and few-shot entity typing.
arXiv Detail & Related papers (2022-04-12T12:46:23Z) - X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented
Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP.
We propose to predict flattened intent and slot representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z) - A Sequence-to-Set Network for Nested Named Entity Recognition [38.05786148160635]
We propose a novel sequence-to-set neural network for nested NER.
We use a non-autoregressive decoder to predict the final set of entities in one pass.
Experimental results show that our proposed model achieves state-of-the-art results on three nested NER corpora.
arXiv Detail & Related papers (2021-05-19T03:10:04Z) - Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z) - Few-shot 3D Point Cloud Semantic Segmentation [138.80825169240302]
We propose a novel attention-aware multi-prototype transductive few-shot point cloud semantic segmentation method.
Our proposed method shows significant and consistent improvements compared to baselines in different few-shot point cloud semantic segmentation settings.
arXiv Detail & Related papers (2020-06-22T08:05:25Z) - Named Entity Recognition without Labelled Data: A Weak Supervision
Approach [23.05371427663683]
This paper presents a simple but powerful approach to learn NER models in the absence of labelled data through weak supervision.
The approach relies on a broad spectrum of labelling functions to automatically annotate texts from the target domain.
A sequence labelling model can finally be trained on the basis of this unified annotation (a simplified sketch of this pipeline follows after the list).
arXiv Detail & Related papers (2020-04-30T12:29:55Z) - Zero-Resource Cross-Domain Named Entity Recognition [68.83177074227598]
Existing models for cross-domain named entity recognition rely on large amounts of unlabeled corpora or labeled NER training data in target domains.
We propose a cross-domain NER model that does not use any external resources.
arXiv Detail & Related papers (2020-02-14T09:04:18Z)
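The weak-supervision entry above describes a pipeline in which labelling functions annotate target-domain text and their outputs are unified into a single annotation for training a sequence labeller. The minimal sketch below uses two made-up labelling functions (a year regex and a tiny location gazetteer) and a majority vote as the unification step; the paper itself learns a far more sophisticated aggregation model, so this is only an illustration of the data flow, not of that method's actual aggregation.

```python
import re
from collections import Counter


def lf_date(tokens):
    """Labelling function: tag four-digit numbers as dates (illustrative)."""
    return ["B-DATE" if re.fullmatch(r"\d{4}", t) else "O" for t in tokens]


def lf_gazetteer(tokens, names={"Paris", "Berlin"}):
    """Labelling function: tag tokens found in a tiny gazetteer (illustrative)."""
    return ["B-LOC" if t in names else "O" for t in tokens]


def unify(tokens, labelling_functions):
    """Majority vote over labelling functions; 'O' wins only if nothing fires."""
    votes = [lf(tokens) for lf in labelling_functions]
    unified = []
    for column in zip(*votes):
        non_o = [v for v in column if v != "O"]
        unified.append(Counter(non_o).most_common(1)[0][0] if non_o else "O")
    return unified


tokens = "The summit was held in Paris in 2019".split()
print(unify(tokens, [lf_date, lf_gazetteer]))
# ['O', 'O', 'O', 'O', 'O', 'B-LOC', 'O', 'B-DATE']
```

The unified tags can then serve as (noisy) training targets for any standard sequence labelling model.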