Effect of depth order on iterative nested named entity recognition models
- URL: http://arxiv.org/abs/2104.01037v1
- Date: Fri, 2 Apr 2021 13:18:52 GMT
- Title: Effect of depth order on iterative nested named entity recognition models
- Authors: Perceval Wajsburt, Yoann Taillé, Xavier Tannier
- Abstract summary: We study the effect of the depth order of mentions on nested named entity recognition (NER) models.
We design an order-agnostic iterative model and a procedure to choose a custom order during training and prediction.
We show that the smallest-to-largest order gives the best results.
- Score: 1.619995421534183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies the effect of the depth order of mentions on nested
named entity recognition (NER) models. NER is an essential task in the extraction
of biomedical information, and nested entities are common since medical concepts
can assemble to form larger entities. Conventional NER systems predict only
disjoint entities. Iterative models for nested NER therefore use multiple
prediction passes to enumerate all entities, imposing a predefined order, from
largest to smallest or smallest to largest. We design an order-agnostic iterative
model and a procedure to choose a custom order during training and prediction. To
accommodate this task, we propose a modification of the Transformer architecture
that takes into account the entities predicted in the previous steps. We provide a
set of experiments to study the model's capabilities and the effect of the order
on performance. Finally, we show that the smallest-to-largest order gives the best
results.
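The abstract outlines the mechanism: one Transformer pass per depth, with earlier predictions fed back into the input. Below is a minimal PyTorch sketch of that loop; the module names, the label-embedding feedback, and the greedy per-token decoding are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of an iterative nested-NER loop (assumptions: PyTorch,
# greedy decoding, label-embedding feedback; not the paper's code).
import torch
import torch.nn as nn

class IterativeNERModel(nn.Module):
    """One encoder pass per iteration; labels predicted in earlier passes
    are embedded and added to the input, so later passes condition on them
    (the Transformer modification the abstract alludes to)."""
    def __init__(self, vocab_size=1000, n_labels=5, d_model=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.prev_label_emb = nn.Embedding(n_labels + 1, d_model)  # 0 = none yet
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.tagger = nn.Linear(d_model, n_labels + 1)  # +1 for "outside"

    def forward(self, tokens, prev_labels):
        h = self.tok_emb(tokens) + self.prev_label_emb(prev_labels)
        return self.tagger(self.encoder(h))

def predict_nested(model, tokens, max_depth=4):
    """Decode innermost mentions first (the smallest-to-largest order the
    paper finds best) and stop once a pass predicts nothing new."""
    prev = torch.zeros_like(tokens)              # no entities predicted yet
    layers = []
    for _ in range(max_depth):
        labels = model(tokens, prev).argmax(-1)  # greedy per-token tags
        if (labels == 0).all():                  # empty pass: stop iterating
            break
        layers.append(labels)
        prev = labels                            # feed back into next pass
    return layers

model = IterativeNERModel()
nested_layers = predict_nested(model, torch.randint(1, 1000, (1, 8)))
```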
Related papers
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Experts (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique, UNCURL, that reduces the number of experts per MoE layer offline, after training.
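The blurb does not give UNCURL's pruning score, so the sketch below substitutes a generic offline criterion, keeping the experts each MoE layer routes to most often on the task's data; the names and the frequency heuristic are assumptions, not the paper's method.

```python
# Generic offline, task-aware expert pruning (assumed criterion: routing
# frequency on task data; UNCURL's actual score may differ).
from collections import Counter

def prune_experts(routing_log, keep_per_layer):
    """routing_log: (layer_idx, expert_idx) pairs recorded while running the
    task's data through the trained SMoE model, after training ends."""
    per_layer = {}
    for layer, expert in routing_log:
        per_layer.setdefault(layer, Counter())[expert] += 1
    # Keep only the most-used experts in each layer; drop the rest offline.
    return {layer: {e for e, _ in counts.most_common(keep_per_layer)}
            for layer, counts in per_layer.items()}

log = [(0, 1), (0, 1), (0, 3), (0, 2), (1, 0), (1, 0), (1, 2)]
print(prune_experts(log, keep_per_layer=2))  # {0: {1, 3}, 1: {0, 2}}
```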
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- ToNER: Type-oriented Named Entity Recognition with Generative Language Model [14.11486479935094]
We propose a novel NER framework, ToNER, based on a generative model.
In ToNER, a type-matching model is first applied to identify the entity types most likely to appear in the sentence.
We then add a multiple-binary-classification task to fine-tune the generative model's encoder, so that it produces a refined representation of the input sentence.
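As a rough illustration, a sentence-level type-matching head can be a multi-label classifier (one binary logit per type) over pooled encoder states; the pooling, sizes, and 0.5 threshold below are illustrative assumptions, not ToNER's exact configuration.

```python
# Toy multi-label "which entity types appear here?" head (assumed design).
import torch
import torch.nn as nn

class TypeMatchingHead(nn.Module):
    def __init__(self, d_model=768, n_types=4):
        super().__init__()
        self.classifier = nn.Linear(d_model, n_types)  # one binary logit/type

    def forward(self, encoder_states):           # (batch, tokens, d_model)
        pooled = encoder_states.mean(dim=1)      # simple mean pooling
        return self.classifier(pooled)           # (batch, n_types) logits

head = TypeMatchingHead()
states = torch.randn(2, 10, 768)                 # stand-in encoder output
likely = torch.sigmoid(head(states)) > 0.5       # types to put in the prompt
```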
arXiv Detail & Related papers (2024-04-14T05:13:37Z)
- Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning [32.62763647036567]
Few-shot named entity recognition can identify new types of named entities based on a few labeled examples.
We propose Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER).
MsFNER splits general NER into two stages: entity-span detection and entity classification.
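To make the two-stage split concrete, here is a toy pipeline; both stages are stand-ins (MsFNER's real first stage is a learned span detector and its second stage classifies spans, trained with entity-aware contrastive learning, none of which is reproduced here).

```python
# Two-stage NER skeleton: stage 1 proposes spans, stage 2 labels them.
def detect_spans(tokens):
    # Stand-in span detector: pretend capitalized runs are entity spans.
    spans, start = [], None
    for i, tok in enumerate(tokens + [""]):      # "" sentinel closes last run
        if tok[:1].isupper() and start is None:
            start = i
        elif not tok[:1].isupper() and start is not None:
            spans.append((start, i))
            start = None
    return spans

def classify_span(tokens, span):
    # Stand-in classifier: a real stage 2 would compare the span
    # representation against few-shot entity prototypes.
    return "PER" if tokens[span[0]] in {"Alice", "Bob"} else "MISC"

tokens = "Alice visited Geneva last week".split()
for span in detect_spans(tokens):
    print(span, classify_span(tokens, span))  # (0, 1) PER / (2, 3) MISC
```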
arXiv Detail & Related papers (2024-04-10T12:31:09Z)
- Enhancing Few-shot NER with Prompt Ordering based Data Augmentation [59.69108119752584]
We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
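The name suggests the augmentation varies the order in which gold entities are linearized in the generation target; here is a toy sketch under that assumption, with an invented `type: span` template.

```python
# Assumed PODA-style augmentation: each permutation of the gold entities
# yields one extra (input, target) pair for the autoregressive generator.
import itertools

def augment(sentence, entities, max_orders=3):
    """entities: list of (type, span) gold pairs for the sentence."""
    examples = []
    for perm in itertools.islice(itertools.permutations(entities), max_orders):
        target = " ; ".join(f"{t}: {s}" for t, s in perm)
        examples.append((sentence, target))
    return examples

for pair in augment("Alice visited Geneva", [("PER", "Alice"), ("LOC", "Geneva")]):
    print(pair)  # same input, entity list serialized in two different orders
```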
arXiv Detail & Related papers (2023-05-19T16:25:43Z)
- Multi-task Transformer with Relation-attention and Type-attention for Named Entity Recognition [35.44123819012004]
Named entity recognition (NER) is an important research problem in natural language processing.
This paper proposes a multi-task Transformer, which incorporates an entity boundary detection task into the named entity recognition task.
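A compact sketch of that multi-task shape, assuming a shared PyTorch encoder with a boundary head and a label head under a summed cross-entropy loss; the sizes and the equal loss weighting are assumptions.

```python
# Shared encoder, two heads: boundary detection as an auxiliary task for NER.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTaskNER(nn.Module):
    def __init__(self, vocab=1000, d=128, n_labels=9):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)
        self.boundary = nn.Linear(d, 3)       # B / I / O, entity types ignored
        self.labels = nn.Linear(d, n_labels)  # full entity-type tags

    def forward(self, tokens):
        h = self.enc(self.emb(tokens))
        return self.boundary(h), self.labels(h)

model = MultiTaskNER()
tokens = torch.randint(0, 1000, (2, 12))
b_logits, l_logits = model(tokens)
loss = (F.cross_entropy(b_logits.flatten(0, 1), torch.randint(0, 3, (24,)))
        + F.cross_entropy(l_logits.flatten(0, 1), torch.randint(0, 9, (24,))))
```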
arXiv Detail & Related papers (2023-03-20T05:11:22Z)
- Sequence-to-Set Generative Models [9.525560801277903]
We propose a sequence-to-set method to transform any sequence generative model into a set generative model.
We present GRU2Set, an instance of our sequence-to-set method that employs the well-known GRU model as the sequence generative model.
A direct application of our models is to learn a distribution over orders, viewed as sets of items, from a collection of e-commerce orders.
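The underlying identity, as the summary describes it, is that a set's probability is the sum of the probabilities of all its orderings under the sequence model. A toy illustration follows; the stand-in `seq_prob` replaces a trained GRU, and exact enumeration is only feasible for small sets.

```python
# Sequence-to-set by marginalizing over orderings (toy sequence model).
import itertools

def seq_prob(sequence, item_prob):
    p = 1.0
    for item in sequence:          # stand-in for a trained GRU's P(sequence)
        p *= item_prob[item]
    return p

def set_prob(items, item_prob):
    """Sum the sequence model over all orderings of the set. Exact only for
    small sets; a method like GRU2Set would need an approximation."""
    return sum(seq_prob(perm, item_prob)
               for perm in itertools.permutations(items))

probs = {"shirt": 0.2, "shoes": 0.1, "hat": 0.05}
print(set_prob({"shirt", "shoes"}, probs))  # 2 * 0.2 * 0.1 = 0.04
```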
arXiv Detail & Related papers (2022-09-19T07:13:51Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, since all information about the corpus is encoded in its parameters without the need to construct an additional index.
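In miniature, generative retrieval maps a query straight to a document identifier, with the corpus "stored" in model parameters instead of an index. The class below fakes this with a word-overlap lookup purely for illustration; CorpusBrain actually decodes identifiers with a pre-trained generative model.

```python
# Toy stand-in for index-free generative retrieval (illustration only).
class ToyGenerativeRetriever:
    def __init__(self, memorized):
        self.memorized = memorized      # parameters "encode" the corpus

    def generate_doc_id(self, query):
        # A real model would decode an identifier token by token; here we
        # just emit the ID of the best-overlapping memorized document.
        words = set(query.split())
        return max(self.memorized,
                   key=lambda d: len(words & set(d[1].split())))[0]

corpus = [("doc-1", "nested entity recognition"),
          ("doc-2", "prune experts offline")]
model = ToyGenerativeRetriever(corpus)
print(model.generate_doc_id("how to prune experts"))  # doc-2
```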
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models the full set of nested NEs in a sentence as a holistic structure, and proposes a holistic structure parsing algorithm to disclose all the NEs at once.
Experiments show that our model yields promising results on widely used benchmarks, approaching or even achieving the state of the art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Application of Pre-training Models in Named Entity Recognition [5.285449619478964]
We introduce the architecture and pre-training tasks of four common pre-trained models: BERT, ERNIE, ERNIE2.0-tiny, and RoBERTa.
We apply these pre-trained models to an NER task by fine-tuning, and compare the effects of the different model architectures and pre-training tasks on NER performance.
Experimental results show that RoBERTa achieved state-of-the-art results on the MSRA-2006 dataset.
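For reference, a minimal fine-tuning setup of this kind with the Hugging Face `transformers` API; the checkpoint, the label count (nine CoNLL-style BIO tags), and the dummy gold labels are illustrative choices, not the paper's exact configuration.

```python
# Fine-tuning a pre-trained model for NER as token classification.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base", num_labels=9)            # 9 = assumed BIO tag set size

batch = tokenizer(["Alice visited Geneva"], return_tensors="pt")
labels = torch.zeros_like(batch["input_ids"])  # dummy gold tags for the demo
out = model(**batch, labels=labels)            # loss + per-token logits
out.loss.backward()                            # gradients for one update step
```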
arXiv Detail & Related papers (2020-02-09T08:18:20Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that provides a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)