Parallel sequence tagging for concept recognition
- URL: http://arxiv.org/abs/2003.07424v2
- Date: Sat, 8 Aug 2020 07:54:17 GMT
- Title: Parallel sequence tagging for concept recognition
- Authors: Lenz Furrer (1 and 3), Joseph Cornelius (1), Fabio Rinaldi (1, 2, and
3) ((1) University of Zurich, Switzerland, (2) Dalle Molle Institute for
Artificial Intelligence Research (IDSIA), Switzerland, (3) Swiss Institute of
Bioinformatics, Switzerland)
- Abstract summary: Named Entity Recognition (NER) and Normalisation (NEN) are core components of any text-mining system for biomedical texts.
In a traditional concept-recognition pipeline, these tasks are combined in a serial way, which is inherently prone to error propagation from NER to NEN.
We propose a parallel architecture, where both NER and NEN are modeled as a sequence-labeling task, operating directly on the source text.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background: Named Entity Recognition (NER) and Normalisation (NEN) are core
components of any text-mining system for biomedical texts. In a traditional
concept-recognition pipeline, these tasks are combined in a serial way, which
is inherently prone to error propagation from NER to NEN. We propose a parallel
architecture, where both NER and NEN are modeled as a sequence-labeling task,
operating directly on the source text. We examine different harmonisation
strategies for merging the predictions of the two classifiers into a single
output sequence. Results: We test our approach on the recent Version 4 of the
CRAFT corpus. In all 20 annotation sets of the concept-annotation task, our
system outperforms the pipeline system reported as a baseline in the CRAFT
shared task 2019. Conclusions: Our analysis shows that the strengths of the two
classifiers can be combined in a fruitful way. However, prediction
harmonisation requires individual calibration on a development set for each
annotation set. This makes it possible to achieve a good trade-off between
established knowledge (training set) and novel information (unseen concepts). Availability
and Implementation: Source code freely available for download at
https://github.com/OntoGene/craft-st. Supplementary data are available at arXiv
online.
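The abstract describes merging the predictions of two parallel sequence labelers (NER span tags and NEN concept IDs) into one output. As a minimal sketch of one possible harmonisation strategy (not the authors' actual code; the tag conventions, the precedence rule, and the fallback ID are assumptions for illustration):

```python
def harmonise(ner_tags, nen_tags, fallback_id="UNKNOWN:1"):
    """Merge aligned NER (BIO) and NEN (concept-ID) predictions
    into a single sequence of (BIO tag, concept ID) pairs.

    Strategy (one of several possible): trust NER for span
    boundaries; take the concept ID from NEN where it predicts
    one, otherwise fall back to a placeholder ID.
    """
    assert len(ner_tags) == len(nen_tags)
    merged = []
    for bio, concept in zip(ner_tags, nen_tags):
        if bio == "O":
            merged.append(("O", None))          # outside any mention
        elif concept != "O":
            merged.append((bio, concept))       # both classifiers fire
        else:
            merged.append((bio, fallback_id))   # NER-only detection
    return merged

tokens = ["the", "BRCA1", "gene"]
ner = ["O", "B", "I"]
nen = ["O", "GO:0005634", "O"]
print(harmonise(ner, nen))
```

The paper notes that the merge rule needs per-annotation-set calibration on a development set; a real implementation would make the precedence rule a tunable parameter rather than a fixed branch.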
Related papers
- Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Enhancing Few-shot NER with Prompt Ordering based Data Augmentation [59.69108119752584]
We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-05-19T16:25:43Z)
- Hybrid Rule-Neural Coreference Resolution System based on Actor-Critic Learning [53.73316523766183]
Coreference resolution systems need to tackle two main tasks.
One task is to detect all of the potential mentions; the other is to learn the linking of an antecedent for each possible mention.
We propose a hybrid rule-neural coreference resolution system based on actor-critic learning.
arXiv Detail & Related papers (2022-12-20T08:55:47Z)
- Neural Coreference Resolution based on Reinforcement Learning [53.73316523766183]
Coreference resolution systems need to solve two subtasks.
One task is to detect all of the potential mentions; the other is to learn the linking of an antecedent for each possible mention.
We propose a reinforcement learning actor-critic-based neural coreference resolution system.
arXiv Detail & Related papers (2022-12-18T07:36:35Z)
- Knowledge Graph Generation From Text [18.989264255589806]
We propose a novel end-to-end Knowledge Graph (KG) generation system from textual inputs.
The graph nodes are generated first using a pretrained language model, followed by a simple edge construction head.
We evaluated the model on the recent WebNLG 2020 Challenge dataset, matching state-of-the-art performance on the text-to-RDF generation task.
arXiv Detail & Related papers (2022-11-18T21:27:13Z)
- Training Free Graph Neural Networks for Graph Matching [103.45755859119035]
TFGM is a framework to boost the performance of Graph Neural Networks (GNNs) based graph matching without training.
Applying TFGM on various GNNs shows promising improvements over baselines.
arXiv Detail & Related papers (2022-01-14T09:04:46Z)
- Boosting Span-based Joint Entity and Relation Extraction via Sequence Tagging Mechanism [10.894755638322]
Span-based joint extraction simultaneously conducts named entity recognition (NER) and relation extraction (RE) in text span form.
Recent studies have shown that token labels can convey crucial task-specific information and enrich token semantics.
We propose Sequence Tagging enhanced Span-based Network (STSN), a span-based joint extraction network that is enhanced by token BIO label information.
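The idea of enriching span representations with token BIO labels can be sketched as follows. This is a toy illustration, not the STSN implementation: the embedding dimensions, the pooling by averaging, and the concatenation scheme are all assumptions.

```python
import numpy as np

# Toy setup: an embedding per BIO label and per-token encoder outputs.
LABELS = {"B": 0, "I": 1, "O": 2}
rng = np.random.default_rng(0)
label_emb = rng.normal(size=(3, 4))   # one 4-d vector per BIO label
token_emb = rng.normal(size=(5, 4))   # encoder output for 5 tokens

def span_repr(start, end, bio_tags):
    """Concatenate the mean token embedding with the mean embedding
    of the predicted BIO labels over the span [start, end)."""
    tok = token_emb[start:end].mean(axis=0)
    lab = label_emb[[LABELS[t] for t in bio_tags[start:end]]].mean(axis=0)
    return np.concatenate([tok, lab])  # shape (8,)

vec = span_repr(1, 3, ["O", "B", "I", "O", "O"])
print(vec.shape)  # (8,)
```

The appeal of such a scheme is that the span classifier sees not just the raw encoder states but also what the tagging head believed about each token, which is the kind of label-as-feature signal the STSN summary describes.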
arXiv Detail & Related papers (2021-05-21T01:10:03Z)
- A Sequence-to-Set Network for Nested Named Entity Recognition [38.05786148160635]
We propose a novel sequence-to-set neural network for nested NER.
We use a non-autoregressive decoder to predict the final set of entities in one pass.
Experimental results show that our proposed model achieves state-of-the-art results on three nested NER corpora.
arXiv Detail & Related papers (2021-05-19T03:10:04Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.