NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy
Channel for Robust Pharmacological Entity Detection
- URL: http://arxiv.org/abs/2007.01022v1
- Date: Thu, 2 Jul 2020 11:17:16 GMT
- Authors: Lukas Lange, Heike Adel, Jannik Strötgen
- Abstract summary: We describe the system with which we participated in the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared Tasks 2019.
Our system achieves promising results, especially by combining the different techniques, and reaches up to 88.6% F1 in the competition.
- Score: 11.98821166621488
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named entity recognition has been extensively studied on English news texts.
However, the transfer to other domains and languages is still a challenging
problem. In this paper, we describe the system with which we participated in
the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared
Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the
task provides a non-standard domain and language setting. However, we propose
an architecture that requires neither language nor domain expertise. We treat
the task as a sequence labeling task and experiment with attention-based
embedding selection and the training on automatically annotated data to further
improve our system's performance. Our system achieves promising results,
especially by combining the different techniques, and reaches up to 88.6% F1 in
the competition.
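The abstract names attention-based embedding selection but gives no details. As a minimal sketch of the general idea only, the code below weights several embeddings of one token with a softmax over dot-product scores and sums them. The function names, the `query` vector, the dot-product scoring, and the toy vectors are all assumptions for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_embeddings(embeddings, query):
    """Combine several same-sized embeddings of one token into one vector.

    embeddings: list of vectors, one per embedding type (hypothetical input).
    query: scoring vector; in a trained model it would be learned (assumption).
    """
    # One attention score per embedding type (dot product with the query).
    scores = [sum(q * x for q, x in zip(query, emb)) for emb in embeddings]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the embeddings, dimension by dimension.
    combined = [
        sum(w * emb[i] for w, emb in zip(weights, embeddings))
        for i in range(dim)
    ]
    return weights, combined


# Toy example: three embedding types for a single token.
token_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [2.0, 0.0, 0.0]
weights, combined = attend_over_embeddings(token_embeddings, query)
print(weights, combined)
```

In this toy setup the first embedding receives the largest weight because it aligns with the query. A real tagger would compute such weights per token and feed the combined vector to the sequence labeling layers.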
Related papers
- IITK at SemEval-2024 Task 1: Contrastive Learning and Autoencoders for Semantic Textual Relatedness in Multilingual Texts [4.78482610709922]
This paper describes our system developed for the SemEval-2024 Task 1: Semantic Textual Relatedness.
The challenge is focused on automatically detecting the degree of relatedness between pairs of sentences for 14 languages.
arXiv Detail & Related papers (2024-04-06T05:58:42Z)
- DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios.
Previous top systems in the MultiCoNER I either incorporate the knowledge bases or gazetteers.
We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
arXiv Detail & Related papers (2023-05-05T16:59:26Z)
- IXA/Cogcomp at SemEval-2023 Task 2: Context-enriched Multilingual Named Entity Recognition using Knowledge Bases [53.054598423181844]
We present a novel NER cascade approach comprising three steps.
We empirically demonstrate the significance of external knowledge bases in accurately classifying fine-grained and emerging entities.
Our system exhibits robust performance in the MultiCoNER2 shared task, even in the low-resource language setting.
arXiv Detail & Related papers (2023-04-20T20:30:34Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Addressing Issues of Cross-Linguality in Open-Retrieval Question Answering Systems For Emergent Domains [67.99403521976058]
We demonstrate a cross-lingual open-retrieval question answering system for the emergent domain of COVID-19.
Our system adopts a corpus of scientific articles to ensure that retrieved documents are reliable.
We show that a deep semantic retriever greatly benefits from training on our English-to-all data and significantly outperforms a BM25 baseline in the cross-lingual setting.
arXiv Detail & Related papers (2022-01-26T19:27:32Z)
- Chemical Identification and Indexing in PubMed Articles via BERT and Text-to-Text Approaches [3.7462395049372894]
The Biocreative VII Track-2 challenge consists of named entity recognition, entity-linking (or entity-normalization), and topic indexing tasks.
We achieve our best performance with BERT-based BioMegatron models.
In addition to conventional NER methods, we attempt both named entity recognition and entity linking with a novel text-to-text or "prompt" based method.
arXiv Detail & Related papers (2021-11-30T18:21:06Z)
- Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking [66.76141128555099]
We propose a novel cross-lingual biomedical entity linking task (XL-BEL).
We first investigate the ability of standard knowledge-agnostic as well as knowledge-enhanced monolingual and multilingual LMs beyond the standard monolingual English BEL task.
We then address the challenge of transferring domain-specific knowledge in resource-rich languages to resource-poor ones.
arXiv Detail & Related papers (2021-05-30T00:50:00Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
- NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification [11.98821166621488]
We describe our NLNDE system, with which we participated in the MEDDOCAN competition.
We address the task of detecting and classifying protected health information from Spanish data.
Despite dealing in a non-standard language and domain setting, the NLNDE system achieves promising results in the competition.
arXiv Detail & Related papers (2020-07-02T11:30:32Z)
- Data Mining in Clinical Trial Text: Transformers for Classification and Question Answering Tasks [2.127049691404299]
This research applies advances in natural language processing to evidence synthesis based on medical texts.
The main focus is on information characterized via the Population, Intervention, Comparator, and Outcome (PICO) framework.
Recent neural network architectures based on transformers show capacities for transfer learning and increased performance on downstream natural language processing tasks.
arXiv Detail & Related papers (2020-01-30T11:45:59Z)