Recognising Biomedical Names: Challenges and Solutions
- URL: http://arxiv.org/abs/2106.12230v1
- Date: Wed, 23 Jun 2021 08:20:13 GMT
- Title: Recognising Biomedical Names: Challenges and Solutions
- Authors: Xiang Dai
- Abstract summary: We propose a transition-based NER model which can recognise discontinuous mentions.
We also develop a cost-effective approach that nominates suitable pre-training data.
Our contributions have obvious practical implications, especially when new biomedical applications are needed.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growth rate in the amount of biomedical documents is staggering.
Unlocking information trapped in these documents can enable researchers and
practitioners to operate confidently in the information world. Biomedical NER,
the task of recognising biomedical names, is usually employed as the first step
of the NLP pipeline. Standard NER models, based on sequence tagging techniques,
are good at recognising short entity mentions in the generic domain. However,
several open challenges arise in applying these models to biomedical names:
1) biomedical names may have complex inner structure (discontinuity and
overlapping) that standard sequence tagging techniques cannot capture;
2) training NER models usually requires large amounts of labelled data, which
are difficult to obtain in the biomedical domain; and 3) commonly used language
representation models are pre-trained on generic data, so a domain shift exists
between these models and the target biomedical data. To deal with these
challenges, we explore several research
directions and make the following contributions: 1) we propose a
transition-based NER model that can recognise discontinuous mentions; 2) we
develop a cost-effective approach that nominates suitable pre-training data;
and 3) we design several data augmentation methods for NER. Our
contributions have obvious practical implications, especially when new
biomedical applications are needed. Our proposed data augmentation methods help
the NER model achieve decent performance while requiring only a small amount of
labelled data. Our investigation into selecting pre-training data shows that the
model can be improved by incorporating language representation models
pre-trained on in-domain data. Finally, our proposed transition-based NER model
further improves performance by recognising discontinuous mentions.
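To make the transition-based contribution concrete, below is a minimal sketch of such a transition system in Python. The SHIFT / OUT / REDUCE / LEFT-REDUCE / COMPLETE action inventory follows the companion paper (Dai et al., ACL 2020); the neural classifier that scores actions is replaced here by a hand-written gold action sequence, and the example tokens and the ADR label are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    buffer: list                                  # indices of unread tokens
    stack: list = field(default_factory=list)     # partial spans (lists of indices)
    mentions: list = field(default_factory=list)  # finished (span, label) pairs

def apply_action(state, action, label=None):
    if action == "SHIFT":          # start a new span with the next token
        state.stack.append([state.buffer.pop(0)])
    elif action == "OUT":          # next token belongs to no mention
        state.buffer.pop(0)
    elif action == "REDUCE":       # merge the two topmost spans into one
        right = state.stack.pop()
        state.stack[-1] = state.stack[-1] + right
    elif action == "LEFT-REDUCE":  # merge, but keep the left span for reuse
        right = state.stack.pop()  # (RIGHT-REDUCE, keeping the right span, is omitted)
        state.stack.append(state.stack[-1] + right)
    elif action == "COMPLETE":     # pop the top span as a finished, labelled mention
        state.mentions.append((state.stack.pop(), label))

# "muscle pain and fatigue" contains the continuous mention "muscle pain"
# and the discontinuous mention "muscle fatigue" (tokens 0 and 3).
tokens = ["muscle", "pain", "and", "fatigue"]
state = State(buffer=list(range(len(tokens))))
gold = [("SHIFT", None), ("SHIFT", None), ("LEFT-REDUCE", None),
        ("COMPLETE", "ADR"), ("OUT", None), ("SHIFT", None),
        ("REDUCE", None), ("COMPLETE", "ADR")]
for action, label in gold:
    apply_action(state, action, label)
for span, label in state.mentions:
    print(label, [tokens[i] for i in span])
# ADR ['muscle', 'pain']
# ADR ['muscle', 'fatigue']   <- discontinuous, impossible with plain BIO tagging
```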
Related papers
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF layer performs sequence labelling and models dependencies between the words in the text (sketched below).
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
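The entry above describes the architecture only at a high level, so the PyTorch sketch below is a hypothetical reconstruction: the kernel widths, filter counts, hidden sizes, tag set size, and the alignment of characters to BERT subwords are all assumptions, and the CRF comes from the third-party pytorch-crf package rather than anything the paper specifies.

```python
import torch
import torch.nn as nn
from transformers import AutoModel   # BERT encoder
from torchcrf import CRF             # pip install pytorch-crf

class HybridNER(nn.Module):
    """Sketch of a BERT + char-CNN + BiLSTM + CRF tagger (hyperparameters assumed)."""

    def __init__(self, bert_name="bert-base-cased", n_chars=100,
                 char_dim=30, char_filters=50, hidden=256, n_tags=7):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        # "Multi-channel" character CNN: parallel convolutions with different
        # kernel widths, max-pooled over each word's characters.
        self.char_cnns = nn.ModuleList(
            nn.Conv1d(char_dim, char_filters, k, padding=k // 2)
            for k in (2, 3, 4))
        feat_dim = self.bert.config.hidden_size + 3 * char_filters
        self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                              bidirectional=True)
        self.proj = nn.Linear(2 * hidden, n_tags)
        self.crf = CRF(n_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, char_ids, tags=None):
        # char_ids: (batch, seq_len, max_word_len); assumes characters are
        # aligned to BERT subwords, which glosses over tokenisation details.
        b, s, w = char_ids.shape
        chars = self.char_emb(char_ids).view(b * s, w, -1).transpose(1, 2)
        char_feats = torch.cat(
            [cnn(chars).max(dim=2).values for cnn in self.char_cnns],
            dim=1).view(b, s, -1)
        word_feats = self.bert(input_ids,
                               attention_mask=attention_mask).last_hidden_state
        x, _ = self.bilstm(torch.cat([word_feats, char_feats], dim=-1))
        emissions = self.proj(x)
        mask = attention_mask.bool()
        if tags is not None:                # training: CRF negative log-likelihood
            return -self.crf(emissions, tags, mask=mask)
        return self.crf.decode(emissions, mask=mask)  # inference: best tag paths
```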
arXiv Detail & Related papers (2023-12-24T21:45:36Z) - Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical ontology OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping computing requirements low (a minimal adapter sketch follows).
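As a rough illustration of the adapter idea, not the paper's exact architecture, the sketch below adds a small bottleneck residual block per transformer layer while the backbone stays frozen; in the paper's setting the adapters would be trained on objectives derived from UMLS and OntoChem. The PubMedBERT checkpoint name is one public checkpoint chosen for illustration.

```python
import torch.nn as nn
from transformers import AutoModel

class BottleneckAdapter(nn.Module):
    """Down-project -> non-linearity -> up-project, plus a residual connection."""
    def __init__(self, hidden=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden, bottleneck)
        self.up = nn.Linear(bottleneck, hidden)
        self.act = nn.GELU()

    def forward(self, h):
        # The residual keeps the frozen PLM's output as the starting point;
        # only the small down/up projections learn the KG-derived knowledge.
        return h + self.up(self.act(self.down(h)))

# Freeze the backbone so that only adapters (and a task head) are trained,
# which is what keeps the computing requirements low.
plm = AutoModel.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract")  # assumed checkpoint
for p in plm.parameters():
    p.requires_grad = False
adapters = nn.ModuleList(BottleneckAdapter(plm.config.hidden_size)
                         for _ in range(plm.config.num_hidden_layers))
```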
arXiv Detail & Related papers (2023-12-21T14:26:57Z) - BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
Generalist AI holds the potential to address the limitations of specialist models through its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z) - From Zero to Hero: Harnessing Transformers for Biomedical Named Entity Recognition in Zero- and Few-shot Contexts
This paper proposes a method for zero- and few-shot NER in the biomedical domain.
We have achieved average F1 scores of 35.44% for zero-shot NER, 50.10% for one-shot NER, 69.94% for 10-shot NER, and 79.51% for 100-shot NER on 9 diverse evaluated biomedical entities.
arXiv Detail & Related papers (2023-05-05T12:14:22Z) - Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that stabilisation techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications (two commonly cited examples are sketched below).
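The entry does not say which techniques were studied, so the sketch below is only an assumption about what such a stabilisation recipe can look like: re-initialising the top encoder layers and layer-wise learning-rate decay are two widely used options for low-resource fine-tuning, not necessarily the paper's.

```python
import torch
from transformers import AutoModelForTokenClassification

# Hypothetical stabilisation recipe (NOT confirmed by the paper above):
# 1) re-initialise the top encoder layers, 2) layer-wise learning-rate decay.
model = AutoModelForTokenClassification.from_pretrained("bert-base-cased",
                                                        num_labels=7)

# 1) Re-initialise the two top-most encoder layers, discarding pretrained
#    weights thought to be over-specialised to the pretraining objective.
for layer in model.bert.encoder.layer[-2:]:
    layer.apply(model._init_weights)

# 2) Layer-wise learning-rate decay: lower layers receive smaller updates.
base_lr, decay = 2e-5, 0.9
layers = [model.bert.embeddings] + list(model.bert.encoder.layer)
groups = [{"params": m.parameters(), "lr": base_lr * decay ** depth}
          for depth, m in enumerate(reversed(layers))]
groups.append({"params": model.classifier.parameters(), "lr": base_lr})
optimizer = torch.optim.AdamW(groups, lr=base_lr)
```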
arXiv Detail & Related papers (2021-12-15T04:20:35Z) - Biomedical Interpretable Entity Representations
Pre-trained language models induce dense entity representations that offer strong performance on entity-centric NLP tasks, but such representations are not immediately interpretable.
This lack of interpretability can be a barrier to model uptake in important domains such as biomedicine.
We create a new entity type system and training set from a large corpus of biomedical texts.
arXiv Detail & Related papers (2021-06-17T13:52:10Z) - How Do Your Biomedical Named Entity Models Generalize to Novel Entities?
We analyze the three types of recognition abilities of BioNER models: memorization, synonym generalization, and concept generalization.
We find that (1) BioNER models are overestimated in terms of their generalization ability, and (2) they tend to exploit dataset biases, which hinders the models' abilities to generalize.
Our proposed debiasing method consistently improves the generalizability of the state-of-the-art (SOTA) models on five benchmark datasets, allowing them to better perform on unseen entity mentions.
arXiv Detail & Related papers (2021-01-01T04:13:42Z) - Few-Shot Named Entity Recognition: A Comprehensive Study
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We achieve new state-of-the-art results in both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
Existing BioNER approaches often neglect the challenges posed by biomedical names and directly adopt state-of-the-art (SOTA) models trained on general corpora.
We propose biomedical ALBERT, an effective domain-specific language model trained on large-scale biomedical corpora.
arXiv Detail & Related papers (2020-09-19T12:58:47Z) - Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)