UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual
Embeddings Using the Unified Medical Language System Metathesaurus
- URL: http://arxiv.org/abs/2010.10391v5
- Date: Thu, 3 Jun 2021 15:07:58 GMT
- Title: UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual
Embeddings Using the Unified Medical Language System Metathesaurus
- Authors: George Michalopoulos, Yuanxin Wang, Hussam Kaka, Helen Chen and
Alexander Wong
- Abstract summary: We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process.
By connecting words that share a UMLS concept and by using UMLS semantic group knowledge to build input embeddings, UmlsBERT can encode clinical domain knowledge into word embeddings and outperform existing domain-specific models.
- Score: 73.86656026386038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextual word embedding models, such as BioBERT and Bio_ClinicalBERT, have
achieved state-of-the-art results in biomedical natural language processing
tasks by focusing their pre-training process on domain-specific corpora.
However, such models do not take into consideration expert domain knowledge.
In this work, we introduce UmlsBERT, a contextual embedding model that
integrates domain knowledge during the pre-training process via a novel
knowledge augmentation strategy. More specifically, UmlsBERT is augmented
with the Unified Medical Language System (UMLS) Metathesaurus in two ways:
i) connecting words that have the same underlying 'concept' in UMLS, and
ii) leveraging semantic group knowledge in UMLS to create clinically
meaningful input embeddings. By applying these two strategies, UmlsBERT can
encode clinical domain knowledge into word embeddings and outperform existing
domain-specific models on common clinical NLP tasks: named-entity recognition
(NER) and clinical natural language inference.
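The two augmentation strategies named in the abstract can be illustrated with a toy example. This is a minimal sketch of one plausible reading, not the authors' implementation: the abstract does not say how the concept links or semantic groups enter the model. The sketch assumes that (i) a masked word's training target is widened to every vocabulary word sharing its concept, and (ii) a semantic group vector is added to the word vector. The vocabulary, concept identifiers, group labels, embedding values, and function names are all hypothetical.

```python
# Toy vocabulary and hypothetical UMLS-style annotations (illustrative only,
# not taken from the real Metathesaurus).
VOCAB = ["[PAD]", "lungs", "pulmonary", "cough", "the", "fever"]

# Hypothetical word -> concept identifier mapping; words absent here have no concept.
WORD_TO_CONCEPT = {
    "lungs": "C_LUNG",
    "pulmonary": "C_LUNG",
    "cough": "C_COUGH",
    "fever": "C_FEVER",
}

# Hypothetical word -> semantic group mapping.
WORD_TO_GROUP = {
    "lungs": "ANAT",
    "pulmonary": "ANAT",
    "cough": "DISO",
    "fever": "DISO",
}

DIM = 2  # tiny embedding size so the arithmetic is easy to follow

# Hand-picked toy embeddings; a real model would learn these.
WORD_EMB = {
    "[PAD]": [0.0, 0.0],
    "lungs": [1.0, 0.0],
    "pulmonary": [0.0, 1.0],
    "cough": [1.0, 2.0],
    "the": [2.0, 2.0],
    "fever": [3.0, 1.0],
}
GROUP_EMB = {"ANAT": [0.5, 0.5], "DISO": [0.25, 0.75]}


def concept_targets(masked_word):
    """Strategy (i), as assumed here: a multi-hot target over the vocabulary
    that marks the masked word and every word sharing its concept."""
    concept = WORD_TO_CONCEPT.get(masked_word)
    targets = []
    for word in VOCAB:
        same_word = word == masked_word
        same_concept = concept is not None and WORD_TO_CONCEPT.get(word) == concept
        targets.append(1.0 if same_word or same_concept else 0.0)
    return targets


def input_embedding(word):
    """Strategy (ii), as assumed here: word embedding plus semantic group
    embedding, with a zero vector for words that have no group."""
    group = WORD_TO_GROUP.get(word)
    group_vec = GROUP_EMB.get(group, [0.0] * DIM)
    return [w + g for w, g in zip(WORD_EMB[word], group_vec)]


if __name__ == "__main__":
    print(concept_targets("lungs"))
    print(input_embedding("cough"))
```

In this sketch, masking "lungs" also marks "pulmonary" as a valid target because both map to the same hypothetical concept, while a word with no concept such as "the" keeps an ordinary one-hot target and an unchanged input embedding.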