Improved Biomedical Word Embeddings in the Transformer Era
- URL: http://arxiv.org/abs/2012.11808v2
- Date: Thu, 24 Dec 2020 16:19:09 GMT
- Title: Improved Biomedical Word Embeddings in the Transformer Era
- Authors: Jiho Noh, Ramakanth Kavuluru
- Abstract summary: We learn word and concept embeddings by first using the skip-gram method and further fine-tuning them with correlational information.
We conduct evaluations of these tuned static embeddings using multiple datasets for word relatedness developed by previous efforts.
- Score: 2.978663539080876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biomedical word embeddings are usually pre-trained on free text corpora with
neural methods that capture local and global distributional properties. They
are leveraged in downstream tasks using various neural architectures that are
designed to optimize task-specific objectives that might further tune such
embeddings. Since 2018, however, there has been a marked shift from these static
embeddings to contextual embeddings motivated by language models (e.g., ELMo,
transformers such as BERT, and ULMFiT). These dynamic embeddings have the added
benefit of being able to distinguish homonyms and acronyms given their context.
However, static embeddings are still relevant in low resource settings (e.g.,
smart devices, IoT elements) and to study lexical semantics from a
computational linguistics perspective. In this paper, we jointly learn word and
concept embeddings by first using the skip-gram method and further fine-tuning
them with correlational information manifesting in co-occurring Medical Subject
Heading (MeSH) concepts in biomedical citations. This fine-tuning is
accomplished with the BERT transformer architecture in the two-sentence input
mode with a classification objective that captures MeSH pair co-occurrence. In
essence, we repurpose a transformer architecture (typically used to generate
dynamic embeddings) to improve static embeddings using concept correlations. We
conduct evaluations of these tuned static embeddings using multiple datasets
for word relatedness developed by previous efforts. Without selectively culling
concepts and terms (as was pursued by previous efforts), we believe we offer
the most exhaustive evaluation of static embeddings to date with clear
performance improvements across the board. We provide our code and embeddings
for public use for downstream applications and research endeavors:
https://github.com/bionlproc/BERT-CRel-Embeddings
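Below is a minimal sketch, not the authors' released code from the repository above, of the fine-tuning step described in the abstract: a small BERT encoder with a sentence-pair classification head is trained to predict whether two MeSH concepts/terms co-occur in a citation, so that gradients flow back into an input embedding table initialized from skip-gram vectors. The vocabulary handling, model sizes, and example pair are illustrative assumptions.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification, BertTokenizerFast

# For illustration we reuse the standard BERT wordpiece vocabulary; the paper
# instead builds a vocabulary of biomedical words plus MeSH concept codes so
# that each concept owns a row in the embedding table.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# A small transformer suffices here: the goal is to tune the input embedding
# table, not to ship the transformer itself. Sizes are illustrative.
config = BertConfig(
    vocab_size=tokenizer.vocab_size,
    hidden_size=384,
    num_hidden_layers=4,
    num_attention_heads=6,
    num_labels=2,
)
model = BertForSequenceClassification(config)

# Assumption: pretrained skip-gram vectors (shape [vocab_size, 384]) would be
# copied into the word embedding table before fine-tuning and exported again
# afterwards as the improved static embeddings.
# model.bert.embeddings.word_embeddings.weight.data.copy_(skipgram_matrix)

# One training pair in BERT's two-sentence mode: [CLS] term A [SEP] term B [SEP].
enc = tokenizer("neoplasms", "antineoplastic agents",
                return_tensors="pt", padding=True, truncation=True)
labels = torch.tensor([1])  # 1 = MeSH pair co-occurs in a citation, 0 = negative sample

out = model(**enc, labels=labels)   # cross-entropy classification loss
out.loss.backward()                 # gradients reach the static embedding table
```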
Related papers
- Exploiting the Semantic Knowledge of Pre-trained Text-Encoders for Continual Learning [70.64617500380287]
Continual learning allows models to learn from new data while retaining previously learned knowledge.
The label information of the images offers semantic knowledge that can be related to previously acquired knowledge of semantic classes.
We propose integrating semantic guidance within and across tasks by capturing semantic similarity using text embeddings.
arXiv Detail & Related papers (2024-08-02T07:51:44Z)
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multiple concepts for multilingual semantic matching, freeing the model from its reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests LMs may potentially serve as more useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- Word Sense Induction with Knowledge Distillation from BERT [6.88247391730482]
This paper proposes a method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in a context.
Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings.
arXiv Detail & Related papers (2023-04-20T21:05:35Z)
- Combining Contrastive Learning and Knowledge Graph Embeddings to develop medical word embeddings for the Italian language [0.0]
This paper attempts to improve available embeddings in the uncovered niche of the Italian medical domain.
The main objective is to improve the accuracy of semantic similarity between medical terms.
Since the Italian language lacks medical texts and controlled vocabularies, we have developed a specific solution.
arXiv Detail & Related papers (2022-11-09T17:12:28Z)
- TransDrift: Modeling Word-Embedding Drift using Transformer [8.707217592903735]
We propose TransDrift, a transformer-based prediction model for word embeddings.
Our model accurately learns the dynamics of the embedding drift and predicts the future embedding.
Our embeddings lead to superior performance compared to the previous methods.
arXiv Detail & Related papers (2022-06-16T10:48:26Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both old and most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- EDS-MEMBED: Multi-sense embeddings based on enhanced distributional semantic structures via a graph walk over word senses [0.0]
We leverage the rich semantic structures in WordNet to enhance the quality of multi-sense embeddings.
We derive new distributional semantic similarity measures for M-SE from prior ones.
We report evaluation results on 11 benchmark datasets involving WSD and Word Similarity tasks.
arXiv Detail & Related papers (2021-02-27T14:36:55Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Distributional semantic modeling: a revised technique to train term/word vector space models applying the ontology-related approach [36.248702416150124]
We design a new technique for distributional semantic modeling with a neural network-based approach to learning distributed term representations (or term embeddings).
Vec2graph is a Python library for visualizing word embeddings (term embeddings in our case) as dynamic and interactive graphs.
arXiv Detail & Related papers (2020-03-06T18:27:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.