MED-SE: Medical Entity Definition-based Sentence Embedding
- URL: http://arxiv.org/abs/2212.04734v1
- Date: Fri, 9 Dec 2022 09:10:19 GMT
- Title: MED-SE: Medical Entity Definition-based Sentence Embedding
- Authors: Hyeonbin Hwang, Haanju Yoo, Yera Choi
- Abstract summary: We propose a novel unsupervised contrastive learning framework designed for clinical texts, which exploits the definitions of medical entities.
In the entity-centric setting that we have designed, MED-SE achieves significantly better performance, while the existing unsupervised methods including SimCSE show degraded performance.
- Score: 1.0828616610785524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Medical Entity Definition-based Sentence Embedding (MED-SE), a
novel unsupervised contrastive learning framework designed for clinical texts,
which exploits the definitions of medical entities. To this end, we conduct an
extensive analysis of multiple sentence embedding techniques in clinical
semantic textual similarity (STS) settings. In the entity-centric setting that
we have designed, MED-SE achieves significantly better performance, while the
existing unsupervised methods including SimCSE show degraded performance. Our
experiments elucidate the inherent discrepancies between the general- and
clinical-domain texts, and suggest that entity-centric contrastive approaches
may help bridge this gap and lead to a better representation of clinical
sentences.
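The abstract describes MED-SE only as a contrastive framework that exploits entity definitions and does not spell out the objective. As a rough illustration of the general idea, the sketch below computes a generic InfoNCE-style loss in which each sentence embedding is pulled toward the embedding of a definition of an entity it mentions, with the other definitions in the batch acting as negatives. The function names, the toy two-dimensional vectors, and the temperature of 0.05 are all assumptions made for this example; it is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(sentence_embs, definition_embs, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    Assumption: definition_embs[i] is the positive for sentence_embs[i];
    every other definition in the batch is treated as a negative.
    """
    losses = []
    for i, sentence in enumerate(sentence_embs):
        logits = [cosine(sentence, d) / temperature for d in definition_embs]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: two sentences and two entity definitions.
sentences = [[1.0, 0.1], [0.1, 1.0]]
aligned_definitions = [[0.9, 0.0], [0.0, 0.9]]
swapped_definitions = [aligned_definitions[1], aligned_definitions[0]]

loss_aligned = info_nce_loss(sentences, aligned_definitions)
loss_swapped = info_nce_loss(sentences, swapped_definitions)
print(loss_aligned < loss_swapped)
```

When each sentence is paired with the matching definition, the loss is close to zero; swapping the pairs makes it much larger, which is the signal a contrastive encoder would be trained to reduce.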
Related papers
- Towards Multi-dimensional Explanation Alignment for Medical Classification [16.799101204390457]
We propose a novel framework called Med-MICN (Medical Multi-dimensional Interpretable Concept Network).
Med-MICN provides interpretability alignment from various angles, including neural symbolic reasoning, concept semantics, and saliency maps.
Its advantages include high prediction accuracy, interpretability across multiple dimensions, and automation through an end-to-end concept labeling process.
arXiv Detail & Related papers (2024-10-28T20:03:19Z)
- Efficient Biomedical Entity Linking: Clinical Text Standardization with Low-Resource Techniques [0.0]
Multiple terms can refer to the same core concept, which can be referred to as a clinical entity.
Ontologies like the Unified Medical Language System (UMLS) are developed and maintained to store millions of clinical entities.
We propose a suite of context-based and context-less reranking techniques for performing entity disambiguation.
arXiv Detail & Related papers (2024-05-24T01:14:33Z) - XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z) - Towards Semi-Structured Automatic ICD Coding via Tree-based Contrastive
Learning [18.380293890624102]
We investigate the semi-structured nature of clinical notes and propose an automatic algorithm to segment them into sections.
To address the variability issues in existing ICD coding models with limited data, we introduce a contrastive pre-training approach on sections.
arXiv Detail & Related papers (2023-10-14T22:07:13Z) - Rethinking Semi-Supervised Medical Image Segmentation: A
Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z) - Clinical Named Entity Recognition using Contextualized Token
Representations [49.036805795072645]
This paper introduces the technique of contextualized word embedding to better capture the semantic meaning of each word based on its context.
We pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair).
Explicit experiments show that our models gain dramatic improvements compared to both static word embeddings and domain-generic language models.
arXiv Detail & Related papers (2021-06-23T18:12:58Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - A Practical Approach towards Causality Mining in Clinical Text using
Active Transfer Learning [2.6125458645126907]
Causality mining is an active research area, which requires the application of state-of-the-art natural language processing techniques.
This research work aims to create a framework that can convert clinical text into causal knowledge.
arXiv Detail & Related papers (2020-12-10T06:51:13Z) - Benchmarking Automated Clinical Language Simplification: Dataset,
Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)