Unifying Heterogeneous Electronic Health Records Systems via Text-Based
Code Embedding
- URL: http://arxiv.org/abs/2111.09098v1
- Date: Fri, 12 Nov 2021 20:27:55 GMT
- Title: Unifying Heterogeneous Electronic Health Records Systems via Text-Based
Code Embedding
- Authors: Kyunghoon Hur, Jiyoung Lee, Jungwoo Oh, Wesley Price, Young-Hak Kim,
Edward Choi
- Abstract summary: We introduce Description-based Embedding (DescEmb), a code-agnostic representation learning framework for EHR.
DescEmb takes advantage of the flexibility of neural language understanding models to embed clinical events using their textual descriptions rather than directly mapping each event to a dedicated embedding.
- Score: 7.3394352452936085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: EHR systems lack a unified code system for representing medical
concepts, which acts as a barrier for the deployment of deep learning models
at large scale to multiple clinics and hospitals. To overcome this problem,
we introduce Description-based Embedding (DescEmb), a code-agnostic
representation learning framework for EHR. DescEmb takes advantage of the
flexibility of neural language understanding models to embed clinical events
using their textual descriptions rather than directly mapping each event to
a dedicated embedding. DescEmb outperformed traditional code-based embedding
in extensive experiments, especially in a zero-shot transfer task (one
hospital to another), and was able to train a single unified model for
heterogeneous EHR datasets.
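As a rough illustration of the description-based idea, here is a minimal
PyTorch sketch, not the authors' implementation: the tiny vocabulary and the
word-level mean-pooling encoder are illustrative stand-ins for the paper's
neural language understanding model.

import torch
import torch.nn as nn

class CodeEmbedding(nn.Module):
    # Conventional code-based embedding: one learned vector per code ID,
    # so a model trained at one hospital cannot reuse another's codes.
    def __init__(self, num_codes: int, dim: int):
        super().__init__()
        self.table = nn.Embedding(num_codes, dim)

    def forward(self, code_ids: torch.Tensor) -> torch.Tensor:
        return self.table(code_ids)

class DescriptionEmbedding(nn.Module):
    # Code-agnostic embedding: encode the words of an event's textual
    # description and mean-pool, so events that share description words
    # get similar vectors regardless of the local code system.
    def __init__(self, word_vocab_size: int, dim: int):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab_size, dim, padding_idx=0)

    def forward(self, desc_tokens: torch.Tensor) -> torch.Tensor:
        # desc_tokens: (batch, max_words), 0 = padding
        mask = (desc_tokens != 0).unsqueeze(-1).float()
        vecs = self.word_emb(desc_tokens) * mask
        return vecs.sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

# Toy usage: two hospitals assign different codes to "heart rate", but the
# shared description words yield identical embeddings for both events.
word_ids = {"<pad>": 0, "heart": 1, "rate": 2}
desc_a = torch.tensor([[1, 2, 0]])  # hospital A event: "heart rate"
desc_b = torch.tensor([[1, 2, 0]])  # hospital B event: same description
model = DescriptionEmbedding(word_vocab_size=len(word_ids), dim=8)
print(torch.allclose(model(desc_a), model(desc_b)))  # True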
Related papers
- Large Language Model in Medical Informatics: Direct Classification and Enhanced Text Representations for Automatic ICD Coding [7.0413463890126735]
This paper explores the use of Large Language Models (LLMs), specifically the LLAMA architecture, to enhance ICD code classification.
We evaluate these methods by comparing them against state-of-the-art approaches.
arXiv Detail & Related papers (2024-11-11T09:31:46Z)
- Medical Vision-Language Pre-Training for Brain Abnormalities [96.1408455065347]
We show how to automatically collect medical image-text aligned data for pretraining from public resources such as PubMed.
In particular, we present a pipeline that streamlines the pre-training process by initially collecting a large brain image-text dataset.
We also investigate the unique challenge of mapping subfigures to subcaptions in the medical domain.
arXiv Detail & Related papers (2024-04-27T05:03:42Z)
- Emergency Department Decision Support using Clinical Pseudo-notes [0.4487265603408873]
We introduce the Multiple Embedding Model for EHR (MEME).
MEME serializes multimodal EHR data into text using pseudo-notes, mimicking clinical text generation (a toy serialization sketch appears after this list).
We demonstrate the effectiveness of MEME by applying it to several decision support tasks within the Emergency Department across multiple hospital systems.
arXiv Detail & Related papers (2024-01-31T20:31:56Z)
- Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z)
- Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach.
Our approach is easy to integrate into any hybrid model and requires no external training data.
Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
- Unifying Heterogeneous Electronic Health Records Systems via Text-Based Code Embedding [7.3394352452936085]
We introduce DescEmb, a code-agnostic description-based representation learning framework for predictive modeling on EHR.
We tested our model's capacity in various experiments, including prediction tasks, transfer learning, and pooled learning.
arXiv Detail & Related papers (2021-08-08T12:47:42Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- Does the Magic of BERT Apply to Medical Code Assignment? A Quantitative Study [2.871614744079523]
It is not clear if pretrained models are useful for medical code prediction without further architecture engineering.
We propose a hierarchical fine-tuning architecture to capture interactions between distant words and adopt label-wise attention to exploit label information (a generic sketch of label-wise attention appears after this list).
Contrary to current trends, we demonstrate that a carefully trained classical CNN outperforms attention-based models on a MIMIC-III subset with frequent codes.
arXiv Detail & Related papers (2021-03-11T07:23:45Z)
- A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
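The pseudo-note idea from the Emergency Department entry above can be
pictured as plain-text serialization of structured EHR fields. A minimal
sketch, assuming a hypothetical record schema and template (not the paper's
actual fields):

def to_pseudo_note(record: dict) -> str:
    # Serialize one structured EHR record into a short clinical-style
    # sentence that a text encoder can consume.
    parts = []
    if "age" in record and "sex" in record:
        parts.append(f"{record['age']} year old {record['sex']} patient.")
    if "chief_complaint" in record:
        parts.append(f"Chief complaint: {record['chief_complaint']}.")
    for name, value, unit in record.get("vitals", []):
        parts.append(f"{name} is {value} {unit}.")
    return " ".join(parts)

record = {
    "age": 67,
    "sex": "male",
    "chief_complaint": "chest pain",
    "vitals": [("heart rate", 104, "bpm"), ("oxygen saturation", 94, "%")],
}
print(to_pseudo_note(record))
# 67 year old male patient. Chief complaint: chest pain.
# heart rate is 104 bpm. oxygen saturation is 94 %.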
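Label-wise attention, mentioned in the medical code assignment entry above,
lets each label (e.g., each ICD code) attend to its own evidence tokens. A
generic sketch, not that paper's exact architecture; the shared per-label
scorer is a simplifying assumption:

import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    def __init__(self, hidden: int, num_labels: int):
        super().__init__()
        # One learned attention query per label.
        self.queries = nn.Linear(hidden, num_labels, bias=False)
        # Shared scorer over the per-label context vectors (assumption;
        # per-label output weights are also common).
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden) from any text encoder.
        attn = torch.softmax(self.queries(token_states), dim=1)        # (B, T, L)
        label_vecs = torch.einsum("btl,bth->blh", attn, token_states)  # (B, L, H)
        return self.scorer(label_vecs).squeeze(-1)                     # (B, L)

x = torch.randn(2, 16, 64)              # e.g., encoder outputs for 16 tokens
logits = LabelWiseAttention(64, 50)(x)  # one logit per label
print(logits.shape)                     # torch.Size([2, 50])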
This list is automatically generated from the titles and abstracts of the papers on this site.