Quantification of BERT Diagnosis Generalizability Across Medical
Specialties Using Semantic Dataset Distance
- URL: http://arxiv.org/abs/2008.06606v3
- Date: Fri, 19 Feb 2021 06:58:18 GMT
- Authors: Mihir P. Khambete, William Su, Juan Garcia, Marcus A. Badgeley
- Abstract summary: Deep learning models in healthcare may fail to generalize on data from unseen corpora.
No metric exists to tell how existing models will perform on new data.
Model performance on new corpora is directly correlated to the similarity between train and test sentence content.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models in healthcare may fail to generalize on data from unseen
corpora. Additionally, no quantitative metric exists to predict how existing
models will perform on new data. Previous studies demonstrated that NLP models
of medical notes generalize variably between institutions, but ignored other
levels of healthcare organization. We measured SciBERT diagnosis sentiment
classifier generalizability between medical specialties using EHR sentences
from MIMIC-III. Models trained on one specialty performed better on internal
test sets than mixed or external test sets (mean AUCs 0.92, 0.87, and 0.83,
respectively; p = 0.016). Models trained on more specialties achieved better
test performance (p < 1e-4). Model performance on new corpora is
directly correlated to the similarity between train and test sentence content
(p < 1e-4). Future studies should assess additional axes of generalization to
ensure deep learning models fulfil their intended purpose across institutions,
specialties, and practices.
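The abstract's central claim, that performance on a new corpus tracks the semantic similarity between train and test sentences, can be sketched in code. The minimal example below uses bag-of-words centroids with cosine distance as a crude stand-in for the paper's SciBERT-based semantic dataset distance; the sentence data and function names are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
import math

def sentence_vector(sentence):
    # Bag-of-words counts as a stand-in for a SciBERT sentence embedding.
    return Counter(sentence.lower().split())

def centroid(sentences):
    # Average the word-count vectors over all sentences in a corpus.
    total = Counter()
    for s in sentences:
        total.update(sentence_vector(s))
    n = len(sentences)
    return {w: c / n for w, c in total.items()}

def cosine_distance(u, v):
    # 1 - cosine similarity between two sparse vectors (dicts).
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    if nu == 0 or nv == 0:
        return 1.0
    return 1.0 - dot / (nu * nv)

def dataset_distance(train_sentences, test_sentences):
    # Distance between two corpora: cosine distance of their centroids.
    return cosine_distance(centroid(train_sentences), centroid(test_sentences))

# Toy EHR-style sentences (hypothetical, not from MIMIC-III).
cardiology = ["ejection fraction reduced", "acute myocardial infarction noted"]
cardiology_test = ["ejection fraction preserved", "myocardial infarction ruled out"]
neurology = ["seizure activity on eeg", "acute ischemic stroke suspected"]

internal = dataset_distance(cardiology, cardiology_test)
external = dataset_distance(cardiology, neurology)
print(internal < external)  # same-specialty corpora are closer
```

Under the paper's finding, a smaller distance (the internal test set) would predict a higher AUC than a larger one (the external specialty); a real reproduction would replace the bag-of-words vectors with SciBERT embeddings and correlate distance against measured AUC.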
Related papers
- Weakly supervised deep learning model with size constraint for prostate cancer detection in multiparametric MRI and generalization to unseen domains
We show that the model achieves on-par performance with strong fully supervised baseline models.
We also observe a performance decrease for both fully supervised and weakly supervised models when tested on unseen data domains.
arXiv Detail & Related papers (2024-11-04T12:24:33Z) - Repurposing Foundation Model for Generalizable Medical Time Series Classification [16.21546283978257]
FORMED is a foundation classification model that leverages a pre-trained backbone.
It can adapt seamlessly to unseen MedTS datasets, regardless of the number of channels, sample lengths, or medical tasks.
Our results highlight FORMED as a versatile and scalable model for a wide range of MedTS classification tasks, positioning it as a strong foundation model for future research in MedTS analysis.
arXiv Detail & Related papers (2024-10-03T23:50:04Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
LLaVA-Rad inference is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - Federated Learning of Medical Concepts Embedding using BEHRT
We propose a federated learning approach for learning medical concepts embedding.
Our approach is based on an embedding model such as BEHRT, a deep neural sequence model for EHRs.
We compare the performance of a model trained with FL against a model trained on centralized data.
arXiv Detail & Related papers (2023-05-22T14:05:39Z) - Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review
and Replicability Study
We reproduce, compare, and analyze state-of-the-art automated medical coding machine learning models.
We show that several models underperform due to weak configurations, poorly sampled train-test splits, and insufficient evaluation.
We present the first comprehensive results on the newly released MIMIC-IV dataset using the reproduced models.
arXiv Detail & Related papers (2023-04-21T11:54:44Z) - A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP
Algorithms on Electronic Health Records
We developed three types of NLP models to extract cancer phenotypes from clinical texts.
The models were evaluated for their generalizability on different test sets with different learning strategies.
The CancerBERT model developed in one institute and further fine-tuned in another institute achieved reasonable performance.
arXiv Detail & Related papers (2023-03-15T08:44:07Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD)
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - MIMO: Mutual Integration of Patient Journey and Medical Ontology for
Healthcare Representation Learning
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO) for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z) - Pre-training transformer-based framework on large-scale pediatric claims
data for downstream population-specific tasks
This study presents the Claim Pre-Training (Claim-PT) framework, a generic pre-training model that first trains on the entire pediatric claims dataset.
The effective knowledge transfer is completed through the task-aware fine-tuning stage.
We conducted experiments on a real-world claims dataset with more than one million patient records.
arXiv Detail & Related papers (2021-06-24T15:25:41Z) - Adversarial Sample Enhanced Domain Adaptation: A Case Study on
Predictive Modeling with Electronic Health Records
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z) - Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks
We show the importance of this problem to the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.