A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP
Algorithms on Electronic Health Records
- URL: http://arxiv.org/abs/2303.08448v1
- Date: Wed, 15 Mar 2023 08:44:07 GMT
- Title: A Cross-institutional Evaluation on Breast Cancer Phenotyping NLP
Algorithms on Electronic Health Records
- Authors: Sicheng Zhou, Nan Wang, Liwei Wang, Ju Sun, Anne Blaes, Hongfang Liu,
Rui Zhang
- Abstract summary: We developed three types of NLP models to extract cancer phenotypes from clinical texts.
The models were evaluated for their generalizability on different test sets with different learning strategies.
The CancerBERT model developed in one institute and further fine-tuned in another institute achieved reasonable performance.
- Score: 19.824923994227202
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective: The generalizability of clinical large language models is usually
ignored during the model development process. This study evaluated the
generalizability of BERT-based clinical NLP models across different clinical
settings through a breast cancer phenotype extraction task.
Materials and Methods: Two clinical corpora of breast cancer patients were
collected from the electronic health records from the University of Minnesota
and the Mayo Clinic, and annotated following the same guideline. We developed
three types of NLP models (i.e., conditional random field, bi-directional long
short-term memory and CancerBERT) to extract cancer phenotypes from clinical
texts. The models were evaluated for their generalizability on different test
sets with different learning strategies (model transfer vs. locally trained).
The entity coverage score was assessed for its association with model
performance.
Results: We manually annotated 200 and 161 clinical documents at UMN and MC,
respectively. The target entities in the two institutes' corpora were found to
be more similar to each other than the corpora as a whole. The CancerBERT
models obtained the best performance on the independent test sets from the two
clinical institutes and on the permutation test set. The CancerBERT model
developed in one institute and further fine-tuned in another institute achieved
reasonable performance compared to the model developed on local data (micro-F1:
0.925 vs 0.932).
Conclusions: The results indicate the CancerBERT model has the best learning
ability and generalizability among the three types of clinical NLP models. The
generalizability of the models was found to be correlated with the similarity
of the target entities between the corpora.
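The micro-F1 figures quoted in the abstract can be illustrated with a minimal sketch. It assumes exact-match scoring over (start, end, label) entity spans pooled across documents; the abstract does not state the matching criterion, and the span format, the label names, and the `micro_f1` helper are hypothetical.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over exact-match entity spans.

    gold, pred: lists with one set per document, each set holding
    (start, end, label) tuples. This span format is an assumption,
    not something specified in the abstract.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # spans predicted with the correct label
        fp += len(p - g)   # predicted spans that are not in the gold set
        fn += len(g - p)   # gold spans the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with hypothetical phenotype labels.
gold = [{(0, 2, "TUMOR_SIZE"), (5, 6, "ER")}, {(3, 4, "HER2")}]
pred = [{(0, 2, "TUMOR_SIZE")}, {(3, 4, "HER2"), (7, 8, "PR")}]
score = micro_f1(gold, pred)
```

Because the counts are pooled before precision and recall are computed, frequent entity types weigh more heavily in the score than rare ones.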
Related papers
- A Clinical Benchmark of Public Self-Supervised Pathology Foundation Models [2.124312824026935]
We present a collection of pathology datasets comprising clinical slides associated with clinically relevant endpoints including cancer diagnoses and a variety of biomarkers generated during standard hospital operation from two medical centers.
We leverage these datasets to systematically assess the performance of public pathology foundation models and provide insights into best practices for training new foundation models and selecting appropriate pretrained models.
arXiv Detail & Related papers (2024-07-09T02:33:13Z)
- RaTEScore: A Metric for Radiology Report Generation [59.37561810438641]
This paper introduces a novel, entity-aware metric, Radiological Report (Text) Evaluation (RaTEScore).
RaTEScore emphasizes crucial medical entities such as diagnostic outcomes and anatomical details, and is robust against complex medical synonyms and sensitive to negation expressions.
Our evaluations demonstrate that RaTEScore aligns more closely with human preference than existing metrics, validated both on established public benchmarks and our newly proposed RaTE-Eval benchmark.
arXiv Detail & Related papers (2024-06-24T17:49:28Z)
- A Transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z)
- Negation detection in Dutch clinical texts: an evaluation of rule-based and machine learning methods [0.21079694661943607]
We compare three methods for negation detection in Dutch clinical notes.
We found that both the biLSTM and RoBERTa models consistently outperform the rule-based model in terms of F1 score, precision and recall.
arXiv Detail & Related papers (2022-09-01T14:00:13Z)
- Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence [79.038671794961]
We launch the Unified CT-COVID AI Diagnostic Initiative (UCADI), where the AI model can be distributedly trained and independently executed at each host institution.
Our study is based on 9,573 chest computed tomography scans (CTs) from 3,336 patients collected from 23 hospitals located in China and the UK.
arXiv Detail & Related papers (2021-11-18T00:43:41Z)
- Clinical Relation Extraction Using Transformer-based Models [28.237302721228435]
We developed a series of clinical RE models based on three transformer architectures, namely BERT, RoBERTa, and XLNet.
We demonstrated that the RoBERTa-clinical RE model achieved the best performance on the 2018 MADE1.0 dataset with an F1-score of 0.8958.
Our results indicated that the binary classification strategy consistently outperformed the multi-class classification strategy for clinical relation extraction.
arXiv Detail & Related papers (2021-07-19T15:15:51Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Adversarial Sample Enhanced Domain Adaptation: A Case Study on Predictive Modeling with Electronic Health Records [57.75125067744978]
We propose a data augmentation method to facilitate domain adaptation.
Adversarially generated samples are used during domain adaptation.
Results confirm the effectiveness of our method and its generality across different tasks.
arXiv Detail & Related papers (2021-01-13T03:20:20Z)
- Quantification of BERT Diagnosis Generalizability Across Medical Specialties Using Semantic Dataset Distance [0.0]
Deep learning models in healthcare may fail to generalize on data from unseen corpora.
No metric exists to tell how existing models will perform on new data.
Model performance on new corpora is directly correlated to the similarity between train and test sentence content.
arXiv Detail & Related papers (2020-08-14T23:44:11Z)
- Detecting ulcerative colitis from colon samples using efficient feature selection and machine learning [1.5484595752241122]
Ulcerative colitis (UC) is one of the most common forms of inflammatory bowel disease (IBD) characterized by inflammation of the mucosal layer of the colon.
We created a model to discriminate between healthy subjects and subjects with UC based on the expression values of 32 genes in colon samples.
Our model perfectly detected all active cases and had an average precision of 0.62 in the inactive cases.
arXiv Detail & Related papers (2020-08-04T14:56:45Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)