Clinical BioBERT Hyperparameter Optimization using Genetic Algorithm
- URL: http://arxiv.org/abs/2302.03822v1
- Date: Wed, 8 Feb 2023 01:11:59 GMT
- Title: Clinical BioBERT Hyperparameter Optimization using Genetic Algorithm
- Authors: Navya Martin Kollapally, James Geller
- Abstract summary: Non-clinical factors that affect an individual's health outcomes, such as where a person was born, raised, and educated, are collectively referred to as Social Determinants of Health (SDoH).
The majority of SDoH data is recorded in unstructured clinical notes by physicians and practitioners.
Our research focuses on extracting sentences from clinical notes to provide appropriate concepts.
- Score: 0.15229257192293197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clinical factors account only for a small portion, about 10-30%, of the
controllable factors that affect an individual's health outcomes. The remaining
factors include where a person was born and raised, where they pursued their
education, what their work and family environment is like, etc. These factors
are collectively referred to as Social Determinants of Health (SDoH). The
majority of SDoH data is recorded in unstructured clinical notes by physicians
and practitioners. Recording SDoH data in a structured manner (in an EHR) could
greatly benefit from a dedicated ontology of SDoH terms. Our research focuses
on extracting sentences from clinical notes, making use of such an SDoH
ontology (called SOHO) to provide appropriate concepts. We utilize recent
advancements in Deep Learning to optimize the hyperparameters of a Clinical
BioBERT model for SDoH text. A genetic algorithm-based hyperparameter tuning
regimen was implemented to identify optimal parameter settings. To implement a
complete classifier, we pipelined Clinical BioBERT with two subsequent linear
layers and two dropout layers. The output predicts whether a text fragment
describes an SDoH issue of the patient. We compared the AdamW, Adafactor, and
LAMB optimizers. In our experiments, AdamW outperformed the others in terms of
accuracy.
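The genetic algorithm-based tuning described in the abstract can be illustrated with a minimal sketch. The search space, population size, mutation rate, and the `toy_fitness` function below are all assumptions made for illustration; the abstract does not list the actual ranges or settings. The optimizer names are the three compared in the abstract. In a real run, the fitness function would fine-tune the Clinical BioBERT classifier with the given settings and return validation accuracy.

```python
import random

# Hypothetical search space; the abstract does not give the actual ranges.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 2e-5, 3e-5, 5e-5],
    "dropout": [0.1, 0.2, 0.3, 0.5],
    "batch_size": [8, 16, 32],
    "optimizer": ["AdamW", "Adafactor", "LAMB"],  # optimizers named in the abstract
}


def random_individual(rng):
    """Sample one hyperparameter configuration uniformly at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {name: rng.choice([parent_a[name], parent_b[name]]) for name in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    """Resample each gene from the search space with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in individual.items()
    }


def toy_fitness(individual):
    """Stand-in objective with arbitrary target values (an assumption).

    It ignores the optimizer gene and says nothing about real model quality.
    """
    score = 1.0
    score -= abs(individual["learning_rate"] - 3e-5) * 1e4
    score -= abs(individual["dropout"] - 0.2)
    score -= abs(individual["batch_size"] - 16) / 100
    return score


def genetic_search(fitness, generations=10, population_size=12, elite=2, seed=0):
    """Run a simple elitist genetic algorithm and return (best, history)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: population_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - elite)
        ]
        population = ranked[:elite] + children  # elites survive unchanged
    best = max(population, key=fitness)
    return best, history


if __name__ == "__main__":
    best_config, best_per_generation = genetic_search(toy_fitness)
    print(best_config)
    print(best_per_generation)
```

Because the top configurations are carried over unchanged, the best fitness per generation never decreases. Since each real fitness evaluation would be a full fine-tuning run, caching scores per configuration would be a sensible addition in practice.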
Related papers
- Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study [2.0884301753594334]
This study performs a comparative analysis of various natural language models for medical text classification.
BERT outperforms Bi-LSTM models by up to 28% and the baseline BERT model by up to 16% for recall of the minority classes.
arXiv Detail & Related papers (2024-08-30T10:28:49Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Towards Understanding the Survival of Patients with High-Grade Gastroenteropancreatic Neuroendocrine Neoplasms: An Investigation of Ensemble Feature Selection in the Prediction of Overall Survival [0.0]
Ensemble feature selectors allow the user to identify relevant features in datasets with low sample sizes.
RENT and UBayFS are capable of integrating expert knowledge a priori in the feature selection process.
Our results demonstrate that both feature selectors allow accurate predictions, and that expert knowledge has a stabilizing effect on the feature set.
arXiv Detail & Related papers (2023-02-20T17:08:03Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
- Survival Prediction of Children Undergoing Hematopoietic Stem Cell Transplantation Using Different Machine Learning Classifiers by Performing Chi-squared Test and Hyper-parameter Optimization: A Retrospective Analysis [4.067706269490143]
An efficient survival classification model is presented in a comprehensive manner.
A synthetic dataset is generated by imputing the missing values, transforming the data using dummy variable encoding, and compressing the dataset from 59 features to the 11 most correlated features using Chi-squared feature selection.
Several supervised ML methods were trained, including Decision Tree, Random Forest, Logistic Regression, K-Nearest Neighbors, Gradient Boosting, AdaBoost, and XGBoost.
arXiv Detail & Related papers (2022-01-22T08:01:22Z)
- Online Optimization of Stimulation Speed in an Auditory Brain-Computer Interface under Time Constraints [5.695163312473305]
We propose an approach to exploit the benefits of individualized experimental protocols and evaluate it in an auditory BCI.
arXiv Detail & Related papers (2021-08-26T08:18:03Z)
- Resource Planning for Hospitals Under Special Consideration of the COVID-19 Pandemic: Optimization and Sensitivity Analysis [87.31348761201716]
Crises like the COVID-19 pandemic pose a serious challenge to health-care institutions.
BaBSim.Hospital is a tool for capacity planning based on discrete event simulation.
We aim to investigate and optimize these parameters to improve BaBSim.Hospital.
arXiv Detail & Related papers (2021-05-16T12:38:35Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
- DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret [59.81290762273153]
Dynamic treatment regimes (DTRs) are personalized, adaptive, multi-stage treatment plans that adapt treatment decisions to an individual's initial features and to intermediate outcomes and features at each subsequent stage.
We propose a novel algorithm that, by carefully balancing exploration and exploitation, is guaranteed to achieve rate-optimal regret when the transition and reward models are linear.
arXiv Detail & Related papers (2020-05-06T13:03:42Z)
- Optimization of Genomic Classifiers for Clinical Deployment: Evaluation of Bayesian Optimization to Select Predictive Models of Acute Infection and In-Hospital Mortality [0.0]
Characterization of a patient's immune response by quantifying expression levels of specific genes from blood represents a potentially more timely and precise means of accomplishing both tasks.
Machine learning methods provide a platform to leverage this 'host response' for development of deployment-ready classification models.
We compare hyperparameter optimization (HO) approaches for the development of diagnostic classifiers of acute infection and in-hospital mortality from gene expression of 29 diagnostic markers.
arXiv Detail & Related papers (2020-03-27T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.