Bio+Clinical BERT, BERT Base, and CNN Performance Comparison for
Predicting Drug-Review Satisfaction
- URL: http://arxiv.org/abs/2308.03782v1
- Date: Wed, 2 Aug 2023 20:01:38 GMT
- Title: Bio+Clinical BERT, BERT Base, and CNN Performance Comparison for
Predicting Drug-Review Satisfaction
- Authors: Yue Ling
- Abstract summary: We implement and evaluate several classification models, including a BERT base model, Bio+Clinical BERT, and a simpler CNN.
Results indicate that the medical domain-specific Bio+Clinical BERT model significantly outperformed the general domain base BERT model.
Future research could explore how to capitalize on the specific strengths of each model.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The objective of this study is to develop natural language processing (NLP)
models that can analyze patients' drug reviews and accurately classify their
satisfaction levels as positive, neutral, or negative. Such models would reduce
the workload of healthcare professionals and provide greater insight into
patients' quality of life, which is a critical indicator of treatment
effectiveness. To achieve this, we implemented and evaluated several
classification models, including a BERT base model, Bio+Clinical BERT, and a
simpler CNN. Results indicate that the medical domain-specific Bio+Clinical
BERT model significantly outperformed the general domain base BERT model,
achieving an 11% improvement in macro F1 and recall scores, as shown in Table 2.
Future research could explore how to capitalize on the specific strengths of
each model. Bio+Clinical BERT excels in overall performance, particularly with
medical jargon, while the simpler CNN demonstrates the ability to identify
crucial words and accurately classify sentiment in texts with conflicting
sentiments.
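As a rough illustration of the model comparison above, the following is a minimal sketch that fine-tunes both checkpoints on a toy three-class sentiment task. It assumes the Hugging Face transformers and datasets libraries plus scikit-learn, and the public bert-base-uncased and emilyalsentzer/Bio_ClinicalBERT checkpoints; the toy reviews, label mapping, and training settings are illustrative, not the paper's actual pipeline.
```python
# Hedged sketch (not the paper's exact pipeline): fine-tune BERT base and
# Bio+Clinical BERT on a toy positive/neutral/negative review task and
# report macro F1 and recall, mirroring the comparison in the abstract.
import numpy as np
from datasets import Dataset
from sklearn.metrics import f1_score, recall_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Illustrative stand-in for the drug-review data used in the paper.
data = Dataset.from_dict({
    "text": ["This medication helped me a lot.",
             "No noticeable change after two weeks.",
             "Terrible side effects, had to stop."],
    "label": [2, 1, 0],  # 0 = negative, 1 = neutral, 2 = positive (assumed mapping)
})

def evaluate_checkpoint(checkpoint):
    tok = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)
    ds = data.map(lambda b: tok(b["text"], truncation=True, max_length=128,
                                padding="max_length"), batched=True)
    def metrics(p):
        preds = np.argmax(p.predictions, axis=-1)
        return {"macro_f1": f1_score(p.label_ids, preds, average="macro"),
                "macro_recall": recall_score(p.label_ids, preds, average="macro")}
    trainer = Trainer(model=model,
                      args=TrainingArguments(output_dir="out", num_train_epochs=3,
                                             per_device_train_batch_size=8),
                      train_dataset=ds, eval_dataset=ds, compute_metrics=metrics)
    trainer.train()
    return trainer.evaluate()

for ckpt in ["bert-base-uncased", "emilyalsentzer/Bio_ClinicalBERT"]:
    print(ckpt, evaluate_checkpoint(ckpt))
```
Holding everything fixed except the checkpoint is what lets the macro-F1/recall gap be attributed to domain-specific pretraining.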
Related papers
- Synthetic4Health: Generating Annotated Synthetic Clinical Letters [6.822926897514792]
Since clinical letters contain sensitive information, clinical datasets cannot be widely used in model training, medical research, and teaching.
This work aims to generate reliable, diverse, and de-identified synthetic clinical letters.
arXiv Detail & Related papers (2024-09-14T18:15:07Z)
- Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study [2.0884301753594334]
This study performs a comparative analysis of various natural language models for medical text classification.
The best-performing BERT model outperforms Bi-LSTM models by up to 28% and the baseline BERT model by up to 16% on recall of the minority classes.
arXiv Detail & Related papers (2024-08-30T10:28:49Z)
- Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
$k$NN-BioEL provides a BioEL model with the ability to reference similar instances from the entire training corpus as clues for prediction.
We show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
arXiv Detail & Related papers (2023-12-15T14:04:23Z)
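The retrieval-as-clues idea can be illustrated with a generic kNN-augmented classifier: interpolate the base model's probability distribution with a distribution built from the labels of the nearest training instances. This is a simplified sketch under assumed inputs (precomputed embeddings and a base probability vector), not the exact kNN-BioEL formulation.
```python
# Simplified sketch of kNN-augmented prediction (not the exact kNN-BioEL
# formulation): blend the base model's class probabilities with a label
# distribution recovered from the k nearest training-set embeddings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_augmented_proba(query_emb, base_proba, train_embs, train_labels,
                        n_classes, k=8, lam=0.5):
    nn = NearestNeighbors(n_neighbors=k).fit(train_embs)
    dists, idx = nn.kneighbors(query_emb.reshape(1, -1))
    weights = np.exp(-dists[0])          # closer neighbors weigh more
    weights /= weights.sum()
    knn_proba = np.zeros(n_classes)
    for w, j in zip(weights, idx[0]):
        knn_proba[train_labels[j]] += w  # neighbors vote with their labels
    return lam * base_proba + (1 - lam) * knn_proba  # interpolate the two
```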
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address the limitations of task-specific models, thanks to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Assessment of contextualised representations in detecting outcome phrases in clinical trials [14.584741378279316]
We introduce "EBM-COMET", a dataset in which 300 PubMed abstracts are expertly annotated for clinical outcomes.
To extract outcomes, we fine-tune a variety of pre-trained contextualized representations.
We observe our best model (BioBERT) achieves 81.5% F1, 81.3% sensitivity, and 98.0% specificity.
arXiv Detail & Related papers (2022-02-13T15:08:00Z)
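Outcome-phrase extraction of this kind is commonly cast as BIO token classification; the sketch below uses the public dmis-lab/biobert-base-cased-v1.1 checkpoint with an assumed tag set and an untrained classification head, so it shows the shape of the approach rather than the paper's trained model.
```python
# Hedged sketch: outcome extraction as BIO token classification with
# BioBERT. The tag set is an assumption; the head here is untrained.
from transformers import AutoModelForTokenClassification, AutoTokenizer

tags = ["O", "B-OUT", "I-OUT"]  # assumed BIO scheme for outcome spans
ckpt = "dmis-lab/biobert-base-cased-v1.1"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForTokenClassification.from_pretrained(ckpt, num_labels=len(tags))

enc = tok("Treatment reduced systolic blood pressure at 12 weeks.",
          return_tensors="pt")
pred = model(**enc).logits.argmax(-1)[0]  # one tag id per wordpiece
print([tags[i] for i in pred])            # yields outcome spans once trained
```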
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
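One widely used stabilization technique for low-resource fine-tuning is layer-wise learning-rate decay, where lower transformer layers receive geometrically smaller learning rates than the task head. The sketch below shows that generic technique as an example, not necessarily the paper's full recipe.
```python
# Sketch of layer-wise learning-rate decay for BERT fine-tuning, one common
# stabilization technique in low-resource settings (illustrative recipe).
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)

def layerwise_param_groups(model, base_lr=2e-5, decay=0.9, n_layers=12):
    groups = []
    for name, param in model.named_parameters():
        if "encoder.layer." in name:  # transformer blocks 0..11
            layer = int(name.split("encoder.layer.")[1].split(".")[0])
            lr = base_lr * decay ** (n_layers - 1 - layer)
        elif "embeddings" in name:    # embeddings sit below layer 0
            lr = base_lr * decay ** n_layers
        else:                         # pooler and task head keep the full rate
            lr = base_lr
        groups.append({"params": [param], "lr": lr})
    return groups

optimizer = AdamW(layerwise_param_groups(model))
```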
- Improving the robustness and accuracy of biomedical language models through adversarial training [7.064032374579076]
Deep transformer neural network models have improved the predictive accuracy of intelligent text processing systems in the biomedical domain.
Neural NLP models can be easily fooled by adversarial samples, i.e., minor changes to the input that preserve the meaning and understandability of the text but force the NLP system to make erroneous decisions.
This raises serious concerns about the security and trustworthiness of biomedical NLP systems.
arXiv Detail & Related papers (2021-11-16T14:58:05Z)
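A common way to implement adversarial training for text models is to perturb the embedding matrix along the loss gradient and train on the combined clean and perturbed losses (FGM-style). The sketch below shows that generic recipe, which may differ from the paper's exact attack and defense setup.
```python
# FGM-style adversarial training step (generic recipe; the paper's exact
# setup may differ): perturb word embeddings along the loss gradient and
# accumulate gradients from both the clean and the adversarial pass.
def fgm_train_step(model, batch, optimizer, epsilon=1e-2):
    model(**batch).loss.backward()            # gradients of the clean loss
    emb = model.get_input_embeddings().weight
    if emb.grad is not None and emb.grad.norm() > 0:
        delta = epsilon * emb.grad / emb.grad.norm()
        emb.data.add_(delta)                  # apply adversarial perturbation
        model(**batch).loss.backward()        # add adversarial gradients
        emb.data.sub_(delta)                  # restore original embeddings
    optimizer.step()
    optimizer.zero_grad()

# Usage (assumed): batch is a dict with input_ids, attention_mask, labels.
```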
- CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks, including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification.
We report empirical results for 11 current pre-trained Chinese models; the experiments show that state-of-the-art neural models still perform far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z)
- Extracting Lifestyle Factors for Alzheimer's Disease from Clinical Notes Using Deep Learning with Weak Supervision [9.53786612243512]
The objective of the study was to demonstrate the feasibility of using natural language processing (NLP) models to classify lifestyle factors.
We performed two case studies, physical activity and excessive diet, to validate the effectiveness of BERT models.
The proposed approach, leveraging weak supervision, could significantly increase the sample size.
arXiv Detail & Related papers (2021-01-22T17:55:03Z)
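Weak supervision in this setting can be illustrated with simple keyword-based labeling functions whose votes produce noisy training labels for unannotated notes; the rules below are generic placeholders, not the study's actual heuristics.
```python
# Generic weak-supervision sketch: keyword labeling functions vote to
# assign noisy lifestyle-factor labels (illustrative rules only, not the
# study's actual heuristics).
ABSTAIN, PHYS_ACTIVITY, EXCESS_DIET = -1, 0, 1

def lf_exercise(note):
    keywords = ("walks daily", "exercises", "jogging", "gym")
    return PHYS_ACTIVITY if any(k in note.lower() for k in keywords) else ABSTAIN

def lf_overeating(note):
    keywords = ("overeating", "high-calorie", "binge eating")
    return EXCESS_DIET if any(k in note.lower() for k in keywords) else ABSTAIN

def weak_label(note, lfs=(lf_exercise, lf_overeating)):
    votes = [v for v in (lf(note) for lf in lfs) if v != ABSTAIN]
    return max(set(votes), key=votes.count) if votes else ABSTAIN

print(weak_label("Patient exercises three times a week."))  # -> 0 (PHYS_ACTIVITY)
```
Labels produced this way can then supervise a BERT classifier at a much larger sample size than manual annotation allows.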
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem to the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)