A Study of Generative Large Language Model for Medical Research and
Healthcare
- URL: http://arxiv.org/abs/2305.13523v1
- Date: Mon, 22 May 2023 22:37:24 GMT
- Title: A Study of Generative Large Language Model for Medical Research and
Healthcare
- Authors: Cheng Peng, Xi Yang, Aokun Chen, Kaleb E Smith, Nima PourNejatian,
Anthony B Costa, Cheryl Martin, Mona G Flores, Ying Zhang, Tanja Magoc,
Gloria Lipori, Duane A Mitchell, Naykky S Ospina, Mustafa M Ahmed, William R
Hogan, Elizabeth A Shenkman, Yi Guo, Jiang Bian, Yonghui Wu
- Abstract summary: This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters.
NLP models trained on synthetic text generated by GatorTronGPT outperform NLP models trained on real-world clinical text.
- Score: 25.361547229585184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is enormous enthusiasm about, and concern over, the use of large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text and a GPT-3 architecture with 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. NLP models trained on synthetic text generated by GatorTronGPT outperform NLP models trained on real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT vs. 6.93 for human) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT vs. 6.97 for human), and that physicians cannot differentiate the two (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
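To make the reported comparison concrete, here is a minimal sketch of how two sets of physician ratings could be tested for a significant difference. The rating arrays are fabricated placeholders, and the choice of a Mann-Whitney U test is an assumption; the paper's analysis code is not reproduced here.

```python
# Minimal sketch: compare physician ratings (1 = worst, 9 = best) of
# LLM-written vs. human-written clinical text. The ratings below are
# fabricated placeholders, NOT the study's data, and the Mann-Whitney U
# test is an assumed choice of significance test.
from scipy.stats import mannwhitneyu

ratings_llm = [7, 6, 8, 7, 6, 7, 5, 8, 7, 6]    # placeholder ratings
ratings_human = [7, 7, 8, 6, 7, 8, 6, 7, 7, 6]  # placeholder ratings

stat, p = mannwhitneyu(ratings_llm, ratings_human, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.2f}")  # p > 0.05 -> no significant difference
```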
Related papers
- IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials [4.679320772294786]
Large Language models (LLMs) have demonstrated state-of-the-art performance in various natural language processing (NLP) tasks.
This research investigates LLMs' robustness, consistency, and faithful reasoning when performing natural language inference (NLI) on breast cancer Clinical Trial Reports (CTRs).
We examine the reasoning capabilities of LLMs and their adeptness at logical problem-solving.
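As a minimal illustration of how CTR-based NLI can be framed for a generic instruction-tuned LLM: the prompt wording below is an assumption, not the authors' actual setup, though the Entailment/Contradiction label set follows the NLI4CT task convention.

```python
# Minimal sketch: framing clinical-trial NLI as a zero-shot prompt for an
# instruction-tuned LLM. The prompt template is an illustrative assumption;
# the label set (Entailment / Contradiction) follows NLI4CT.
def build_nli_prompt(ctr_section: str, statement: str) -> str:
    return (
        "Premise (clinical trial report section):\n"
        f"{ctr_section}\n\n"
        f"Statement: {statement}\n\n"
        "Does the premise entail the statement? "
        "Answer with exactly one word: Entailment or Contradiction."
    )

prompt = build_nli_prompt(
    "Eligibility: adult patients with HER2-positive breast cancer ...",
    "The trial enrolled only pediatric patients.",
)
print(prompt)  # pass this prompt to any instruction-tuned LLM
```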
arXiv Detail & Related papers (2024-04-06T05:44:53Z)
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text [82.7001841679981]
BioMedLM is a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles.
When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with larger models.
BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics.
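A minimal sketch of prompting BioMedLM through Hugging Face transformers follows; the hub id stanford-crfm/BioMedLM and the decoding settings are assumptions to verify against the official release.

```python
# Minimal sketch: greedy generation with BioMedLM via transformers.
# The hub id "stanford-crfm/BioMedLM" is an assumption; check the official
# release for the authoritative checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stanford-crfm/BioMedLM"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Metformin is a first-line treatment for", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```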
arXiv Detail & Related papers (2024-03-27T10:18:21Z)
- Automatic Summarization of Doctor-Patient Encounter Dialogues Using Large Language Model through Prompt Tuning [20.9626587328674]
This study presents an approach to summarizing doctor-patient dialogues using generative large language models (LLMs).
We developed prompt-tuning algorithms to instruct generative LLMs to summarize clinical text.
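For flavor, here is a minimal soft-prompt-tuning sketch using the PEFT library; the paper describes its own prompt-tuning algorithms, so the base model ("gpt2" as a stand-in), initialization text, and hyperparameters below are illustrative assumptions.

```python
# Minimal sketch of soft prompt tuning with PEFT. The base model ("gpt2"),
# init text, and virtual-token count are illustrative assumptions; the
# paper's own prompt-tuning algorithms are not necessarily identical.
from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "gpt2"  # stand-in for a clinical generative LLM
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    prompt_tuning_init=PromptTuningInit.TEXT,
    prompt_tuning_init_text="Summarize the doctor-patient dialogue:",
    num_virtual_tokens=20,
    tokenizer_name_or_path=base,
)
model = get_peft_model(model, config)  # only the soft prompt is trainable
model.print_trainable_parameters()     # frozen LLM + ~20 trainable embeddings
```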
arXiv Detail & Related papers (2024-03-19T18:37:05Z)
- MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records [60.35217378132709]
Large language models (LLMs) can follow natural language instructions with human-level fluency.
However, evaluating LLMs on realistic text generation tasks for healthcare remains challenging.
We introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data.
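A minimal sketch of how such a benchmark could be iterated for evaluation; the record fields ("instruction", "ehr", "reference") and file name are hypothetical placeholders, so consult the MedAlign release for the real schema and access terms.

```python
# Minimal sketch: running a model over a MedAlign-style instruction benchmark.
# The JSONL file name and record fields are hypothetical placeholders, not
# the dataset's actual schema.
import json

def evaluate(answer_fn, path="medalign.jsonl"):
    results = []
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            prompt = f"{rec['instruction']}\n\nEHR:\n{rec['ehr']}"
            results.append({
                "id": rec.get("id"),
                "model_answer": answer_fn(prompt),
                "reference": rec.get("reference"),  # clinician gold answer
            })
    return results
```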
arXiv Detail & Related papers (2023-08-27T12:24:39Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address the limitations of task-specific models thanks to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Large Language Models Leverage External Knowledge to Extend Clinical Insight Beyond Language Boundaries [48.48630043740588]
Large Language Models (LLMs) such as ChatGPT and Med-PaLM have excelled in various medical question-answering tasks.
We develop a novel in-context learning framework to enhance their performance.
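A minimal sketch of the general retrieve-then-prompt pattern such a framework implies; the retrieval function is a stub and the prompt template is an assumption, not the authors' actual framework.

```python
# Minimal sketch of retrieval-augmented in-context learning: fetch external
# medical knowledge and prepend it to the question. retrieve_knowledge() is
# a stub; the paper's actual framework is more elaborate.
def retrieve_knowledge(question: str, k: int = 2) -> list[str]:
    # Stub: in practice, query a medical knowledge base or search index.
    return ["<knowledge snippet 1>", "<knowledge snippet 2>"][:k]

def build_prompt(question: str) -> str:
    context = "\n".join(f"- {s}" for s in retrieve_knowledge(question))
    return (
        f"Relevant medical knowledge:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer concisely in the language of the question."
    )

print(build_prompt("What is the first-line treatment for type 2 diabetes?"))
```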
arXiv Detail & Related papers (2023-05-17T12:31:26Z)
- Evaluation of ChatGPT Family of Models for Biomedical Reasoning and Classification [6.163540203358258]
This study investigates the performance of large language models (LLMs) in biomedical tasks beyond question-answering.
Because no patient data can be passed through the OpenAI API's public interface, we evaluated model performance on over 10,000 samples.
We found that fine-tuning for two fundamental NLP tasks remained the best strategy.
arXiv Detail & Related papers (2023-04-05T15:11:25Z)
- Contextualized Medication Information Extraction Using Transformer-based Deep Learning Architectures [35.65283211002216]
We developed NLP systems for medication mention extraction, event classification (indicating whether medication changes are discussed), and context classification.
We explored 6 state-of-the-art pretrained transformer models for the three subtasks, including GatorTron, a large language model pretrained using >90 billion words of text.
Our GatorTron models achieved the best F1-scores of 0.9828 for medication extraction (ranked 3rd), 0.9379 for event classification (ranked 2nd), and the best micro-average accuracy of 0.9126 for context classification.
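A minimal inference sketch for the medication-mention-extraction subtask, framed as token classification with the transformers pipeline; the model path is a hypothetical placeholder for a checkpoint fine-tuned on medication NER, not the paper's GatorTron system.

```python
# Minimal sketch: medication mention extraction as token classification.
# The model path is a hypothetical placeholder for a checkpoint fine-tuned
# on medication NER; it is not the paper's GatorTron system.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/medication-ner-model",  # placeholder checkpoint
    aggregation_strategy="simple",         # merge subword pieces into spans
)
note = "Patient was started on metformin 500 mg twice daily; lisinopril was held."
for ent in ner(note):
    print(ent["entity_group"], ent["word"], round(ent["score"], 3))
```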
arXiv Detail & Related papers (2023-03-14T22:22:28Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
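BioGPT is available in Hugging Face transformers; a minimal generation sketch follows (the BioGpt* classes and the microsoft/biogpt checkpoint are public, while the decoding settings are illustrative).

```python
# Minimal sketch: biomedical text generation with BioGPT. The BioGpt*
# classes and the microsoft/biogpt checkpoint are public; the beam settings
# here are illustrative.
import torch
from transformers import BioGptForCausalLM, BioGptTokenizer

tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")

inputs = tokenizer("COVID-19 is", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=40, num_beams=5)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```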
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to solve a low-resource, real-world challenge: de-identification of code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
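A minimal sketch of the two-stage transfer recipe this suggests: fine-tune a multilingual encoder on source-language de-identification data, then on a handful of code-mixed target examples. The base model choice (xlm-roberta-base), PHI label set, and hyperparameters are assumptions, and dataset construction is stubbed out.

```python
# Minimal sketch: few-shot cross-lingual transfer for de-identification as
# NER. Model choice, PHI label set, and hyperparameters are illustrative
# assumptions; dataset construction is left as a stub.
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "xlm-roberta-base"  # multilingual encoder (assumed choice)
labels = ["O", "B-NAME", "I-NAME", "B-DATE", "I-DATE"]  # illustrative PHI tags
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(
    model_id, num_labels=len(labels)
)

def fine_tune(train_dataset, output_dir):
    args = TrainingArguments(output_dir=output_dir, num_train_epochs=3,
                             per_device_train_batch_size=8)
    Trainer(model=model, args=args, train_dataset=train_dataset).train()

# Stage 1: plentiful source-language de-id data (e.g., English notes).
# fine_tune(source_dataset, "stage1-source")
# Stage 2: a handful of code-mixed Spanish-Catalan examples (few-shot).
# fine_tune(few_shot_target_dataset, "stage2-target")
```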
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records [22.652798872046283]
There is increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs).
There are few clinical language models, and the largest trained in the clinical domain is comparatively small at 110 million parameters.
It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs.
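A minimal sketch of using a GatorTron encoder for clinical text; the hub id UFNLP/gatortron-base is an assumption to verify against the official release.

```python
# Minimal sketch: encoding clinical text with a GatorTron checkpoint.
# The hub id "UFNLP/gatortron-base" is an assumption; verify against the
# official release notes.
from transformers import AutoModel, AutoTokenizer

model_id = "UFNLP/gatortron-base"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Chest pain radiating to the left arm.", return_tensors="pt")
hidden = model(**inputs).last_hidden_state  # token-level clinical embeddings
print(hidden.shape)
```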
arXiv Detail & Related papers (2022-02-02T14:28:51Z)