Considerations for health care institutions training large language
models on electronic health records
- URL: http://arxiv.org/abs/2309.12339v1
- Date: Thu, 24 Aug 2023 00:09:01 GMT
- Title: Considerations for health care institutions training large language
models on electronic health records
- Authors: Weipeng Zhou, Danielle Bitterman, Majid Afshar, Timothy A. Miller
- Abstract summary: Large language models (LLMs) like ChatGPT have excited scientists across fields.
In medicine, one source of excitement is the potential applications of LLMs trained on electronic health record (EHR) data.
But there are tough questions we must first answer if health care institutions are interested in having LLMs trained on their own data.
- Score: 7.048517095805301
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large language models (LLMs) like ChatGPT have excited scientists across
fields; in medicine, one source of excitement is the potential applications of
LLMs trained on electronic health record (EHR) data. But there are tough
questions we must first answer if health care institutions are interested in
having LLMs trained on their own data: should they train an LLM from scratch or
fine-tune it from an open-source model? For healthcare institutions with a
predefined budget, what are the biggest LLMs they can afford? In this study, we
take steps towards answering these questions with an analysis on dataset sizes,
model sizes, and costs for LLM training using EHR data. This analysis provides
a framework for thinking about these questions in terms of data scale, compute
scale, and training budgets.
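As a rough illustration of the budget question raised in the abstract, the sketch below estimates the largest model a fixed budget could train. It does not reproduce the paper's analysis. It assumes two widely used rules of thumb that are not stated in this listing: training compute of about 6 x parameters x tokens FLOPs, and a compute-optimal ratio of about 20 training tokens per parameter. The function name, the price per GPU-hour, the GPU throughput, and the utilization figure are all hypothetical placeholders.

```python
import math


def largest_affordable_model(budget_usd, usd_per_gpu_hour, peak_flops_per_gpu,
                             utilization=0.4, tokens_per_param=20.0):
    """Estimate (parameters, tokens, total_flops) trainable within a budget.

    Assumptions (rules of thumb, not taken from the paper):
      - training compute C ~= 6 * N * D FLOPs for N parameters and D tokens
      - compute-optimal data size D ~= tokens_per_param * N
    """
    gpu_seconds = budget_usd / usd_per_gpu_hour * 3600.0
    total_flops = gpu_seconds * peak_flops_per_gpu * utilization
    # C = 6 * N * (r * N)  =>  N = sqrt(C / (6 * r))
    params = math.sqrt(total_flops / (6.0 * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens, total_flops


def data_limited_model(available_tokens, tokens_per_param=20.0):
    """Largest model the available text supports under the same token ratio."""
    return available_tokens / tokens_per_param


if __name__ == "__main__":
    # Hypothetical inputs: $100k budget, $2 per GPU-hour, 312 TFLOP/s peak.
    n, d, c = largest_affordable_model(100_000, 2.0, 312e12)
    print(f"params ~ {n:.3g}, tokens ~ {d:.3g}, compute ~ {c:.3g} FLOPs")
    # Hypothetical EHR corpus of 10 billion tokens.
    print(f"data-limited params ~ {data_limited_model(10e9):.3g}")
```

The smaller of the two estimates (compute-limited and data-limited) is the binding constraint, which is one way to weigh training from scratch against fine-tuning an existing open-source model.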
Related papers
- A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations [5.265452667976959]
Large Language Models (LLMs) have demonstrated surprising performance across various natural language processing tasks.
This survey systematically explores how to train medical LLMs based on general LLMs.
arXiv Detail & Related papers (2024-06-14T02:42:20Z)
- Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs acquire their general-purpose language understanding and generation abilities by training billions of model parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z)
- Understanding the concerns and choices of public when using large language models for healthcare [18.906110107170697]
Large language models (LLMs) have shown their potential in biomedical fields.
How the public uses them for healthcare purposes such as medical Q&A, self-diagnosis, and daily healthcare information seeking is under-investigated.
arXiv Detail & Related papers (2024-01-17T09:51:32Z)
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases through prompting ChiMed-GPT to perform attitude scales regarding discrimination of patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z)
- A Survey of Large Language Models in Medicine: Progress, Application, and Challenge [85.09998659355038]
Large language models (LLMs) have received substantial attention due to their capabilities for understanding and generating human language.
This review aims to provide a detailed overview of the development and deployment of LLMs in medicine.
arXiv Detail & Related papers (2023-11-09T02:55:58Z)
- LLM-based Medical Assistant Personalization with Short- and Long-Term Memory Coordination [20.269899169364397]
Large Language Models (LLMs) have exhibited remarkable proficiency in comprehending and generating natural language.
We propose a novel computational bionic memory mechanism, equipped with a parameter-efficient fine-tuning (PEFT) schema, to personalize medical assistants.
arXiv Detail & Related papers (2023-09-21T00:34:33Z)
- Balanced and Explainable Social Media Analysis for Public Health with Large Language Models [13.977401672173533]
Current techniques for public health analysis involve popular models such as BERT and large language models (LLMs).
To tackle these challenges, the data imbalance issue can be overcome by sophisticated data augmentation methods for social media datasets.
In this paper, a novel ALEX framework is proposed for social media analysis on public health.
arXiv Detail & Related papers (2023-09-12T04:15:34Z)
- Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering [54.13933019557655]
We present a system called LLMs Augmented with Medical Textbooks (LLM-AMT).
LLM-AMT integrates authoritative medical textbooks into the LLMs' framework using plug-and-play modules.
We found that medical textbooks are a more effective retrieval corpus than Wikipedia in the medical domain.
arXiv Detail & Related papers (2023-09-05T13:39:38Z)
- MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records [60.35217378132709]
Large language models (LLMs) can follow natural language instructions with human-level fluency.
Evaluating LLMs on realistic text generation tasks for healthcare remains challenging.
We introduce MedAlign, a benchmark dataset of 983 natural language instructions for EHR data.
arXiv Detail & Related papers (2023-08-27T12:24:39Z)
- Aligning Large Language Models with Human: A Survey [53.6014921995006]
Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks.
Despite their notable performance, these models are prone to certain limitations such as misunderstanding human instructions, generating potentially biased content, or factually incorrect information.
This survey presents a comprehensive overview of these alignment technologies, including the following aspects.
arXiv Detail & Related papers (2023-07-24T17:44:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.