MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
- URL: http://arxiv.org/abs/2311.16079v1
- Date: Mon, 27 Nov 2023 18:49:43 GMT
- Title: MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
- Authors: Zeming Chen, Alejandro Hernández Cano, Angelika Romanou, Antoine
Bonnet, Kyle Matoba, Francesco Salvi, Matteo Pagliardini, Simin Fan, Andreas
Köpf, Amirkeivan Mohtashami, Alexandre Sallinen, Alireza Sakhaeirad,
Vinitra Swamy, Igor Krawczuk, Deniz Bayazit, Axel Marmet, Syrielle Montariol,
Mary-Anne Hartley, Martin Jaggi, Antoine Bosselut
- Abstract summary: Large language models (LLMs) can potentially democratize access to medical knowledge.
We release MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain.
- Score: 91.25119823784705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) can potentially democratize access to medical
knowledge. While many efforts have been made to harness and improve LLMs'
medical knowledge and reasoning capacities, the resulting models are either
closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters),
which restricts their abilities. In this work, we improve access to large-scale
medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B
parameters adapted to the medical domain. MEDITRON builds on Llama-2 (through
our adaptation of Nvidia's Megatron-LM distributed trainer), and extends
pretraining on a comprehensively curated medical corpus, including selected
PubMed articles, abstracts, and internationally-recognized medical guidelines.
Evaluations using four major medical benchmarks show significant performance
gains over several state-of-the-art baselines before and after task-specific
finetuning. Overall, MEDITRON achieves a 6% absolute performance gain over the
best public baseline in its parameter class and 3% over the strongest baseline
we finetuned from Llama-2. Compared to closed-source LLMs, MEDITRON-70B
outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of
Med-PaLM-2. We release our code for curating the medical pretraining corpus and
the MEDITRON model weights to drive open-source development of more capable
medical LLMs.
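The recipe described above is, at its core, continued causal-language-model pretraining of Llama-2 on curated medical text, followed by task-specific finetuning. The sketch below illustrates only that continued-pretraining step using Hugging Face transformers; the corpus file, sequence length, and hyperparameters are placeholder assumptions, and the actual MEDITRON runs used an adaptation of Nvidia's Megatron-LM distributed trainer at far larger scale.

```python
# Minimal sketch of continued (domain-adaptive) pretraining on medical text.
# "medical_corpus.txt" and all hyperparameters are illustrative assumptions,
# not the paper's configuration.
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "meta-llama/Llama-2-7b-hf"          # base model (gated on the Hub)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical local corpus (e.g., PubMed abstracts, clinical guidelines).
raw = load_dataset("text", data_files={"train": "medical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="meditron-cpt-sketch",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=3e-5,
    num_train_epochs=1,
    bf16=True,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    # mlm=False -> standard causal-LM (next-token prediction) objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # continue next-token prediction on the domain corpus
```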
Related papers
- MEG: Medical Knowledge-Augmented Large Language Models for Question Answering [37.3562521243773] (2024-11-06)
  We present MEG, a parameter-efficient approach for medical knowledge-augmented LLMs.
  We evaluate our method on four popular medical multiple-choice datasets.
- Towards Evaluating and Building Versatile Large Language Models for Medicine [57.49547766838095] (2024-08-22)
  We present MedS-Bench, a benchmark designed to evaluate the performance of large language models (LLMs) in clinical contexts.
  MedS-Bench spans 11 high-level clinical tasks, including clinical report summarization, treatment recommendations, diagnosis, named entity recognition, and medical concept explanation.
  MedS-Ins comprises 58 medically oriented language corpora, totaling 13.5 million samples across 122 tasks.
- Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People [68.59917533894608] (2024-03-06)
  We aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion.
  This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark.
  We will open-source the training corpora, code, model weights, and evaluation benchmark.
- OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models [4.556924372105915] (2024-02-29)
  Open-source (OS) models represent a key area of growth for medical LLMs.
  We present OpenMedLM, a prompting platform that delivers state-of-the-art (SOTA) performance for OS LLMs on medical benchmarks.
- Towards Building Multilingual Language Model for Medicine [54.1382395897071] (2024-02-21)
  We construct a multilingual medical corpus containing approximately 25.5B tokens across six main languages.
  We propose MMedBench, a multilingual medical multiple-choice question-answering benchmark with rationales.
  Our final model, MMed-Llama 3, with only 8B parameters, outperforms all other open-source models on both MMedBench and English benchmarks.
- ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271] (2023-11-10)
  We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
  ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
  We analyze possible biases by prompting ChiMed-GPT to respond to attitude scales concerning discrimination against patients.
- Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model [41.11769935795965] (2023-10-13)
  We present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), Supervised Fine-tuning (SFT), and Direct Preference Optimization (DPO).
  In the DCPT and SFT phases, Qilin-Med achieved 38.4% and 40.0% accuracy on the CMExam test set, respectively.
  In the DPO phase, it scored 16.66 BLEU-1 and 27.44 ROUGE-1 on the Huatuo-26M test set, improving over the SFT phase (12.69 BLEU-1 and 24.21 ROUGE-1).
- Augmenting Black-box LLMs with Medical Textbooks for Biomedical Question Answering (Published in Findings of EMNLP 2024) [48.17095875619711] (2023-09-05)
  We present a system called LLMs Augmented with Medical Textbooks (LLM-AMT).
  LLM-AMT integrates authoritative medical textbooks into the LLMs' framework using plug-and-play modules.
  We find that medical textbooks, used as a retrieval corpus, are a more effective knowledge base than Wikipedia in the medical domain.
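Most of the evaluations referenced above (MEDITRON's four medical benchmarks, MEG's multiple-choice datasets, MMedBench) ultimately reduce to multiple-choice accuracy: score each answer option with the model and check whether the top-scoring option matches the gold answer. The sketch below illustrates that loop only; the JSONL file name and its question/options/answer layout are assumptions for illustration, not the format of any specific benchmark.

```python
# Generic multiple-choice accuracy evaluation sketch (assumed data layout).
import json

def score_option(model_loglikelihood, question: str, option: str) -> float:
    """Score one answer option; `model_loglikelihood` is any callable that
    returns the log-likelihood of `option` given the prompt under the LLM."""
    return model_loglikelihood(f"Question: {question}\nAnswer:", option)

def evaluate(model_loglikelihood, path: str = "medqa_test.jsonl") -> float:
    correct, total = 0, 0
    with open(path) as f:
        for line in f:
            # Assumed fields: "question", "options" (dict of key -> text), "answer" (key)
            ex = json.loads(line)
            scores = {key: score_option(model_loglikelihood, ex["question"], text)
                      for key, text in ex["options"].items()}
            prediction = max(scores, key=scores.get)  # highest-likelihood option
            correct += int(prediction == ex["answer"])
            total += 1
    return correct / total  # benchmark accuracy, as reported in the papers above
```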