Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People
- URL: http://arxiv.org/abs/2403.03640v6
- Date: Sat, 12 Oct 2024 14:09:33 GMT
- Title: Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People
- Authors: Xidong Wang, Nuo Chen, Junyin Chen, Yidong Wang, Guorui Zhen, Chunxian Zhang, Xiangbo Wu, Yan Hu, Anningzhe Gao, Xiang Wan, Haizhou Li, Benyou Wang
- Abstract summary: We aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion.
This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark.
We will open-source training corpora, code, model weights and evaluation benchmark.
- Score: 68.59917533894608
- Abstract: Despite the vast repository of global medical knowledge predominantly being in English, local languages are crucial for delivering tailored healthcare services, particularly in areas with limited medical resources. To extend the reach of medical AI advancements to a broader population, we aim to develop medical LLMs across the six most widely spoken languages, encompassing a global population of 6.1 billion. This effort culminates in the creation of the ApolloCorpora multilingual medical dataset and the XMedBench benchmark. On the multilingual medical benchmark, the released Apollo models, at various relatively small sizes (i.e., 0.5B, 1.8B, 2B, 6B, and 7B), achieve the best performance among models of equivalent size. Notably, Apollo-7B is the state-of-the-art multilingual medical LLM among models of up to 70B parameters. Additionally, these lightweight models can be used to improve the multilingual medical capabilities of larger models without fine-tuning them, in a proxy-tuning fashion. We will open-source the training corpora, code, model weights, and evaluation benchmark.
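The proxy-tuning idea mentioned in the abstract steers a large base model at decoding time using the logit difference between a small tuned expert and its small untuned base, with no fine-tuning of the large model. The sketch below is a minimal, hypothetical illustration of that logit arithmetic on a toy vocabulary (the function and variable names are illustrative, and it assumes all three models share a vocabulary):

```python
import numpy as np

def proxy_tuned_logits(base_large, tuned_small, base_small, alpha=1.0):
    """Proxy-tuning-style combination: shift the large base model's
    next-token logits by the (tuned - untuned) difference of a small expert.
    All arguments are logit vectors over the same vocabulary."""
    return base_large + alpha * (tuned_small - base_small)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy vocabulary of 4 tokens; the small tuned expert strongly prefers token 2.
base_large = np.array([2.0, 1.0, 0.5, 0.0])   # large untuned model
tuned_small = np.array([0.5, 0.5, 3.0, 0.0])  # small domain-tuned model
base_small = np.array([0.5, 0.5, 0.5, 0.0])   # small untuned model

steered = proxy_tuned_logits(base_large, tuned_small, base_small)
# The large model alone would pick token 0; the expert's preference
# shifts the combined distribution toward token 2.
print(softmax(steered).argmax())
```

With `alpha` controlling the strength of the steering, `alpha=0` recovers the large base model unchanged; the paper's setting would apply this per decoding step rather than to a single static logit vector.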
Related papers
- Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources [0.0]
We present a medical adaptation based on recent 7B models, which enables operation under low computational resources.
We find that fine-tuning an English-centric base model on a Japanese medical dataset improves scores in both languages.
arXiv Detail & Related papers (2024-09-18T08:07:37Z) - Towards Building Multilingual Language Model for Medicine [54.1382395897071]
We construct a multilingual medical corpus, containing approximately 25.5B tokens encompassing 6 main languages.
We propose a multilingual medical multi-choice question-answering benchmark with rationale, termed as MMedBench.
Our final model, MMed-Llama 3, with only 8B parameters, achieves superior performance compared to all other open-source models on both MMedBench and English benchmarks.
arXiv Detail & Related papers (2024-02-21T17:47:20Z) - BiMediX: Bilingual Medical Mixture of Experts LLM [94.85518237963535]
We introduce BiMediX, the first bilingual medical mixture of experts LLM designed for seamless interaction in both English and Arabic.
Our model facilitates a wide range of medical interactions in English and Arabic, including multi-turn chats to inquire about additional details.
We propose a semi-automated English-to-Arabic translation pipeline with human refinement to ensure high-quality translations.
arXiv Detail & Related papers (2024-02-20T18:59:26Z) - BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains [8.448541067852]
Large Language Models (LLMs) have demonstrated remarkable versatility in recent years.
Despite the availability of various open-source LLMs tailored for health contexts, adapting general-purpose LLMs to the medical domain presents significant challenges.
We introduce BioMistral, an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model.
arXiv Detail & Related papers (2024-02-15T23:39:04Z) - MEDITRON-70B: Scaling Medical Pretraining for Large Language Models [91.25119823784705]
Large language models (LLMs) can potentially democratize access to medical knowledge.
We release MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain.
arXiv Detail & Related papers (2023-11-27T18:49:43Z) - ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences [51.66185471742271]
We propose ChiMed-GPT, a benchmark LLM designed explicitly for the Chinese medical domain.
ChiMed-GPT undergoes a comprehensive training regime with pre-training, SFT, and RLHF.
We analyze possible biases through prompting ChiMed-GPT to perform attitude scales regarding discrimination of patients.
arXiv Detail & Related papers (2023-11-10T12:25:32Z) - Baichuan 2: Open Large-scale Language Models [51.56361715162972]
We present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens.
Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval.
arXiv Detail & Related papers (2023-09-19T04:13:22Z) - Localising In-Domain Adaptation of Transformer-Based Biomedical Language Models [0.987336898133886]
We present two approaches to derive biomedical language models in languages other than English.
One is based on neural machine translation of English resources, favoring quantity over quality.
The other is based on a high-grade, narrow-scoped corpus written in Italian, thus preferring quality over quantity.
arXiv Detail & Related papers (2022-12-20T16:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.