The Aloe Family Recipe for Open and Specialized Healthcare LLMs
- URL: http://arxiv.org/abs/2505.04388v2
- Date: Wed, 28 May 2025 20:14:44 GMT
- Title: The Aloe Family Recipe for Open and Specialized Healthcare LLMs
- Authors: Dario Garcia-Gasulla, Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Enrique Lopez-Cuena, Adrian Tormos, Daniel Hinjos, Pablo Bernabeu-Perez, Anna Arias-Duart, Pablo Agustin Martin-Torres, Marta Gonzalez-Mallo, Sergio Alvarez-Napagao, Eduard Ayguadé-Parra, Ulises Cortés
- Abstract summary: This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training. The resultant models, shown to be competitive with the best private alternatives, are released with a permissive license.
- Score: 0.49264222302472466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: With advancements in Large Language Models (LLMs) for healthcare, the need arises for competitive open-source models to protect the public interest. This work contributes to the field of open medical LLMs by optimizing key stages of data preprocessing and training, while showing how to improve model safety (through DPO) and efficacy (through RAG). The evaluation methodology used, which includes four different types of tests, defines a new standard for the field. The resultant models, shown to be competitive with the best private alternatives, are released with a permissive license. Methods: Building on top of strong base models like Llama 3.1 and Qwen 2.5, Aloe Beta uses a custom dataset to enhance public data with synthetic Chain of Thought examples. The models undergo alignment with Direct Preference Optimization, emphasizing ethical and policy-aligned performance in the presence of jailbreaking attacks. Evaluation includes closed-ended, open-ended, safety, and human assessments to maximize the reliability of results. Results: Recommendations are made across the entire pipeline, backed by the solid performance of the Aloe Family. These models deliver competitive performance across healthcare benchmarks and medical fields, and are often preferred by healthcare professionals. On bias and toxicity, the Aloe Beta models significantly improve safety, showing resilience to unseen jailbreaking attacks. For a responsible release, a detailed risk assessment specific to healthcare is attached to the Aloe Family models. Conclusion: The Aloe Beta models, and the recipe that leads to them, are a significant contribution to the open-source medical LLM field, offering top-of-the-line performance while maintaining high ethical requirements. This work sets a new standard for developing and reporting aligned LLMs in healthcare.
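For context on the alignment step, Direct Preference Optimization trains the policy directly on preference pairs against a frozen reference model. The standard objective below is a general statement of the method, not Aloe Beta's specific configuration:

```latex
% Standard DPO objective over triples (prompt x, preferred y_w, rejected y_l);
% beta scales the implicit KL penalty toward the reference policy pi_ref.
\[
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\!\left[
    \log\sigma\!\left(
      \beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
      - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
    \right)\right]
\]
```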
Related papers
- PatientDx: Merging Large Language Models for Protecting Data-Privacy in Healthcare [2.1046377530356764]
Fine-tuning of Large Language Models (LLMs) has become the default practice for improving model performance on a given task. PatientDx is a framework of model merging that allows the design of effective LLMs for health-predictive tasks without requiring fine-tuning or adaptation on patient data.
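As a rough illustration of the model-merging idea (not PatientDx's actual algorithm), same-architecture checkpoints can be combined by averaging their parameter tensors; the checkpoint paths below are placeholders:

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Uniform or weighted average of same-architecture model parameters."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical checkpoints fine-tuned from the same base model.
general = torch.load("general_llm.pt")
clinical = torch.load("clinical_llm.pt")
torch.save(merge_state_dicts([general, clinical]), "merged_llm.pt")
```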
arXiv Detail & Related papers (2025-04-24T08:21:04Z)
- More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment [80.04449725137177]
Direct Preference Optimization (DPO) has emerged as a simple yet effective alternative to reinforcement learning from human feedback. Our study reveals a striking, safety-specific phenomenon associated with DPO alignment: using solely self-generated responses for both chosen and rejected pairs significantly outperforms configurations that incorporate responses from stronger models.
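In practice, such a data pipeline can be sketched as follows: sample several responses from the policy itself, then pick the safest and least safe as the chosen/rejected pair. The `generate` and `safety_score` helpers below are hypothetical stand-ins, not the paper's implementation:

```python
def build_self_generated_pairs(prompts, generate, safety_score, k=4):
    """Build DPO preference pairs using only the policy's own samples.

    generate(prompt, n) -> list of n sampled responses   (hypothetical helper)
    safety_score(prompt, response) -> float, higher = safer (hypothetical helper)
    """
    pairs = []
    for prompt in prompts:
        candidates = generate(prompt, n=k)
        ranked = sorted(candidates, key=lambda r: safety_score(prompt, r))
        rejected, chosen = ranked[0], ranked[-1]  # least vs. most safe sample
        if rejected != chosen:
            pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs
```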
arXiv Detail & Related papers (2025-04-03T00:36:40Z)
- Open Foundation Models in Healthcare: Challenges, Paradoxes, and Opportunities with GenAI Driven Personalized Prescription [3.9083860193371938]
In response to the success of proprietary Large Language Models (LLMs) such as OpenAI's GPT-4, there is growing interest in developing open, non-proprietary AI foundation models (AIFMs). Despite their inability to match the refined functionalities of their proprietary counterparts, open models hold immense potential to revolutionize healthcare applications.
arXiv Detail & Related papers (2025-02-04T19:16:56Z)
- Aligning (Medical) LLMs for (Counterfactual) Fairness [2.089191490381739]
Large Language Models (LLMs) have emerged as promising solutions for medical and clinical decision support applications.
LLMs are subject to different types of biases, which can lead to unfair treatment of individuals, worsening health disparities, and reducing trust in AI-augmented medical tools.
We present a new approach for aligning LLMs using a preference optimization method within a knowledge distillation framework.
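One plausible way to build counterfactual-fairness preference data (an assumption for illustration, not the paper's exact procedure) is to swap a protected attribute in the prompt and prefer candidates that stay consistent across the two variants:

```python
def counterfactual_prompt(prompt, attribute="female", swapped="male"):
    """Toy perturbation: swap a protected attribute mention in the prompt."""
    return prompt.replace(attribute, swapped)

def build_fairness_pairs(prompts, generate, agreement, k=4):
    """Prefer candidates whose content agrees across counterfactual prompts.

    generate(prompt) -> model response       (hypothetical helper)
    agreement(a, b)  -> similarity in [0, 1] (hypothetical metric)
    """
    pairs = []
    for prompt in prompts:
        reference = generate(counterfactual_prompt(prompt))
        candidates = [generate(prompt) for _ in range(k)]
        ranked = sorted(candidates, key=lambda r: agreement(r, reference))
        rejected, chosen = ranked[0], ranked[-1]  # least vs. most consistent
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs
```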
arXiv Detail & Related papers (2024-08-22T01:11:27Z)
- ShieldGemma: Generative AI Content Moderation Based on Gemma [49.91147965876678]
ShieldGemma is a suite of safety content moderation models built upon Gemma2.
Models provide robust, state-of-the-art predictions of safety risks across key harm types.
arXiv Detail & Related papers (2024-07-31T17:48:14Z)
- MedLeak: Multimodal Medical Data Leakage in Secure Federated Learning with Crafted Models [20.884070284666105]
Federated learning (FL) allows participants to collaboratively train machine learning models while keeping their data local. We propose a novel privacy attack called MedLeak, which allows a malicious FL server to recover high-quality site-specific private medical data.
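While MedLeak's crafted-model attack is specific to the paper, the broader threat model resembles classic gradient inversion, where the server optimizes dummy data until its gradients match those reported by a client. A minimal DLG-style sketch under that assumption:

```python
import torch

def invert_gradients(model, loss_fn, reported_grads, x_shape, y_shape, steps=100):
    """Reconstruct client data by matching its reported gradients (DLG-style).

    loss_fn must be differentiable in dummy_y (e.g. MSE or a soft-label loss).
    """
    dummy_x = torch.randn(x_shape, requires_grad=True)
    dummy_y = torch.randn(y_shape, requires_grad=True)
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        optimizer.zero_grad()
        loss = loss_fn(model(dummy_x), dummy_y)
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        # Distance between the dummy data's gradients and the client's gradients.
        diff = sum(((g - r) ** 2).sum() for g, r in zip(grads, reported_grads))
        diff.backward()
        return diff

    for _ in range(steps):
        optimizer.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```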
arXiv Detail & Related papers (2024-07-13T18:31:35Z)
- Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment [104.18002641195442]
We introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing paired data.
Building on the self-play concept, which autonomously generates negative responses, we further incorporate an off-policy learning pipeline to enhance data exploration and exploitation.
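A loose sketch of the general self-play-plus-replay idea (hypothetical helpers throughout; this is not SAPO's published algorithm): a slowly updated copy of the model supplies rejected responses, and updates draw from a replay buffer rather than only the latest samples:

```python
import collections
import random

def self_play_preference_loop(model, ema_model, dataset, dpo_update,
                              steps=1000, batch_size=8, buffer_size=5000):
    """Self-play preference loop with an off-policy replay buffer (sketch).

    dataset yields (prompt, reference_response); ema_model.generate,
    ema_model.update_from, and dpo_update are hypothetical helpers.
    """
    buffer = collections.deque(maxlen=buffer_size)
    for _ in range(steps):
        prompt, reference = random.choice(dataset)
        # Self-play: a slowly updated copy of the model supplies the negative.
        rejected = ema_model.generate(prompt)
        buffer.append({"prompt": prompt, "chosen": reference, "rejected": rejected})
        if len(buffer) >= batch_size:
            batch = random.sample(list(buffer), batch_size)  # off-policy reuse
            dpo_update(model, batch)
            ema_model.update_from(model)  # hypothetical EMA weight update
    return model
```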
arXiv Detail & Related papers (2024-05-31T14:21:04Z)
- Aloe: A Family of Fine-tuned Open Healthcare LLMs [0.0]
We introduce the Aloe family, a set of open medical LLMs highly competitive within its scale range.
Aloe models undergo an alignment phase, becoming one of the first few policy-aligned open healthcare LLMs.
To explore the limits of current LLMs in inference, we study several advanced prompt engineering strategies.
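Self-consistency voting is one such strategy and is easy to illustrate: sample several chain-of-thought completions and majority-vote the extracted answers. A minimal sketch with a hypothetical `generate` sampling helper:

```python
import collections
import re

def self_consistency_answer(prompt, generate, n=8):
    """Sample n chain-of-thought completions and majority-vote the answer.

    generate(prompt) -> free-text completion (hypothetical sampling helper).
    Assumes answers are reported as 'Answer: X' with a single-letter option.
    """
    votes = collections.Counter()
    for _ in range(n):
        completion = generate(prompt + "\nLet's think step by step.")
        match = re.search(r"Answer:\s*([A-E])", completion)
        if match:
            votes[match.group(1)] += 1
    return votes.most_common(1)[0][0] if votes else None
```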
arXiv Detail & Related papers (2024-05-03T07:14:07Z)
- Large Language Model Distilling Medication Recommendation Model [58.94186280631342]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs). Our research aims to transform existing medication recommendation methodologies using LLMs. To mitigate the cost of deploying such large models, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
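Feature-level distillation typically matches intermediate representations rather than output logits, using a learned projection to bridge dimensionality. A minimal PyTorch sketch under that assumption (dimensions are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

class FeatureDistillationLoss(nn.Module):
    """Match student hidden states to teacher hidden states via a projection."""

    def __init__(self, student_dim, teacher_dim):
        super().__init__()
        # Learned projection bridges the dimensionality gap between models.
        self.proj = nn.Linear(student_dim, teacher_dim)
        self.mse = nn.MSELoss()

    def forward(self, student_hidden, teacher_hidden):
        # teacher_hidden is detached: gradients flow only into the student.
        return self.mse(self.proj(student_hidden), teacher_hidden.detach())

# Example: align 512-d student features with 4096-d frozen LLM features.
loss_fn = FeatureDistillationLoss(student_dim=512, teacher_dim=4096)
student_h = torch.randn(8, 512)
teacher_h = torch.randn(8, 4096)
loss = loss_fn(student_h, teacher_h)
```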
arXiv Detail & Related papers (2024-02-05T08:25:22Z)
- Improving Fairness in AI Models on Electronic Health Records: The Case for Federated Learning Methods [0.0]
We show one possible approach to mitigate bias concerns by having healthcare institutions collaborate through a federated learning paradigm.
We propose a comprehensive FL approach with adversarial debiasing and a fair aggregation method, suitable for various fairness metrics.
Our method achieves promising fairness performance with the lowest impact on overall discrimination performance (accuracy).
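Adversarial debiasing is commonly implemented with a gradient-reversal layer: an adversary tries to predict the sensitive attribute from the encoder's representation, and the reversed gradient pushes the encoder to hide it. A minimal sketch of that trick (not the paper's exact FL integration):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on backward."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
task_head = nn.Linear(32, 1)   # e.g. outcome prediction from EHR features
adv_head = nn.Linear(32, 1)    # tries to predict the sensitive attribute

x = torch.randn(16, 64)                       # toy EHR feature batch
y = torch.randint(0, 2, (16, 1)).float()      # task labels
s = torch.randint(0, 2, (16, 1)).float()      # sensitive attribute labels

z = encoder(x)
task_loss = nn.functional.binary_cross_entropy_with_logits(task_head(z), y)
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    adv_head(GradReverse.apply(z, 1.0)), s)
(task_loss + adv_loss).backward()  # encoder is pushed to keep y, hide s
```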
arXiv Detail & Related papers (2023-05-19T02:03:49Z)
- MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data [37.60056509129154]
Large language models (LLMs) hold considerable promise for improving medical diagnostics, patient care, and education. Yet there is an urgent need for open-source models that can be deployed on-premises to safeguard patient privacy. We present an innovative dataset consisting of over 160,000 entries, specifically crafted to fine-tune LLMs for effective medical applications.
arXiv Detail & Related papers (2023-04-14T11:28:08Z)
- Privacy-preserving medical image analysis [53.4844489668116]
We present PriMIA, a software framework designed for privacy-preserving machine learning (PPML) in medical imaging.
We show significantly better classification performance of a securely aggregated federated learning model compared to human experts on unseen datasets.
We empirically evaluate the framework's security against a gradient-based model inversion attack.
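Secure aggregation commonly relies on pairwise additive masking, where clients add random masks that cancel in the sum, so the server sees only the aggregate. A toy NumPy sketch of the cancellation property (the framework's actual protocol is more involved):

```python
import numpy as np

rng = np.random.default_rng(0)
updates = [rng.normal(size=4) for _ in range(3)]   # each client's model update

# Pairwise masks: client i adds mask (i, j) and client j subtracts it.
n = len(updates)
masks = {(i, j): rng.normal(size=4) for i in range(n) for j in range(i + 1, n)}

masked = []
for i in range(n):
    m = updates[i].copy()
    for j in range(n):
        if i < j:
            m += masks[(i, j)]
        elif j < i:
            m -= masks[(j, i)]
    masked.append(m)

# The server sums the masked updates; all pairwise masks cancel exactly.
assert np.allclose(sum(masked), sum(updates))
```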
arXiv Detail & Related papers (2020-12-10T13:56:00Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: non-alcoholic steatohepatitis (NASH) and Alzheimer's disease (AD).
UNITE achieves an F1 score of up to 0.841 for AD detection and a PR-AUC of up to 0.609 for NASH detection, outperforming state-of-the-art baselines by up to 19% over the best baseline.
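Uncertainty estimates of this kind can be produced in several ways; one widely used approximation is Monte Carlo dropout, where dropout stays active at inference and the spread of repeated predictions serves as the uncertainty score. A generic sketch (not UNITE's specific estimator):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                      nn.Dropout(0.2), nn.Linear(64, 1))

def mc_dropout_predict(model, x, n_samples=50):
    """Mean risk prediction plus predictive uncertainty via MC dropout."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        preds = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)  # risk estimate, uncertainty

x = torch.randn(8, 32)  # toy batch of multi-sourced patient features
risk, uncertainty = mc_dropout_predict(model, x)
```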
arXiv Detail & Related papers (2020-10-22T02:28:11Z)