Aligning (Medical) LLMs for (Counterfactual) Fairness
- URL: http://arxiv.org/abs/2408.12055v1
- Date: Thu, 22 Aug 2024 01:11:27 GMT
- Title: Aligning (Medical) LLMs for (Counterfactual) Fairness
- Authors: Raphael Poulain, Hamed Fayyaz, Rahmatollah Beheshti,
- Abstract summary: Large Language Models (LLMs) have emerged as promising solutions for medical and clinical decision support applications.
LLMs are subject to different types of biases, which can lead to unfair treatment of individuals, worsening health disparities, and reducing trust in AI-augmented medical tools.
We present a new model alignment approach for aligning LLMs using a preference optimization method within a knowledge distillation framework.
- Score: 2.089191490381739
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have emerged as promising solutions for a variety of medical and clinical decision support applications. However, LLMs are often subject to different types of biases, which can lead to unfair treatment of individuals, worsening health disparities, and reducing trust in AI-augmented medical tools. Aiming to address this important issue, in this study, we present a new model alignment approach for aligning LLMs using a preference optimization method within a knowledge distillation framework. Prior to presenting our proposed method, we first use an evaluation framework to conduct a comprehensive (largest to our knowledge) empirical evaluation to reveal the type and nature of existing biases in LLMs used for medical applications. We then offer a bias mitigation technique to reduce the unfair patterns in LLM outputs across different subgroups identified by the protected attributes. We show that our mitigation method is effective in significantly reducing observed biased patterns. Our code is publicly available at \url{https://github.com/healthylaife/FairAlignmentLLM}.
Related papers
- Using Large Language Models for Expert Prior Elicitation in Predictive Modelling [53.54623137152208]
This study proposes using large language models (LLMs) to elicit expert prior distributions for predictive models.
We compare LLM-elicited and uninformative priors, evaluate whether LLMs truthfully generate parameter distributions, and propose a model selection strategy for in-context learning and prior elicitation.
Our findings show that LLM-elicited prior parameter distributions significantly reduce predictive error compared to uninformative priors in low-data settings.
arXiv Detail & Related papers (2024-11-26T10:13:39Z) - How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making? [2.7476176772825904]
This research investigates the evaluation and mitigation of bias in Large Language Models (LLMs)
We introduce a novel Counterfactual Patient Variations (CPV) dataset derived from the JAMA Clinical Challenge.
Using this dataset, we built a framework for bias evaluation, employing both Multiple Choice Questions (MCQs) and corresponding explanations.
arXiv Detail & Related papers (2024-10-21T23:14:10Z) - Mitigating Hallucinations of Large Language Models in Medical Information Extraction via Contrastive Decoding [92.32881381717594]
We introduce ALternate Contrastive Decoding (ALCD) to solve hallucination issues in medical information extraction tasks.
ALCD demonstrates significant improvements in resolving hallucination issues compared to conventional decoding methods.
arXiv Detail & Related papers (2024-10-21T07:19:19Z) - Enabling Scalable Evaluation of Bias Patterns in Medical LLMs [2.089191490381739]
Large language models (LLMs) have shown impressive potential in helping with numerous medical challenges.
One major area of concern relates to biased behaviors of LLMs in medical applications, leading to unfair treatment of individuals.
We present a new method to scale up such bias evaluations by automatically generating test cases based on rigorous medical evidence.
arXiv Detail & Related papers (2024-10-18T14:17:03Z) - Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge [84.34545223897578]
Despite their excellence in many domains, potential issues are under-explored, undermining their reliability and the scope of their utility.
We identify 12 key potential biases and propose a new automated bias quantification framework-CALM- which quantifies and analyzes each type of bias in LLM-as-a-Judge.
Our work highlights the need for stakeholders to address these issues and remind users to exercise caution in LLM-as-a-Judge applications.
arXiv Detail & Related papers (2024-10-03T17:53:30Z) - A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet, they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z) - Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data [102.16105233826917]
Learning from preference labels plays a crucial role in fine-tuning large language models.
There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning.
arXiv Detail & Related papers (2024-04-22T17:20:18Z) - Large Language Model Distilling Medication Recommendation Model [61.89754499292561]
We harness the powerful semantic comprehension and input-agnostic characteristics of Large Language Models (LLMs)
Our research aims to transform existing medication recommendation methodologies using LLMs.
To mitigate this, we have developed a feature-level knowledge distillation technique, which transfers the LLM's proficiency to a more compact model.
arXiv Detail & Related papers (2024-02-05T08:25:22Z) - Improving Fairness in AI Models on Electronic Health Records: The Case
for Federated Learning Methods [0.0]
We show one possible approach to mitigate bias concerns by having healthcare institutions collaborate through a federated learning paradigm.
We propose a comprehensive FL approach with adversarial debiasing and a fair aggregation method, suitable to various fairness metrics.
Our method has achieved promising fairness performance with the lowest impact on overall discrimination performance (accuracy)
arXiv Detail & Related papers (2023-05-19T02:03:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.