Related papers: Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management

URL: http://arxiv.org/abs/2108.01764v1
Date: Tue, 3 Aug 2021 21:55:28 GMT
Title: Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management
Authors: Cécile Logé, Emily Ross, David Yaw Amoah Dadey, Saahil Jain, Adriel Saporta, Andrew Y. Ng, Pranav Rajpurkar
Abstract summary: We introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management. We propose a new, rigorous framework, including a sample experimental design, to measure the potential biases present when making treatment decisions.
Score: 5.044336341666555
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in Natural Language Processing (NLP), and specifically automated Question Answering (QA) systems, have demonstrated both impressive linguistic fluency and a pernicious tendency to reflect social biases. In this study, we introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management, one of the most challenging forms of clinical decision-making. Along with the dataset, we propose a new, rigorous framework, including a sample experimental design, to measure the potential biases present when making treatment decisions. We demonstrate its use by assessing two reference Question-Answering systems, GPT-2 and GPT-3, and find statistically significant differences in treatment between intersectional race-gender subgroups, thus reaffirming the risks posed by AI in medical settings, and the need for datasets like ours to ensure safety before medical AI applications are deployed.

Related papers

Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems [4.031787614742573]
This study systematically evaluates demographic biases within medical RAG pipelines across multiple QA benchmarks. We implement and compare several bias mitigation strategies to address identified biases, including Chain of Thought reasoning, Counterfactual filtering, Adversarial prompt refinement, and Majority Vote aggregation.
arXiv Detail & Related papers (2025-03-19T17:36:35Z)
Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis. We focus on the selective prediction approach, which allows the diagnosis system to abstain from providing a decision if it is not confident in the diagnosis. We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z)
Moving Beyond Medical Exam Questions: A Clinician-Annotated Dataset of Real-World Tasks and Ambiguity in Mental Healthcare [0.0545520830707066]
We present an expert-created and annotated dataset spanning five critical domains of decision-making in mental healthcare. This dataset is designed to capture the nuanced clinical reasoning and daily ambiguities mental health practitioners encounter.
arXiv Detail & Related papers (2025-02-22T03:10:16Z)
Give me Some Hard Questions: Synthetic Data Generation for Clinical QA [13.436187152293515]
This paper explores generating Clinical QA data using large language models (LLMs) in a zero-shot setting. We find that naive prompting often results in easy questions that do not reflect the complexity of clinical scenarios. Experiments on two Clinical QA datasets demonstrate that our method generates more challenging questions, significantly improving fine-tuning performance over baselines.
arXiv Detail & Related papers (2024-12-05T19:35:41Z)
RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions [3.182594503527438]
We present RealMedQA, a dataset of realistic clinical questions generated by humans and an LLM. We show that the LLM is more cost-efficient for generating "ideal" QA pairs.
arXiv Detail & Related papers (2024-08-16T09:32:43Z)
Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond [52.246494389096654]
This paper introduces Word-Sequence Entropy (WSE), a method that calibrates uncertainty at both the word and sequence levels. We compare WSE with six baseline methods on five free-form medical QA datasets, utilizing seven popular large language models (LLMs).
arXiv Detail & Related papers (2024-02-22T03:46:08Z)
XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering [1.1867812760085572]
We introduce a novel approach, XAIQA, for generating synthetic QA pairs at scale from data naturally available in electronic health records. Our method uses the idea of a classification model explainer to generate questions and answers about medical concepts corresponding to medical codes.
arXiv Detail & Related papers (2023-12-06T15:59:06Z)
An AI-Guided Data Centric Strategy to Detect and Mitigate Biases in Healthcare Datasets [32.25265709333831]
We develop a data-centric, model-agnostic, task-agnostic approach to evaluate dataset bias by investigating how easily different groups are learned at small sample sizes (AEquity). We then apply a systematic analysis of AEq values across subpopulations to identify and mitigate manifestations of racial bias in two known cases in healthcare. AEq is a novel and broadly applicable metric that can be applied to advance equity by diagnosing and remediating bias in healthcare datasets.
arXiv Detail & Related papers (2023-11-06T17:08:41Z)
Adaptive questionnaires for facilitating patient data entry in clinical decision support systems: Methods and application to STOPP/START v2 [1.8374319565577155]
We propose an original solution to simplify patient data entry using an adaptive questionnaire. Considering a rule-based decision support system, we designed methods for translating the system's clinical rules into display rules. We show that this reduces the number of clinical conditions displayed in the questionnaire by about two thirds.
arXiv Detail & Related papers (2023-09-19T07:59:13Z)
Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patient's clinical state. We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability. Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making. The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering [89.76059961309453]
The HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam. These questions are the most challenging for current QA systems. We present a Multi-step reasoning with Knowledge extraction framework (MurKe). We strive to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
Hemogram Data as a Tool for Decision-making in COVID-19 Management: Applications to Resource Scarcity Scenarios [62.997667081978825]
The COVID-19 pandemic has challenged emergency response systems worldwide, with widespread reports of essential services breaking down and health care structures collapsing. This work describes a machine learning model derived from hemogram exam data collected from symptomatic patients. The proposed models can predict COVID-19 qRT-PCR results in symptomatic individuals with high accuracy, sensitivity and specificity.
arXiv Detail & Related papers (2020-05-10T01:45:03Z)
Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis [142.90770786804507]
A medical diagnosis assistant (MDA) aims to build an interactive diagnostic agent that sequentially inquires about symptoms to discriminate between diseases. This work attempts to address these critical issues in MDA by taking advantage of the causal diagram. We propose a propensity-based patient simulator to effectively answer unrecorded inquiries by drawing knowledge from the other records.
arXiv Detail & Related papers (2020-03-14T02:05:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.