Bridging the Trust Gap: Clinician-Validated Hybrid Explainable AI for Maternal Health Risk Assessment in Bangladesh
- URL: http://arxiv.org/abs/2601.07866v1
- Date: Sat, 10 Jan 2026 16:12:38 GMT
- Title: Bridging the Trust Gap: Clinician-Validated Hybrid Explainable AI for Maternal Health Risk Assessment in Bangladesh
- Authors: Farjana Yesmin, Nusrat Shirmin, Suraiya Shabnam Bristy,
- Abstract summary: This study presents a hybrid explainable AI framework combining ante-hoc fuzzy logic with post-hoc SHAP explanations. We developed a fuzzy-XGBoost model on 1,014 maternal health records, achieving 88.67% accuracy. A validation study with 14 healthcare professionals in Bangladesh revealed strong preference for hybrid explanations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While machine learning shows promise for maternal health risk prediction, clinical adoption in resource-constrained settings faces a critical barrier: lack of explainability and trust. This study presents a hybrid explainable AI (XAI) framework combining ante-hoc fuzzy logic with post-hoc SHAP explanations, validated through systematic clinician feedback. We developed a fuzzy-XGBoost model on 1,014 maternal health records, achieving 88.67% accuracy (ROC-AUC: 0.9703). A validation study with 14 healthcare professionals in Bangladesh revealed strong preference for hybrid explanations (71.4% across three clinical cases) with 54.8% expressing trust for clinical use. SHAP analysis identified healthcare access as the primary predictor, with the engineered fuzzy risk score ranking third, validating clinical knowledge integration (r=0.298). Clinicians valued integrated clinical parameters but identified critical gaps: obstetric history, gestational age, and connectivity barriers. This work demonstrates that combining interpretable fuzzy rules with feature importance explanations enhances both utility and trust, providing practical insights for XAI deployment in maternal healthcare.
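The pipeline the abstract describes engineers an interpretable fuzzy risk score as one input feature for a gradient-boosted classifier. A minimal sketch of such an ante-hoc fuzzy score is shown below; the membership thresholds, rule set, and weights are illustrative placeholders, not the paper's calibrated values, and the downstream XGBoost/SHAP stages are omitted.

```python
# Sketch of an ante-hoc fuzzy risk score, engineered as one feature
# for a downstream classifier (e.g. XGBoost). All thresholds and
# weights below are hypothetical, chosen only for illustration.

def triangular(x, a, b, c):
    """Triangular fuzzy membership on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_risk_score(systolic_bp, blood_sugar, age):
    """Aggregate rule activations into a [0, 1] risk score.

    Each rule is a fuzzy membership over one clinical parameter;
    the weighted average mimics a simple Sugeno-style aggregation.
    """
    high_bp = triangular(systolic_bp, 120, 160, 200)   # mmHg
    high_sugar = triangular(blood_sugar, 7.0, 12.0, 19.0)  # mmol/L
    older_age = triangular(age, 30, 45, 60)            # years
    weights = [0.5, 0.3, 0.2]  # illustrative rule weights
    activations = [high_bp, high_sugar, older_age]
    return sum(w * a for w, a in zip(weights, activations))

# Example record: moderately elevated BP and blood sugar.
score = fuzzy_risk_score(systolic_bp=150, blood_sugar=11.0, age=40)
```

The resulting `score` would then be appended to the raw feature vector before training, which is what lets a post-hoc SHAP analysis rank the fuzzy score against the original features, as the abstract reports.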
Related papers
- Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation [97.36081721024728]
We propose the first benchmark for assessing confidence in multi-turn interaction during realistic medical consultations. Our benchmark unifies three types of medical data for open-ended diagnostic generation. We present MedConf, an evidence-grounded linguistic self-assessment framework.
arXiv Detail & Related papers (2026-01-22T04:51:39Z)
- Knowledge-Guided Large Language Model for Automatic Pediatric Dental Record Understanding and Safe Antibiotic Recommendation [0.4779196219827507]
This study proposes a Knowledge-Guided Large Language Model (KG-LLM). It integrates a pediatric dental knowledge graph, retrieval-augmented generation (RAG), and a multi-stage safety validation pipeline for evidence-grounded antibiotic recommendation. Experiments on 32,000 de-identified pediatric dental visit records demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2025-12-09T21:11:55Z) - Beyond Traditional Diagnostics: Transforming Patient-Side Information into Predictive Insights with Knowledge Graphs and Prototypes [55.310195121276074]
We propose a Knowledge graph-enhanced, Prototype-aware, and Interpretable (KPI) framework to predict diseases.<n>It integrates structured and trusted medical knowledge into a unified disease knowledge graph, constructs clinically meaningful disease prototypes, and employs contrastive learning to enhance predictive accuracy.<n>It provides clinically valid explanations that closely align with patient narratives, highlighting its practical value for patient-centered healthcare delivery.
arXiv Detail & Related papers (2025-12-09T05:37:54Z)
- Diagnosing Hallucination Risk in AI Surgical Decision-Support: A Sequential Framework for Sequential Validation [5.469454486414467]
Large language models (LLMs) offer transformative potential for clinical decision support in spine surgery. LLMs pose significant risks through hallucinations, which are factually inconsistent or contextually misaligned outputs. This study introduces a clinician-centered framework to quantify hallucination risks by evaluating diagnostic precision, recommendation quality, reasoning robustness, output coherence, and knowledge alignment.
arXiv Detail & Related papers (2025-11-01T15:25:55Z)
- Medical priority fusion: achieving dual optimization of sensitivity and interpretability in NIPT anomaly detection [0.0]
Clinical machine learning faces a critical dilemma in high-stakes medical applications. This paradox becomes particularly acute in non-invasive prenatal testing (NIPT), where missed chromosomal abnormalities carry profound clinical consequences. We introduce Medical Priority Fusion (MPF), a constrained multi-objective optimization framework that resolves this fundamental trade-off.
arXiv Detail & Related papers (2025-09-22T15:49:20Z)
- Reducing Large Language Model Safety Risks in Women's Health using Semantic Entropy [29.14930590607661]
Large language models (LLMs) generate false or misleading outputs, known as hallucinations. Traditional methods for quantifying uncertainty, such as perplexity, fail to capture meaning-level inconsistencies that lead to misinformation. We evaluate semantic entropy (SE), a novel uncertainty metric, to detect hallucinations in AI-generated medical content.
arXiv Detail & Related papers (2025-03-01T00:57:52Z)
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language model (LLM) reasoning. Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- Closing the Gap in High-Risk Pregnancy Care Using Machine Learning and Human-AI Collaboration [8.36613277875556]
High-risk pregnancy is a pregnancy complicated by factors that can adversely affect the outcomes of the mother or the infant.
This work presents the implementation of a real-world ML-based system to assist care managers in identifying pregnant patients at risk of complications.
arXiv Detail & Related papers (2023-05-26T21:08:49Z)
- AutoTrial: Prompting Language Models for Clinical Trial Design [53.630479619856516]
We present a method named AutoTrial to aid the design of clinical eligibility criteria using language models.
Experiments on over 70K clinical trials verify that AutoTrial generates high-quality criteria texts.
arXiv Detail & Related papers (2023-05-19T01:04:16Z)
- Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patient's clinical state.
We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability.
Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.