Machine Learning for Medicine Must Be Interpretable, Shareable, Reproducible and Accountable by Design
- URL: http://arxiv.org/abs/2508.16097v1
- Date: Fri, 22 Aug 2025 05:23:34 GMT
- Title: Machine Learning for Medicine Must Be Interpretable, Shareable, Reproducible and Accountable by Design
- Authors: Ayyüce Begüm Bektaş, Mithat Gönen
- Abstract summary: We argue that these principles should form the foundational design criteria for machine learning algorithms in medicine. We discuss how intrinsically interpretable modeling approaches can serve as powerful alternatives to opaque deep networks. We then examine accountability in model development, calling for rigorous evaluation, fairness, and uncertainty quantification.
- Score: 0.12891210250935148
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper claims that machine learning models deployed in high stakes domains such as medicine must be interpretable, shareable, reproducible and accountable. We argue that these principles should form the foundational design criteria for machine learning algorithms dealing with critical medical data, including survival analysis and risk prediction tasks. Black box models, while often highly accurate, struggle to gain trust and regulatory approval in health care due to a lack of transparency. We discuss how intrinsically interpretable modeling approaches (such as kernel methods with sparsity, prototype-based learning, and deep kernel models) can serve as powerful alternatives to opaque deep networks, providing insight into biomedical predictions. We then examine accountability in model development, calling for rigorous evaluation, fairness, and uncertainty quantification to ensure models reliably support clinical decisions. Finally, we explore how generative AI and collaborative learning paradigms (such as federated learning and diffusion-based data synthesis) enable reproducible research and cross-institutional integration of heterogeneous biomedical data without compromising privacy, hence shareability. By rethinking machine learning foundations along these axes, we can develop medical AI that is not only accurate but also transparent, trustworthy, and translatable to real-world clinical settings.
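To make the abstract's first axis concrete, the sketch below shows one way an intrinsically interpretable, prototype-style sparse kernel model of the kind the authors mention could look. It is a minimal toy illustration, not the authors' implementation: the scikit-learn pipeline, synthetic data, kernel width, and regularization strength are all assumptions.

```python
# Illustrative sketch only: a sparse kernel risk model whose L1 penalty keeps
# a small set of "prototype" patients; predictions are driven by a new
# patient's similarity to those prototypes. Data and hyperparameters are toy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))                    # toy patient features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)     # toy binary outcome

K = rbf_kernel(X, X, gamma=0.05)                  # similarity of every patient to every other
model = LogisticRegression(penalty="l1", C=0.5, solver="liblinear")
model.fit(K, y)                                   # one coefficient per candidate prototype

prototypes = np.flatnonzero(model.coef_[0])       # training patients the sparse model actually uses
print(f"{len(prototypes)} prototype patients retained out of {len(X)}")

# A new patient's risk is read off from their kernel similarity to the prototypes.
x_new = rng.normal(size=(1, 30))
risk = model.predict_proba(rbf_kernel(x_new, X, gamma=0.05))[:, 1]
print(f"predicted risk: {risk[0]:.3f}")
```

Because the surviving coefficients are attached to individual training patients, a prediction can be explained by pointing to the retained prototypes and the new case's similarity to them, which is the kind of transparency the abstract argues black-box deep networks lack.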
Related papers
- Unlocking Biomedical Insights: Hierarchical Attention Networks for High-Dimensional Data Interpretation [0.3821469577674901]
Hierarchical Attention-based Interpretable Network (HAIN) is a novel architecture that unifies multi-level attention mechanisms, dimensionality reduction, and explanation-driven loss functions. Comprehensive evaluation on The Cancer Genome Atlas dataset demonstrates that HAIN achieves a classification accuracy of 94.3%. HAIN effectively identifies biologically relevant cancer biomarkers, supporting its utility for clinical and research applications.
arXiv Detail & Related papers (2025-10-21T20:08:50Z) - Interpretable Clinical Classification with Kolmogorov-Arnold Networks [70.72819760172744]
Kolmogorov-Arnold Networks (KANs) offer intrinsic interpretability through transparent, symbolic representations. KANs support built-in patient-level insights, intuitive visualizations, and nearest-patient retrieval. These results position KANs as a promising step toward trustworthy AI that clinicians can understand, audit, and act upon.
arXiv Detail & Related papers (2025-09-20T17:21:58Z) - The challenge of uncertainty quantification of large language models in medicine [0.0]
This study investigates uncertainty quantification in large language models (LLMs) for medical applications. Our research frames uncertainty not as a barrier but as an essential part of knowledge that invites a dynamic and reflective approach to AI design.
arXiv Detail & Related papers (2025-04-07T17:24:11Z) - Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z) - Bayesian Kolmogorov Arnold Networks (Bayesian_KANs): A Probabilistic Approach to Enhance Accuracy and Interpretability [1.90365714903665]
This study presents a novel framework called Bayesian Kolmogorov Arnold Networks (BKANs).
BKANs combine the expressive capacity of Kolmogorov Arnold Networks with Bayesian inference.
Our method provides useful insights into prediction confidence and decision boundaries and outperforms traditional deep learning models in terms of prediction accuracy.
arXiv Detail & Related papers (2024-08-05T10:38:34Z) - MedISure: Towards Assuring Machine Learning-based Medical Image Classifiers using Mixup Boundary Analysis [3.1256597361013725]
Machine learning (ML) models are becoming integral in healthcare technologies.
Traditional software assurance techniques rely on fixed code and do not directly apply to ML models.
We present a novel technique called Mix-Up Boundary Analysis (MUBA) that facilitates evaluating image classifiers in terms of prediction fairness.
arXiv Detail & Related papers (2023-11-23T12:47:43Z) - COVID-Net Biochem: An Explainability-driven Framework to Building Machine Learning Models for Predicting Survival and Kidney Injury of COVID-19 Patients from Clinical and Biochemistry Data [66.43957431843324]
We introduce COVID-Net Biochem, a versatile and explainable framework for constructing machine learning models.
We apply this framework to predict COVID-19 patient survival and the likelihood of developing Acute Kidney Injury during hospitalization.
arXiv Detail & Related papers (2022-04-24T07:38:37Z) - Convolutional Motif Kernel Networks [1.104960878651584]
We show that our model learns robustly on small datasets and reaches state-of-the-art performance on relevant healthcare prediction tasks.
Our proposed method can be utilized on DNA and protein sequences.
arXiv Detail & Related papers (2021-11-03T15:06:09Z) - The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z) - Estimating and Improving Fairness with Adversarial Learning [65.99330614802388]
We propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.
Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model (a minimal sketch of this kind of adversarial setup is given after this list).
We evaluate our framework on a large-scale, publicly available skin lesion dataset.
arXiv Detail & Related papers (2021-03-07T03:10:32Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Key Technology Considerations in Developing and Deploying Machine Learning Models in Clinical Radiology Practice [0.0]
We propose a list of key considerations that machine learning researchers must recognize and address to make their models accurate, robust, and usable in practice.
Namely, we discuss: insufficient training data, decentralized datasets, high cost of annotations, ambiguous ground truth, imbalance in class representation, asymmetric misclassification costs, relevant performance metrics, generalization of models to unseen datasets, model decay, adversarial attacks, explainability, fairness and bias, and clinical validation.
arXiv Detail & Related papers (2021-02-03T09:53:43Z)
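Referring back to the adversarial fairness entry above ("Estimating and Improving Fairness with Adversarial Learning"), the following is a minimal sketch of one common way such a setup can be built: a bias "discrimination" head tries to recover a protected attribute from the classifier's shared representation, and the encoder is penalized whenever it succeeds. This is not the authors' code; PyTorch, the toy tensors, the layer sizes, and the 0.5 adversarial weight are all assumptions.

```python
# Toy adversarial debiasing sketch: the encoder and classifier are trained to
# predict the label while making the protected attribute hard to recover.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
clf_head = nn.Linear(16, 2)    # disease prediction head
bias_head = nn.Linear(16, 2)   # tries to predict the protected attribute

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(clf_head.parameters()), lr=1e-3)
opt_bias = torch.optim.Adam(bias_head.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(256, 32)            # toy features
y = torch.randint(0, 2, (256,))     # toy labels
a = torch.randint(0, 2, (256,))     # toy protected attribute

for step in range(200):
    # 1) update the bias head so it predicts the protected attribute from frozen features
    z = encoder(x).detach()
    opt_bias.zero_grad()
    bias_loss = ce(bias_head(z), a)
    bias_loss.backward()
    opt_bias.step()

    # 2) update encoder + classifier: accurate labels, uninformative about `a`
    z = encoder(x)
    opt_main.zero_grad()
    loss = ce(clf_head(z), y) - 0.5 * ce(bias_head(z), a)
    loss.backward()
    opt_main.step()
```

In practice the adversarial weight and the choice of representation layer matter considerably, and the paper's actual discrimination and critical modules may differ from this generic formulation.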