Beyond Perfect Scores: Proof-by-Contradiction for Trustworthy Machine Learning
- URL: http://arxiv.org/abs/2601.06704v1
- Date: Sat, 10 Jan 2026 22:08:14 GMT
- Title: Beyond Perfect Scores: Proof-by-Contradiction for Trustworthy Machine Learning
- Authors: Dushan N. Wadduwage, Dineth Jayakody, Leonidas Zimianitis,
- Abstract summary: It is often unclear whether a model relies on true clinical cues or on spurious correlations in the data. This paper introduces a simple yet broadly applicable trustworthiness test grounded in proof-by-contradiction. Our approach trains and tests on spurious labels carefully permuted based on a potential outcomes framework.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models show strong promise for new biomedical prediction tasks, but concerns about trustworthiness have hindered their clinical adoption. In particular, it is often unclear whether a model relies on true clinical cues or on spurious hierarchical correlations in the data. This paper introduces a simple yet broadly applicable trustworthiness test grounded in stochastic proof-by-contradiction. Instead of just showing high test performance, our approach trains and tests on spurious labels carefully permuted based on a potential outcomes framework. A truly trustworthy model should fail under such label permutation; comparable accuracy across real and permuted labels indicates overfitting, shortcut learning, or data leakage. Our approach quantifies this behavior through interpretable Fisher-style p-values, which are well understood by domain experts across the medical and life sciences. We evaluate our approach on multiple new bacterial diagnostics, separating tasks and models that learn genuine causal relationships from those driven by dataset artifacts or statistical coincidences. Our work establishes a foundation for building rigor and trust between the ML and life-science research communities, moving ML models one step closer to clinical adoption.
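To make the test concrete, here is a minimal sketch under simplifying assumptions: a generic scikit-learn classifier, uniformly random label permutations (the paper permutes labels under a potential-outcomes framework, not uniformly at random), and held-out accuracy as the test statistic. Helper names are illustrative, not from the paper.

```python
# A minimal sketch of a permutation-based trustworthiness test, assuming a
# generic classifier and uniformly random permutations; the paper's actual
# permutation scheme is based on a potential-outcomes framework.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def held_out_accuracy(X, y, seed=0):
    """Train on one split, score on the other."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    return LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)

def permutation_p_value(X, y, n_perm=100, seed=0):
    rng = np.random.default_rng(seed)
    real_acc = held_out_accuracy(X, y)
    # Null distribution: how well the pipeline does when labels carry no signal.
    null_accs = [held_out_accuracy(X, rng.permutation(y)) for _ in range(n_perm)]
    # Fisher-style p-value with the usual +1 correction.
    p_value = (1 + sum(a >= real_acc for a in null_accs)) / (n_perm + 1)
    return real_acc, p_value
```

Read in the paper's proof-by-contradiction terms: a small p-value is evidence of genuine signal, while permuted-label accuracy comparable to the real one (a large p-value) points to leakage, shortcuts, or overfitting.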
Related papers
- Towards a Certificate of Trust: Task-Aware OOD Detection for Scientific AI [18.927559053107842]
We propose a new OOD detection method based on estimating joint likelihoods using a score-based diffusion model. This approach considers not just the input but also the regression model's prediction, providing a task-aware reliability score. Our work provides a foundational step towards building a 'certificate of trust', thereby offering a practical tool for assessing the trustworthiness of AI-based predictions.
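The summary gives no implementation details; as an illustration only, a task-aware score of this flavor might evaluate the joint (input, prediction) pair under a density model. Here `density_model.log_prob` and `regressor.predict` are hypothetical stand-ins, not the paper's API.

```python
# Illustration only: score the (input, prediction) pair *jointly* rather than
# the input alone; both model interfaces below are hypothetical stand-ins for
# the paper's score-based diffusion model and regression model.
import numpy as np

def task_aware_ood_score(density_model, regressor, x):
    y_hat = regressor.predict(x[None, :])[0]           # the model's own prediction
    joint = np.concatenate([x, np.atleast_1d(y_hat)])  # treat (x, y_hat) as one vector
    return -density_model.log_prob(joint)              # low likelihood => high OOD score
```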
arXiv Detail & Related papers (2025-09-29T17:21:25Z) - Label Uncertainty for Ultrasound Segmentation [25.682215047694168]
In medical imaging, inter-observer variability among radiologists often introduces label uncertainty. We introduce a novel approach to both labeling and training AI models using expert-supplied, per-pixel confidence values.
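The abstract does not specify the training objective; a hedged sketch of one natural way to use per-pixel confidence values (not necessarily the paper's method) is to weight the per-pixel loss:

```python
# A hedged sketch, not the paper's method: weight a per-pixel segmentation loss
# by expert-supplied confidence so uncertain pixels contribute less.
import torch
import torch.nn.functional as F

def confidence_weighted_bce(logits, labels, confidence):
    """logits, labels (float 0/1), confidence: (B, H, W) tensors; confidence in [0, 1]."""
    per_pixel = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    return (confidence * per_pixel).sum() / confidence.sum().clamp(min=1e-8)
```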
arXiv Detail & Related papers (2025-08-21T15:00:21Z) - Demystifying amortized causal discovery with transformers [21.058343547918053]
Supervised learning approaches for causal discovery from observational data often achieve competitive performance. In this work, we investigate CSIvA, a transformer-based model that promises to train on synthetic data and transfer to real data. We bridge the gap with existing identifiability theory and show that constraints on the training data distribution implicitly define a prior on the test observations.
arXiv Detail & Related papers (2024-05-27T08:17:49Z) - Inadequacy of common stochastic neural networks for reliable clinical decision support [0.4262974002462632]
Widespread adoption of AI for medical decision making is still hindered by ethical and safety-related concerns.
Common deep learning approaches, however, tend towards overconfidence under data shift.
This study investigates their actual reliability in clinical applications.
arXiv Detail & Related papers (2024-01-24T18:49:30Z) - Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals [91.59906995214209]
We propose a new evaluation method, the Counterfactual Attentiveness Test (CAT).
CAT uses counterfactuals by replacing part of the input with its counterpart from a different example, expecting an attentive model to change its prediction.
We show that GPT3 becomes less attentive with an increased number of demonstrations, while its accuracy on the test data improves.
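As a rough illustration of the CAT idea (not the paper's exact protocol), for a two-part input one can swap in a counterpart from another example and measure how often the prediction changes; `model.predict` is a hypothetical single-example interface:

```python
# A rough sketch of a CAT-style check for (premise, hypothesis)-style inputs;
# `model.predict` is a hypothetical interface, not from the paper.
def cat_flip_rate(model, examples):
    """examples: list of (premise, hypothesis) pairs."""
    flips = 0
    for i, (premise, hypothesis) in enumerate(examples):
        original = model.predict(premise, hypothesis)
        # Counterfactual: swap in the premise from a *different* example.
        other_premise = examples[(i + 1) % len(examples)][0]
        flips += model.predict(other_premise, hypothesis) != original
    return flips / len(examples)  # an attentive model should flip often
```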
arXiv Detail & Related papers (2023-11-16T06:27:35Z) - MELEP: A Novel Predictive Measure of Transferability in Multi-Label ECG Diagnosis [1.3654846342364306]
We introduce MELEP, a measure designed to estimate the effectiveness of knowledge transfer from a pre-trained model to a downstream ECG diagnosis task.
Our experiments show that MELEP can predict the performance of pre-trained convolutional and recurrent deep neural networks, on small and imbalanced ECG data.
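The summary does not give MELEP's formula; for intuition, here is a sketch of the single-label LEEP score (Nguyen et al., 2020) that MELEP extends to the multi-label setting, i.e. the log expected empirical prediction of downstream labels from the pre-trained model's soft outputs. The multi-label extension itself is omitted.

```python
# Sketch of LEEP, the single-label transferability score MELEP generalizes;
# MELEP's multi-label details are not shown here.
import numpy as np

def leep(source_probs, target_labels, n_target_classes):
    """source_probs: (N, Z) soft outputs of the pre-trained model; target_labels: (N,) ints."""
    n, n_source = source_probs.shape
    joint = np.zeros((n_target_classes, n_source))
    for probs, y in zip(source_probs, target_labels):
        joint[y] += probs / n                           # empirical joint P(y, z)
    cond = joint / joint.sum(axis=0, keepdims=True)     # conditional P(y | z)
    # Average log-likelihood of target labels under the composed predictor.
    return np.mean([np.log(cond[y] @ probs) for probs, y in zip(source_probs, target_labels)])
```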
arXiv Detail & Related papers (2023-10-27T14:57:10Z) - Explicit Tradeoffs between Adversarial and Natural Distributional Robustness [48.44639585732391]
In practice, models need to enjoy both types of robustness to ensure reliability.
In this work, we show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness.
arXiv Detail & Related papers (2022-09-15T19:58:01Z) - Conformal Prediction Under Feedback Covariate Shift for Biomolecular Design [56.86533144730384]
We introduce a method to quantify predictive uncertainty in settings where the training and test data are statistically dependent. As a motivating use case, we demonstrate with several real data sets how our method quantifies uncertainty for the predicted fitness of designed proteins.
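For context, a sketch of standard split conformal regression, the exchangeability-based baseline this paper adapts; the feedback-covariate-shift correction itself (a reweighting of the scores) is omitted:

```python
# Standard split conformal regression for context; the paper's setting, where
# test inputs are chosen *because of* the trained model, breaks the plain
# exchangeability argument and requires a reweighting not shown here.
import numpy as np

def conformal_interval(predict, X_cal, y_cal, x_test, alpha=0.1):
    scores = np.sort(np.abs(y_cal - predict(X_cal)))     # calibration residuals
    k = int(np.ceil((1 - alpha) * (len(scores) + 1)))    # finite-sample quantile index
    q = np.inf if k > len(scores) else scores[k - 1]
    y_hat = predict(np.asarray([x_test]))[0]
    return y_hat - q, y_hat + q                          # ~(1 - alpha) coverage
```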
arXiv Detail & Related papers (2022-02-08T02:59:12Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence from labeled source data and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
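A minimal sketch of ATC as summarized above, assuming max-softmax confidences as the score:

```python
# ATC sketch: choose a threshold on source confidences so that the fraction
# above it matches source accuracy, then apply it to unlabeled target data.
import numpy as np

def atc_predict_accuracy(src_conf, src_correct, tgt_conf):
    """src_conf/tgt_conf: per-example confidences; src_correct: 0/1 correctness flags."""
    src_acc = src_correct.mean()
    t = np.quantile(src_conf, 1.0 - src_acc)  # fraction of src_conf above t ~= src_acc
    return (tgt_conf > t).mean()              # predicted target accuracy
```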
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Reliable and Trustworthy Machine Learning for Health Using Dataset Shift Detection [7.263558963357268]
Unpredictable ML model behavior on unseen data, especially in the health domain, raises serious concerns about its safety.
We show that Mahalanobis-distance- and Gram-matrix-based out-of-distribution detection methods are able to detect out-of-distribution data with high accuracy.
We then translate the out-of-distribution score into a human-interpretable confidence score to investigate its effect on the users' interaction with health ML applications.
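A minimal sketch of the Mahalanobis-distance half of this recipe on feature vectors (the classic per-class means with a tied covariance); the Gram-matrix detector is omitted:

```python
# Mahalanobis OOD scoring on (e.g., penultimate-layer) features: per-class
# means with one shared covariance; larger distance => more likely OOD.
import numpy as np

def fit_class_gaussians(feats, labels):
    means = {c: feats[labels == c].mean(axis=0) for c in np.unique(labels)}
    centered = np.concatenate([feats[labels == c] - means[c] for c in means])
    precision = np.linalg.pinv(centered.T @ centered / len(feats))
    return means, precision

def ood_score(x, means, precision):
    # Squared Mahalanobis distance to the closest class Gaussian.
    return min(float((x - m) @ precision @ (x - m)) for m in means.values())
```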
arXiv Detail & Related papers (2021-10-26T20:49:01Z) - Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
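The relation-driven component is not detailed in the summary; a hedged sketch of the underlying consistency objective (prediction agreement under input perturbation) might look like this:

```python
# Generic consistency objective for unlabeled data: agree with your own
# prediction under perturbation; the paper's relation-driven term is an
# extension not shown here.
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, noise_std=0.1):
    with torch.no_grad():
        target = F.softmax(model(x_unlabeled), dim=1)            # "teacher" pass
    noisy = x_unlabeled + noise_std * torch.randn_like(x_unlabeled)
    student_log = F.log_softmax(model(noisy), dim=1)
    return F.kl_div(student_log, target, reduction="batchmean")  # penalize disagreement
```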
arXiv Detail & Related papers (2020-05-15T06:57:54Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
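A hedged sketch of confidence-weighted transductive prototype refinement; here the confidence is a simple softmax over distances, standing in for the paper's meta-learned confidence function:

```python
# Confidence-weighted prototype update in a prototypical-network setting; the
# softmax-over-distances confidence below is a stand-in, since the paper
# meta-learns the confidence function itself.
import torch

def refine_prototypes(prototypes, query_feats, temperature=1.0):
    """prototypes: (C, D); query_feats: (Q, D)."""
    dists = torch.cdist(query_feats, prototypes)       # (Q, C) Euclidean distances
    conf = torch.softmax(-dists / temperature, dim=1)  # soft class assignments
    weight_sums = conf.sum(dim=0, keepdim=True).t().clamp(min=1e-8)  # (C, 1)
    return (conf.t() @ query_feats) / weight_sums      # weighted means per class
```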
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.