Rethinking Machine Learning Model Evaluation in Pathology
- URL: http://arxiv.org/abs/2204.05205v1
- Date: Mon, 11 Apr 2022 15:49:12 GMT
- Title: Rethinking Machine Learning Model Evaluation in Pathology
- Authors: Syed Ashar Javed, Dinkar Juyal, Zahil Shanis, Shreya Chakraborty, Harsha Pokkalla, Aaditya Prakash
- Abstract summary: We propose a set of practical guidelines for Machine Learning evaluation in pathology.
The paper includes measures for setting up the evaluation framework, effectively dealing with variability in labels, and a recommended suite of tests.
We hope that the proposed framework will bridge the gap between ML researchers and domain experts, leading to wider adoption of ML techniques in pathology.
- Score: 3.0575251867964153
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine Learning has been applied to pathology images in research and
clinical practice with promising outcomes. However, standard ML models often
lack the rigorous evaluation required for clinical decisions. Machine learning
techniques for natural images are ill-equipped to deal with pathology images
that are significantly large and noisy, require expensive labeling, are hard to
interpret, and are susceptible to spurious correlations. We propose a set of
practical guidelines for ML evaluation in pathology that address the above
concerns. The paper includes measures for setting up the evaluation framework,
effectively dealing with variability in labels, and a recommended suite of
tests to address issues related to domain shift, robustness, and confounding
variables. We hope that the proposed framework will bridge the gap between ML
researchers and domain experts, leading to wider adoption of ML techniques in
pathology and improving patient outcomes.
Related papers
- Stronger Baseline Models -- A Key Requirement for Aligning Machine Learning Research with Clinical Utility [0.0]
Well-known barriers exist when attempting to deploy Machine Learning models in high-stakes, clinical settings.
We show empirically that including stronger baseline models in evaluations has important downstream effects.
We propose some best practices that will enable practitioners to more effectively study and deploy ML models in clinical settings.
arXiv Detail & Related papers (2024-09-18T16:38:37Z)
- MedISure: Towards Assuring Machine Learning-based Medical Image Classifiers using Mixup Boundary Analysis [3.1256597361013725]
Machine learning (ML) models are becoming integral in healthcare technologies.
Traditional software assurance techniques rely on fixed code and do not directly apply to ML models.
We present a novel technique called Mix-Up Boundary Analysis (MUBA) that facilitates evaluating image classifiers in terms of prediction fairness.
arXiv Detail & Related papers (2023-11-23T12:47:43Z)
- Data Augmentation-Based Unsupervised Domain Adaptation In Medical Imaging [0.709016563801433]
We propose an unsupervised method for robust domain adaptation in brain MRI segmentation by leveraging MRI-specific augmentation techniques.
The results show that our proposed approach achieves high accuracy, exhibits broad applicability, and showcases remarkable robustness against domain shift in various tasks.
arXiv Detail & Related papers (2023-08-08T17:00:11Z)
- Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing [62.9062883851246]
Machine learning holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities.
One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data.
Using multi-task learning, we propose the first method to assess and mitigate shortcut learning as a part of the fairness assessment of clinical ML systems.
arXiv Detail & Related papers (2022-07-21T09:35:38Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- REET: Robustness Evaluation and Enhancement Toolbox for Computational Pathology [1.452875650827562]
We propose the first domain-specific Robustness Evaluation and Enhancement Toolbox (REET) for computational pathology applications.
REET provides a suite of algorithmic strategies for enabling robustness assessment of predictive models.
REET also enables efficient and robust training of deep learning pipelines in computational pathology.
arXiv Detail & Related papers (2022-01-28T18:23:55Z)
- Demystifying Deep Learning Models for Retinal OCT Disease Classification using Explainable AI [0.6117371161379209]
The adoption of various deep learning techniques is common and effective, and this holds equally for retinal Optical Coherence Tomography (OCT).
These techniques have black-box characteristics that prevent medical professionals from completely trusting the results they generate.
This paper proposes a self-developed CNN model that is comparatively smaller and simpler, along with the use of LIME to bring Explainable AI to the study.
arXiv Detail & Related papers (2021-11-06T13:54:07Z)
- Learning Binary Semantic Embedding for Histology Image Classification and Retrieval [56.34863511025423]
We propose a novel method for Learning Binary Semantic Embedding (LBSE).
Based on the efficient and effective embedding, classification and retrieval are performed to provide interpretable computer-assisted diagnosis for histology images.
Experiments conducted on three benchmark datasets validate the superiority of LBSE under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:36:44Z)
- Collaborative Unsupervised Domain Adaptation for Medical Image Diagnosis [102.40869566439514]
We seek to exploit rich labeled data from relevant domains to help the learning in the target task via Unsupervised Domain Adaptation (UDA).
Unlike most UDA methods that rely on clean labeled data or assume samples are equally transferable, we innovatively propose a Collaborative Unsupervised Domain Adaptation algorithm.
We theoretically analyze the generalization performance of the proposed method, and also empirically evaluate it on both medical and general images.
arXiv Detail & Related papers (2020-07-05T11:49:17Z)
- Semi-supervised Medical Image Classification with Relation-driven Self-ensembling Model [71.80319052891817]
We present a relation-driven semi-supervised framework for medical image classification.
It exploits the unlabeled data by encouraging the prediction consistency of given input under perturbations.
Our method outperforms many state-of-the-art semi-supervised learning methods on both single-label and multi-label image classification scenarios.
arXiv Detail & Related papers (2020-05-15T06:57:54Z)
- Weakly supervised multiple instance learning histopathological tumor segmentation [51.085268272912415]
We propose a weakly supervised framework for whole slide imaging segmentation.
We exploit a multiple instance learning scheme for training models.
The proposed framework has been evaluated on multi-locations and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset.
arXiv Detail & Related papers (2020-04-10T13:12:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.