Detecting Spurious Correlations with Sanity Tests for Artificial
Intelligence Guided Radiology Systems
- URL: http://arxiv.org/abs/2103.03048v1
- Date: Thu, 4 Mar 2021 14:14:05 GMT
- Title: Detecting Spurious Correlations with Sanity Tests for Artificial
Intelligence Guided Radiology Systems
- Authors: Usman Mahmood, Robik Shrestha, David D.B. Bates, Lorenzo Mannelli,
Giuseppe Corrias, Yusuf Erdi, Christopher Kanan
- Abstract summary: A critical component to deploying AI in radiology is to gain confidence in a developed system's efficacy and safety.
The current gold standard approach is to conduct an analytical validation of performance on a generalization dataset.
We describe a series of sanity tests to identify when a system performs well on development data for the wrong reasons.
- Score: 22.249702822013045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial intelligence (AI) has been successful at solving numerous problems
in machine perception. In radiology, AI systems are rapidly evolving and show
progress in guiding treatment decisions, diagnosing, localizing disease on
medical images, and improving radiologists' efficiency. A critical component to
deploying AI in radiology is to gain confidence in a developed system's
efficacy and safety. The current gold standard approach is to conduct an
analytical validation of performance on a generalization dataset from one or
more institutions, followed by a clinical validation study of the system's
efficacy during deployment. Clinical validation studies are time-consuming, and
best practices dictate limited re-use of analytical validation data, so it is
ideal to know ahead of time if a system is likely to fail analytical or
clinical validation. In this paper, we describe a series of sanity tests to
identify when a system performs well on development data for the wrong reasons.
We illustrate the sanity tests' value by designing a deep learning system to
classify pancreatic cancer seen in computed tomography scans.
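The abstract does not spell out the individual sanity tests, but a common test of this kind checks whether a model's performance survives removal of a suspected confound. The sketch below is a minimal, hypothetical illustration on synthetic data: feature 0 stands in for a spurious signal (e.g., which scanner acquired the image) that happens to track the label in the development set. All names and numbers are invented for illustration, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "development" data: 500 cases, 10 image-derived features.
# Feature 0 is a hypothetical confound (e.g., scanner identity) that
# is spuriously correlated with the cancer label.
n = 500
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 10))
X[:, 0] += 2.0 * y  # the confound tracks the label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
dev_acc = clf.score(X_te, y_te)

# Sanity test: neutralize the suspect feature at evaluation time.
# A large performance drop suggests the model leans on the confound
# rather than on disease-related signal.
X_te_masked = X_te.copy()
X_te_masked[:, 0] = X_tr[:, 0].mean()
masked_acc = clf.score(X_te_masked, y_te)

print(f"dev accuracy:    {dev_acc:.2f}")
print(f"masked accuracy: {masked_acc:.2f}")
```

Here the accuracy collapses toward chance once the confound is masked, which is exactly the failure mode such a test is designed to surface before analytical or clinical validation.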
Related papers
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecules, disease codes, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z) - A Survey of Artificial Intelligence in Gait-Based Neurodegenerative Disease Diagnosis [51.07114445705692]
Neurodegenerative diseases (NDs) traditionally require extensive healthcare resources and human effort for medical diagnosis and monitoring.
As a crucial disease-related motor symptom, human gait can be exploited to characterize different NDs.
Current advances in artificial intelligence (AI) models enable automatic gait analysis for ND identification and classification.
arXiv Detail & Related papers (2024-05-21T06:44:40Z) - CopilotCAD: Empowering Radiologists with Report Completion Models and Quantitative Evidence from Medical Image Foundation Models [3.8940162151291804]
This study introduces an innovative paradigm to create an assistive co-pilot system for empowering radiologists.
We develop a collaborative framework to integrate Large Language Models (LLMs) and medical image analysis tools.
arXiv Detail & Related papers (2024-04-11T01:33:45Z) - The Limits of Perception: Analyzing Inconsistencies in Saliency Maps in XAI [0.0]
Explainable artificial intelligence (XAI) plays an indispensable role in demystifying the decision-making processes of AI.
Because these models operate as "black boxes," with their reasoning obscured and inaccessible, there is an increased risk of misdiagnosis.
This shift towards transparency is not just beneficial -- it's a critical step towards responsible AI integration in healthcare.
arXiv Detail & Related papers (2024-03-23T02:15:23Z) - Detecting algorithmic bias in medical-AI models using trees [7.939586935057782]
This paper presents an innovative framework for detecting areas of algorithmic bias in medical-AI decision support systems.
Our approach efficiently identifies potential biases in medical-AI models, specifically in the context of sepsis prediction.
arXiv Detail & Related papers (2023-12-05T18:47:34Z) - Explainable AI in Diagnosing and Anticipating Leukemia Using Transfer
Learning Method [0.0]
This research paper focuses on Acute Lymphoblastic Leukemia (ALL), a form of blood cancer prevalent in children and teenagers.
It proposes an automated detection approach using computer-aided diagnostic (CAD) models, leveraging deep learning techniques.
The proposed method achieved an impressive 98.38% accuracy, outperforming other tested models.
arXiv Detail & Related papers (2023-12-01T10:37:02Z) - Recent advancement in Disease Diagnostic using machine learning:
Systematic survey of decades, comparisons, and challenges [0.0]
Pattern recognition and machine learning in the biomedical area promise to increase the precision of disease detection and diagnosis.
This review article examines machine-learning algorithms for detecting diseases, including hepatitis, diabetes, liver disease, dengue fever, and heart disease.
arXiv Detail & Related papers (2023-07-31T16:35:35Z) - Deep Reinforcement Learning Framework for Thoracic Diseases
Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a major challenge to accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIH ChestX-ray14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z) - Detecting Shortcut Learning for Fair Medical AI using Shortcut Testing [62.9062883851246]
Machine learning holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities.
One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data.
Using multi-task learning, we propose the first method to assess and mitigate shortcut learning as a part of the fairness assessment of clinical ML systems.
arXiv Detail & Related papers (2022-07-21T09:35:38Z) - NeuralSympCheck: A Symptom Checking and Disease Diagnostic Neural Model
with Logic Regularization [59.15047491202254]
Symptom-checking systems ask users about their symptoms and perform a rapid and affordable medical assessment of their condition.
We propose a new approach based on the supervised learning of neural models with logic regularization.
Our experiments show that the proposed approach outperforms the best existing methods in the accuracy of diagnosis when the number of diagnoses and symptoms is large.
arXiv Detail & Related papers (2022-06-02T07:57:17Z) - Inheritance-guided Hierarchical Assignment for Clinical Automatic
Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
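Several of the related papers above, notably the shortcut-testing work, ask whether a model's learned representation encodes attributes it should not rely on. A minimal, generic way to probe this is to train a simple classifier to recover a sensitive attribute from the model's features; high decodability, combined with an attribute-label correlation in the training data, is a red flag for shortcut learning. The sketch below uses synthetic stand-ins for the features and attribute (it is an illustrative probe, not the multi-task method proposed in that paper).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

# Hypothetical penultimate-layer features from a diagnostic model,
# plus a sensitive attribute (e.g., patient sex) recorded per case.
n = 600
attribute = rng.integers(0, 2, size=n)
features = rng.normal(size=(n, 16))
features[:, :4] += 1.5 * attribute[:, None]  # attribute leaks into features

# Probe: if a linear classifier can recover the attribute from the
# representation, the model *can* exploit it as a shortcut.
probe_auc = cross_val_score(
    LogisticRegression(), features, attribute, cv=5, scoring="roc_auc"
).mean()
print(f"attribute decodability (AUC): {probe_auc:.2f}")
```

An AUC near 0.5 would indicate the attribute is not linearly recoverable from the features; values well above 0.5, as here, warrant the kind of fairness follow-up these papers describe.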
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.