Multivariate Analysis on Performance Gaps of Artificial Intelligence
Models in Screening Mammography
- URL: http://arxiv.org/abs/2305.04422v3
- Date: Thu, 19 Oct 2023 18:03:11 GMT
- Title: Multivariate Analysis on Performance Gaps of Artificial Intelligence
Models in Screening Mammography
- Authors: Linglin Zhang, Beatrice Brown-Mulry, Vineela Nalla, InChan Hwang, Judy
Wawira Gichoya, Aimilia Gastounioti, Imon Banerjee, Laleh Seyyed-Kalantari,
MinJae Woo, Hari Trivedi
- Abstract summary: Deep learning models for abnormality classification can perform well in screening mammography.
The demographic, imaging, and clinical characteristics associated with increased risk of model failure remain unclear.
We assessed model performance by subgroups defined by age, race, pathologic outcome, tissue density, and imaging characteristics.
- Score: 4.123006816939975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep learning models for abnormality classification can perform well
in screening mammography, the demographic, imaging, and clinical
characteristics associated with increased risk of model failure remain unclear.
This retrospective study uses the Emory BrEast Imaging Dataset (EMBED),
which contains mammograms from 115,931 patients imaged at Emory Healthcare between
2013 and 2020, with BI-RADS assessments, region-of-interest coordinates for
abnormalities, imaging features, pathologic outcomes, and patient demographics.
Multiple deep learning models were trained to distinguish between abnormal
tissue patches and randomly selected normal tissue patches from screening
mammograms. We assessed model performance by subgroups defined by age, race,
pathologic outcome, tissue density, and imaging characteristics and
investigated their associations with false negatives (FN) and false positives
(FP). We also performed multivariate logistic regression to control for
confounding between subgroups. The top-performing model, ResNet152V2, achieved
an accuracy of 92.6% (95% CI = 92.0-93.2%) and an AUC of 0.975 (95% CI = 0.972-0.978). Before
controlling for confounding, nearly all subgroups showed statistically
significant differences in model performance. However, after controlling for
confounding, lower FN risk was associated with Other
race (RR = 0.828; p = .050), biopsy-proven benign lesions (RR = 0.927; p = .011), and
mass (RR = 0.921; p = .010) or asymmetry (RR = 0.854; p = .040); higher FN risk was associated
with architectural distortion (RR = 1.037; p < .001). Higher FP risk was associated with
BI-RADS density C (RR = 1.891; p < .001) and D (RR = 2.486; p < .001). Our results
demonstrate that subgroup analysis is important when evaluating mammogram classifier
performance, and that controlling for confounding between subgroups elucidates the
true associations between variables and model failure. These results can help
guide the development of future breast cancer detection models.
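The contrast the abstract draws between unadjusted and confounding-adjusted associations can be illustrated with a small sketch. This is not the authors' method: the paper uses multivariate logistic regression, whereas the sketch below swaps in Mantel-Haenszel stratification, a simpler way to adjust a risk ratio for a single categorical confounder. All counts, the `Stratum` type, and the function names are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stratum:
    """False-negative counts within one level of a confounder (hypothetical data)."""
    fn_in_group: int  # false negatives in the subgroup of interest
    n_in_group: int   # abnormal patches in the subgroup of interest
    fn_in_ref: int    # false negatives in the reference subgroup
    n_in_ref: int     # abnormal patches in the reference subgroup


def crude_risk_ratio(strata: Sequence[Stratum]) -> float:
    """Unadjusted RR: pool all strata, then compare the two FN risks."""
    fn_g = sum(s.fn_in_group for s in strata)
    n_g = sum(s.n_in_group for s in strata)
    fn_r = sum(s.fn_in_ref for s in strata)
    n_r = sum(s.n_in_ref for s in strata)
    return (fn_g / n_g) / (fn_r / n_r)


def mantel_haenszel_risk_ratio(strata: Sequence[Stratum]) -> float:
    """Mantel-Haenszel RR: a weighted combination of the per-stratum comparisons."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s.n_in_group + s.n_in_ref
        numerator += s.fn_in_group * s.n_in_ref / total
        denominator += s.fn_in_ref * s.n_in_group / total
    return numerator / denominator


# Made-up counts: within each stratum the FN risk is identical in both
# subgroups (0.10 and 0.30), but the subgroup of interest is concentrated
# in the high-risk stratum, so the pooled comparison looks unfavorable.
STRATA = [
    Stratum(fn_in_group=10, n_in_group=100, fn_in_ref=40, n_in_ref=400),
    Stratum(fn_in_group=120, n_in_group=400, fn_in_ref=30, n_in_ref=100),
]

if __name__ == "__main__":
    print(f"crude RR:    {crude_risk_ratio(STRATA):.3f}")            # 1.857
    print(f"adjusted RR: {mantel_haenszel_risk_ratio(STRATA):.3f}")  # 1.000
```

In this toy example the apparent 1.86-fold excess FN risk disappears once the confounder is accounted for, which mirrors the abstract's observation that many subgroup differences seen before adjustment did not persist afterwards. A regression model, as used in the paper, generalizes this idea to several covariates at once.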
Related papers
- A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds [49.34500499203579]
We create a variational autoencoder (VAE)-based model, DemoVAE, to decorrelate fMRI features from demographics.
We generate high-quality synthetic fMRI data based on user-supplied demographics.
arXiv Detail & Related papers (2024-05-13T17:49:20Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Improving Fairness of Automated Chest X-ray Diagnosis by Contrastive
Learning [19.948079693716075]
Our proposed AI model utilizes supervised contrastive learning to minimize bias in CXR diagnosis.
We evaluated the methods on two datasets: the Medical Imaging and Data Resource Center (MIDRC) dataset with 77,887 CXR images and the NIH Chest X-ray dataset with 112,120 CXR images.
arXiv Detail & Related papers (2024-01-25T20:03:57Z)
- Performance of externally validated machine learning models based on
histopathology images for the diagnosis, classification, prognosis, or
treatment outcome prediction in female breast cancer: A systematic review [0.5792122879054292]
externally validated machine learning models for diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer.
Three studies externally validated ML models for diagnosis, 4 for classification, 2 for prognosis, and 1 for both classification and prognosis.
Most studies used Convolutional Neural Networks and one used logistic regression algorithms.
arXiv Detail & Related papers (2023-12-09T18:27:56Z)
- A Two-Stage Generative Model with CycleGAN and Joint Diffusion for
MRI-based Brain Tumor Detection [41.454028276986946]
We propose a novel framework, the Two-Stage Generative Model (TSGM), to improve brain tumor detection and segmentation.
CycleGAN is trained on unpaired data to generate abnormal images from healthy images as a data prior.
VE-JP is implemented to reconstruct healthy images using synthetic paired abnormal images as a guide.
arXiv Detail & Related papers (2023-11-06T12:58:26Z)
- TotalSegmentator: robust segmentation of 104 anatomical structures in CT
images [48.50994220135258]
We present a deep learning segmentation model for body CT images.
The model can segment 104 anatomical structures relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning.
arXiv Detail & Related papers (2022-08-11T15:16:40Z)
- Automated SSIM Regression for Detection and Quantification of Motion
Artefacts in Brain MR Images [54.739076152240024]
Motion artefacts in magnetic resonance brain images are a crucial issue.
The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis.
An automated image quality assessment based on the structural similarity index (SSIM) regression has been proposed here.
arXiv Detail & Related papers (2022-06-14T10:16:54Z)
- StRegA: Unsupervised Anomaly Detection in Brain MRIs using a Compact
Context-encoding Variational Autoencoder [48.2010192865749]
Unsupervised anomaly detection (UAD) can learn a data distribution from an unlabelled dataset of healthy subjects and then be applied to detect out-of-distribution samples.
This research proposes a compact version of the "context-encoding" VAE (ceVAE) model, combined with pre- and post-processing steps, creating a UAD pipeline (StRegA).
The proposed pipeline achieved a Dice score of 0.642 ± 0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859 ± 0.112 while detecting artificially induced anomalies.
arXiv Detail & Related papers (2022-01-31T14:27:35Z)
- SCREENet: A Multi-view Deep Convolutional Neural Network for
Classification of High-resolution Synthetic Mammographic Screening Scans [3.8137985834223502]
We develop and evaluate a multi-view deep learning approach to the analysis of high-resolution synthetic mammograms.
We assess the effect on accuracy of image resolution and training set size.
arXiv Detail & Related papers (2020-09-18T00:12:33Z)
- Synthesizing lesions using contextual GANs improves breast cancer
classification on mammograms [0.4297070083645048]
We present a novel generative adversarial network (GAN) model for data augmentation that can realistically synthesize and remove lesions on mammograms.
With self-attention and semi-supervised learning components, the U-net-based architecture can generate high-resolution (256x256 px) outputs.
arXiv Detail & Related papers (2020-05-29T21:23:00Z)
- An interpretable classifier for high-resolution breast cancer screening
images utilizing weakly supervised localization [45.00998416720726]
We propose a framework to address the unique properties of medical images.
This model first uses a low-capacity, yet memory-efficient, network on the whole image to identify the most informative regions.
It then applies another higher-capacity network to collect details from chosen regions.
Finally, it employs a fusion module that aggregates global and local information to make a final prediction.
arXiv Detail & Related papers (2020-02-13T15:28:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.