Related papers: Multivariate Analysis on Performance Gaps of Artificial Intelligence Models in Screening Mammography

Multivariate Analysis on Performance Gaps of Artificial Intelligence Models in Screening Mammography

URL: http://arxiv.org/abs/2305.04422v3
Date: Thu, 19 Oct 2023 18:03:11 GMT
Title: Multivariate Analysis on Performance Gaps of Artificial Intelligence Models in Screening Mammography
Authors: Linglin Zhang, Beatrice Brown-Mulry, Vineela Nalla, InChan Hwang, Judy Wawira Gichoya, Aimilia Gastounioti, Imon Banerjee, Laleh Seyyed-Kalantari, MinJae Woo, Hari Trivedi
Abstract summary: Deep learning models for abnormality classification can perform well in screening mammography. The demographic, imaging, and clinical characteristics associated with increased risk of model failure remain unclear. We assessed model performance by subgroups defined by age, race, pathologic outcome, tissue density, and imaging characteristics.
Score: 4.123006816939975
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although deep learning models for abnormality classification can perform well in screening mammography, the demographic, imaging, and clinical characteristics associated with increased risk of model failure remain unclear. This retrospective study uses the Emory BrEast Imaging Dataset(EMBED) containing mammograms from 115931 patients imaged at Emory Healthcare between 2013-2020, with BI-RADS assessment, region of interest coordinates for abnormalities, imaging features, pathologic outcomes, and patient demographics. Multiple deep learning models were trained to distinguish between abnormal tissue patches and randomly selected normal tissue patches from screening mammograms. We assessed model performance by subgroups defined by age, race, pathologic outcome, tissue density, and imaging characteristics and investigated their associations with false negatives (FN) and false positives (FP). We also performed multivariate logistic regression to control for confounding between subgroups. The top-performing model, ResNet152V2, achieved accuracy of 92.6%(95%CI=92.0-93.2%), and AUC 0.975(95%CI=0.972-0.978). Before controlling for confounding, nearly all subgroups showed statistically significant differences in model performance. However, after controlling for confounding, we found lower FN risk associates with Other race(RR=0.828;p=.050), biopsy-proven benign lesions(RR=0.927;p=.011), and mass(RR=0.921;p=.010) or asymmetry(RR=0.854;p=.040); higher FN risk associates with architectural distortion (RR=1.037;p<.001). Higher FP risk associates to BI-RADS density C(RR=1.891;p<.001) and D(RR=2.486;p<.001). Our results demonstrate subgroup analysis is important in mammogram classifier performance evaluation, and controlling for confounding between subgroups elucidates the true associations between variables and model failure. These results can help guide developing future breast cancer detection models.

Related papers

Subgroup Performance of a Commercial Digital Breast Tomosynthesis Model for Breast Cancer Detection [5.089670339445636]
This study presents a granular evaluation of the Lunit INSIGHT model on a large retrospective cohort of 163,449 screening mammography exams. Performance was found to be robust across demographics, but cases with non-invasive cancers were associated with significantly lower performance.
arXiv Detail & Related papers (2025-03-17T17:17:36Z)
Machine Learning-Based Model for Postoperative Stroke Prediction in Coronary Artery Disease [0.0]
This study aims to develop and evaluate a sophisticated machine learning prediction model to assess postoperative stroke risk. The dataset has 70% training and 30% test. Numerical values were normalized, whereas categorical variables were one-hot encoded. Logistic Regression, XGBoost, SVM, and CatBoost were employed for predictive modeling, and SHAP analysis assessed stroke risk for each variable.
arXiv Detail & Related papers (2025-03-15T02:50:32Z)
Patch-Based and Non-Patch-Based inputs Comparison into Deep Neural Models: Application for the Segmentation of Retinal Diseases on Optical Coherence Tomography Volumes [0.3749861135832073]
Approaching 170 million persons wide-ranging have been spotted with AMD, a figure anticipated to rise to 288 million by 2040. Deep learning networks have shown promising results in both image and pixel-level 2D scan classification. Highest score for a patch-based model in the DSC metric was 0.88 in comparison to the score of 0.71 for the same model in non-patch-based for SRF fluid segmentation.
arXiv Detail & Related papers (2025-01-22T10:22:08Z)
A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis [58.85247337449624]
We propose a knowledge-enhanced vision-language pre-training approach that integrates disease knowledge into the alignment within hierarchical semantic groups. KEEP achieves state-of-the-art performance in zero-shot cancer diagnostic tasks.
arXiv Detail & Related papers (2024-12-17T17:45:21Z)
CRTRE: Causal Rule Generation with Target Trial Emulation Framework [47.2836994469923]
We introduce a novel method called causal rule generation with target trial emulation framework (CRTRE) CRTRE applies randomize trial design principles to estimate the causal effect of association rules. We then incorporate such association rules for the downstream applications such as prediction of disease onsets.
arXiv Detail & Related papers (2024-11-10T02:40:06Z)
Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge [44.76736949127792]
We describe the design and results from the BraTS 2023 Intracranial Meningioma Challenge.<n>The BraTS Meningioma Challenge differed from prior BraTS Glioma challenges in that it focused on meningiomas.<n>The top ranked team had a lesion-wise median dice similarity coefficient (DSC) of 0.976, 0.976, and 0.964 for enhancing tumor, tumor core, and whole tumor.
arXiv Detail & Related papers (2024-05-16T03:23:57Z)
A Demographic-Conditioned Variational Autoencoder for fMRI Distribution Sampling and Removal of Confounds [49.34500499203579]
We create a variational autoencoder (VAE)-based model, DemoVAE, to decorrelate fMRI features from demographics. We generate high-quality synthetic fMRI data based on user-supplied demographics.
arXiv Detail & Related papers (2024-05-13T17:49:20Z)
Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
Improving Fairness of Automated Chest X-ray Diagnosis by Contrastive Learning [19.948079693716075]
Our proposed AI model utilizes supervised contrastive learning to minimize bias in CXR diagnosis. We evaluated the methods on two datasets: the Medical Imaging and Data Resource Center (MIDRC) dataset with 77,887 CXR images and the NIH Chest X-ray dataset with 112,120 CXR images.
arXiv Detail & Related papers (2024-01-25T20:03:57Z)
Performance of externally validated machine learning models based on histopathology images for the diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer: A systematic review [0.5792122879054292]
externally validated machine learning models for diagnosis, classification, prognosis, or treatment outcome prediction in female breast cancer. Three studies externally validated ML models for diagnosis, 4 for classification, 2 for prognosis, and 1 for both classification and prognosis. Most studies used Convolutional Neural Networks and one used logistic regression algorithms.
arXiv Detail & Related papers (2023-12-09T18:27:56Z)
A Two-Stage Generative Model with CycleGAN and Joint Diffusion for MRI-based Brain Tumor Detection [41.454028276986946]
We propose a novel framework Two-Stage Generative Model (TSGM) to improve brain tumor detection and segmentation. CycleGAN is trained on unpaired data to generate abnormal images from healthy images as data prior. VE-JP is implemented to reconstruct healthy images using synthetic paired abnormal images as a guide.
arXiv Detail & Related papers (2023-11-06T12:58:26Z)
TotalSegmentator: robust segmentation of 104 anatomical structures in CT images [48.50994220135258]
We present a deep learning segmentation model for body CT images. The model can segment 104 anatomical structures relevant for use cases such as organ volumetry, disease characterization, and surgical or radiotherapy planning.
arXiv Detail & Related papers (2022-08-11T15:16:40Z)
Automated SSIM Regression for Detection and Quantification of Motion Artefacts in Brain MR Images [54.739076152240024]
Motion artefacts in magnetic resonance brain images are a crucial issue. The assessment of MR image quality is fundamental before proceeding with the clinical diagnosis. An automated image quality assessment based on the structural similarity index (SSIM) regression has been proposed here.
arXiv Detail & Related papers (2022-06-14T10:16:54Z)
StRegA: Unsupervised Anomaly Detection in Brain MRIs using a Compact Context-encoding Variational Autoencoder [48.2010192865749]
Unsupervised anomaly detection (UAD) can learn a data distribution from an unlabelled dataset of healthy subjects and then be applied to detect out of distribution samples. This research proposes a compact version of the "context-encoding" VAE (ceVAE) model, combined with pre and post-processing steps, creating a UAD pipeline (StRegA) The proposed pipeline achieved a Dice score of 0.642$pm$0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859$pm$0.112 while detecting artificially induced anomalies.
arXiv Detail & Related papers (2022-01-31T14:27:35Z)
SCREENet: A Multi-view Deep Convolutional Neural Network for Classification of High-resolution Synthetic Mammographic Screening Scans [3.8137985834223502]
We develop and evaluate a multi-view deep learning approach to the analysis of high-resolution synthetic mammograms. We assess the effect on accuracy of image resolution and training set size.
arXiv Detail & Related papers (2020-09-18T00:12:33Z)
Synthesizing lesions using contextual GANs improves breast cancer classification on mammograms [0.4297070083645048]
We present a novel generative adversarial network (GAN) model for data augmentation that can realistically synthesize and remove lesions on mammograms. With self-attention and semi-supervised learning components, the U-net-based architecture can generate high resolution (256x256px) outputs.
arXiv Detail & Related papers (2020-05-29T21:23:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.