CIMIL-CRC: a clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H&E stained images
- URL: http://arxiv.org/abs/2401.16131v2
- Date: Tue, 12 Nov 2024 07:55:34 GMT
- Title: CIMIL-CRC: a clinically-informed multiple instance learning framework for patient-level colorectal cancer molecular subtypes classification from H&E stained images
- Authors: Hadar Hezi, Matan Gelber, Alexander Balabanov, Yosef E. Maruvka, Moti Freiman
- Abstract summary: We introduce 'CIMIL-CRC', a framework that solves the MSI/MSS MIL problem by efficiently combining a pre-trained feature extraction model with principal component analysis (PCA) to aggregate information from all patches.
We assessed our CIMIL-CRC method using the average area under the curve (AUC) from a 5-fold cross-validation experimental setup for model development on the TCGA-CRC-DX cohort.
- Score: 42.771819949806655
- Abstract: Treatment approaches for colorectal cancer (CRC) are highly dependent on the molecular subtype, as immunotherapy has shown efficacy in cases with microsatellite instability (MSI) but is ineffective for the microsatellite stable (MSS) subtype. There is promising potential in utilizing deep neural networks (DNNs) to automate the differentiation of CRC subtypes by analyzing Hematoxylin and Eosin (H&E) stained whole-slide images (WSIs). Due to the extensive size of WSIs, Multiple Instance Learning (MIL) techniques are typically explored. However, existing MIL methods focus on identifying the most representative image patches for classification, which may result in the loss of critical information. Additionally, these methods often overlook clinically relevant information, such as the tendency of MSI tumors to occur predominantly in the proximal (right-sided) colon. We introduce 'CIMIL-CRC', a DNN framework that: 1) solves the MSI/MSS MIL problem by efficiently combining a pre-trained feature extraction model with principal component analysis (PCA) to aggregate information from all patches, and 2) integrates clinical priors, particularly the tumor location within the colon, into the model to enhance patient-level classification accuracy. We assessed our CIMIL-CRC method using the average area under the curve (AUC) from a 5-fold cross-validation experimental setup for model development on the TCGA-CRC-DX cohort, comparing it with a baseline patch-level classification approach, a MIL-only approach, and a clinically-informed patch-level classification approach. Our CIMIL-CRC outperformed all methods (AUROC: $0.92\pm0.002$ (95% CI 0.91-0.92), vs. $0.79\pm0.02$ (95% CI 0.76-0.82), $0.86\pm0.01$ (95% CI 0.85-0.88), and $0.87\pm0.01$ (95% CI 0.86-0.88), respectively). The improvement was statistically significant.
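The abstract describes two ingredients: PCA over the features of all patches, and a clinical prior on tumor location. The sketch below is one plausible reading of that idea, not the authors' implementation. It assumes patch features have already been extracted by some pre-trained model, summarises each patient's bag as the mean feature vector plus the top principal directions, and appends a binary "proximal colon" flag. The function names (`pca_aggregate`, `patient_vector`), the number of components, and the choice to concatenate the flag are all hypothetical.

```python
import numpy as np


def pca_aggregate(patch_features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Summarise an (n_patches, n_features) bag as one fixed-length vector.

    Every patch contributes: the summary is the mean feature vector followed
    by the top principal directions of the centered bag. (Hypothetical
    aggregation scheme, chosen only for illustration.)
    """
    n_patches, n_features = patch_features.shape
    if min(n_patches, n_features) < n_components:
        raise ValueError("need at least n_components patches and features")
    mean = patch_features.mean(axis=0)
    centered = patch_features - mean
    # Rows of vt are the principal directions in feature space.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Principal directions are defined only up to sign; make the
    # largest-magnitude entry of each one positive for reproducibility.
    idx = np.argmax(np.abs(components), axis=1)
    signs = np.sign(components[np.arange(n_components), idx])
    components = components * signs[:, None]
    return np.concatenate([mean, components.ravel()])


def patient_vector(patch_features: np.ndarray, tumor_is_proximal: bool,
                   n_components: int = 2) -> np.ndarray:
    """Append the clinical prior (assumed here to be a 0/1 flag) to the summary."""
    summary = pca_aggregate(patch_features, n_components)
    return np.concatenate([summary, [1.0 if tumor_is_proximal else 0.0]])


# Two synthetic patients with different numbers of patches and 8-dim features.
rng = np.random.default_rng(0)
bag_a = rng.normal(size=(500, 8))
bag_b = rng.normal(size=(120, 8))
vec_a = patient_vector(bag_a, tumor_is_proximal=True)
vec_b = patient_vector(bag_b, tumor_is_proximal=False)
print(vec_a.shape, vec_b.shape)
```

Both patients map to vectors of the same length (8 mean values, 2 x 8 component values, and 1 location flag) regardless of how many patches their slides contain, so any standard patient-level classifier could be trained on them and scored with AUC in a 5-fold cross-validation loop.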
Related papers
- Multi-stage intermediate fusion for multimodal learning to classify non-small cell lung cancer subtypes from CT and PET [0.43498389175652047]
This study presents a multi-stage intermediate fusion approach to classify NSCLC subtypes from CT and PET images.
Our method integrates the two modalities at different stages of feature extraction, using voxel-wise fusion to exploit complementary information.
Our results demonstrate that the proposed method outperforms all alternatives across key metrics, with an accuracy and AUC equal to 0.724 and 0.681, respectively.
arXiv Detail & Related papers (2025-01-21T12:10:00Z) - Cancer-Net PCa-Seg: Benchmarking Deep Learning Models for Prostate Cancer Segmentation Using Synthetic Correlated Diffusion Imaging [65.83291923029985]
Prostate cancer (PCa) is the most prevalent cancer among men in the United States, accounting for nearly 300,000 cases, 29% of all diagnoses and 35,000 total deaths in 2024.
Traditional screening methods such as prostate-specific antigen (PSA) testing and magnetic resonance imaging (MRI) have been pivotal in diagnosis, but have faced limitations in specificity and generalizability.
We employ several state-of-the-art deep learning models, including U-Net, SegResNet, Swin UNETR, Attention U-Net, and LightM-UNet, to segment PCa lesions from a 200 CDI$
arXiv Detail & Related papers (2025-01-15T22:23:41Z) - Multi-modal Medical Image Fusion For Non-Small Cell Lung Cancer Classification [7.002657345547741]
Non-small cell lung cancer (NSCLC) is a predominant cause of cancer mortality worldwide.
In this paper, we introduce an innovative integration of multi-modal data, synthesizing fused medical imaging (CT and PET scans) with clinical health records and genomic data.
Our research surpasses existing approaches, as evidenced by a substantial enhancement in NSCLC detection and classification precision.
arXiv Detail & Related papers (2024-09-27T12:59:29Z) - Distilling High Diagnostic Value Patches for Whole Slide Image Classification Using Attention Mechanism [11.920941310806558]
Multiple Instance Learning (MIL) has garnered widespread attention in the field of Whole Slide Image (WSI) classification.
A drawback of bag-level MIL methods is the incorporation of more redundant patches, leading to interference.
We developed an attention-based feature distillation multi-instance learning (AFD-MIL) approach to extract patches with high diagnostic value.
arXiv Detail & Related papers (2024-07-29T09:14:21Z) - CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images [3.1118773046912382]
We propose the Context-Aware Multiple Instance Learning (CAMIL) architecture for cancer diagnosis.
CAMIL incorporates neighbor-constrained attention to consider dependencies among tiles within a Whole Slide Image (WSI) and integrates contextual constraints as prior knowledge.
We evaluate CAMIL on subtyping non-small cell lung cancer (TCGA-NSCLC) and detecting lymph node metastasis (CAMELYON16 and CAMELYON17), achieving test AUCs of 97.5%, 95.9%, and 88.1%, respectively.
arXiv Detail & Related papers (2023-05-09T10:06:37Z) - Exploring the Interplay Between Colorectal Cancer Subtypes Genomic Variants and Cellular Morphology: A Deep-Learning Approach [4.077787659104316]
We trained CNN models for CRC subtype classification that account for potential correlation between genomic variations within CRC subtypes and their corresponding cellular morphology patterns.
We assessed the interplay between CRC subtypes' genomic variations and cellular morphology patterns by evaluating the CRC subtype classification accuracy of the different models.
arXiv Detail & Related papers (2023-03-26T12:13:29Z) - Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification [52.77024349608834]
This study investigates the chest radiograph (CXR) classification performance of vision transformers (ViTs) and the interpretability of attention-based saliency.
ViTs were fine-tuned for lung disease classification using four public data sets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData.
ViTs achieved CXR classification AUCs comparable to those of state-of-the-art CNNs.
arXiv Detail & Related papers (2023-03-03T12:05:41Z) - Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operating characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - Multi-Scale Input Strategies for Medulloblastoma Tumor Classification using Deep Transfer Learning [59.30734371401316]
Medulloblastoma is the most common malignant brain cancer among children.
CNN has shown promising results for MB subtype classification.
We study the impact of tile size and input strategy.
arXiv Detail & Related papers (2021-09-14T09:42:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.