Cross-scale Multi-instance Learning for Pathological Image Diagnosis
- URL: http://arxiv.org/abs/2304.00216v3
- Date: Fri, 16 Feb 2024 06:08:28 GMT
- Title: Cross-scale Multi-instance Learning for Pathological Image Diagnosis
- Authors: Ruining Deng, Can Cui, Lucas W. Remedios, Shunxing Bao, R. Michael
Womick, Sophie Chiron, Jia Li, Joseph T. Roland, Ken S. Lau, Qi Liu, Keith T.
Wilson, Yaohong Wang, Lori A. Coburn, Bennett A. Landman, Yuankai Huo
- Abstract summary: Multi-instance learning (MIL) is a common solution for working with high resolution images by classifying bags of objects.
We propose a novel cross-scale MIL algorithm to explicitly aggregate inter-scale relationships into a single MIL network for pathological image diagnosis.
- Score: 20.519711186151635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analyzing high resolution whole slide images (WSIs) with regard to
information across multiple scales poses a significant challenge in digital
pathology. Multi-instance learning (MIL) is a common solution for working with
high resolution images by classifying bags of objects (i.e. sets of smaller
image patches). However, such processing is typically performed at a single
scale (e.g., 20x magnification) of WSIs, disregarding the vital inter-scale
information that is key to diagnoses by human pathologists. In this study, we
propose a novel cross-scale MIL algorithm to explicitly aggregate inter-scale
relationships into a single MIL network for pathological image diagnosis. The
contribution of this paper is three-fold: (1) A novel cross-scale MIL (CS-MIL)
algorithm that integrates the multi-scale information and the inter-scale
relationships is proposed; (2) A toy dataset with scale-specific morphological
features is created and released to examine and visualize differential
cross-scale attention; (3) Superior performance on both in-house and public
datasets is demonstrated by our simple cross-scale MIL strategy. The official
implementation is publicly available at https://github.com/hrlblab/CS-MIL.
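The aggregation scheme described in the abstract (attention pooling over patch instances within each magnification scale, followed by attention over the scale-level embeddings) can be sketched in a few lines. The following is a minimal NumPy illustration of that idea only, not the official CS-MIL implementation; the weight vectors `w_inst`, `w_scale`, and `w_cls` stand in for learned parameters and are random here.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_scale_mil(bags, w_inst, w_scale, w_cls):
    """bags: list of (n_patches, d) feature arrays, one per magnification scale."""
    scale_embs = []
    for feats in bags:
        a = softmax(feats @ w_inst)      # (n,) instance attention within one scale
        scale_embs.append(a @ feats)     # (d,) attention-pooled scale embedding
    S = np.stack(scale_embs)             # (n_scales, d)
    b = softmax(S @ w_scale)             # (n_scales,) cross-scale attention
    slide_emb = b @ S                    # (d,) slide-level embedding
    probs = softmax(slide_emb @ w_cls)   # (n_classes,) diagnosis probabilities
    return probs, b

# Toy slide: three scales (e.g. 5x/10x/20x) with different patch counts.
d, n_classes = 16, 2
bags = [rng.standard_normal((8, d)),
        rng.standard_normal((5, d)),
        rng.standard_normal((3, d))]
w_inst = rng.standard_normal(d)
w_scale = rng.standard_normal(d)
w_cls = rng.standard_normal((d, n_classes))
probs, scale_attn = cross_scale_mil(bags, w_inst, w_scale, w_cls)
```

The returned `scale_attn` is the differential cross-scale attention the paper visualizes: it indicates how much each magnification contributed to the slide-level prediction.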
Related papers
- MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation [3.64388407705261]
We propose a Multi-Scale Vision Mamba UNet model for medical image segmentation, termed MSVM-UNet.
Specifically, by introducing multi-scale convolutions in the VSS blocks, we can more effectively capture and aggregate multi-scale feature representations from the hierarchical features of the VMamba encoder.
arXiv Detail & Related papers (2024-08-25T06:20:28Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Medical Image Segmentation [49.57907601086494]
Medical image segmentation plays a crucial role in computer-aided diagnosis.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework (DEC-Seg) for semi-supervised medical image segmentation.
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- M$^{2}$SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation [73.10707675345253]
We propose a general multi-scale in multi-scale subtraction network (M$^{2}$SNet) to handle diverse segmentation tasks on medical images.
Our method performs favorably against most state-of-the-art methods under different evaluation metrics on eleven datasets of four different medical image segmentation tasks.
arXiv Detail & Related papers (2023-03-20T06:26:49Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, achieving the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representations of giga-pixel whole slide pathology images (WSIs) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels [16.865729758055448]
We propose Self-ViT-MIL, a novel approach for classifying and localizing cancerous areas based on slide-level annotations.
Self-ViT-MIL surpasses existing state-of-the-art MIL-based approaches in terms of accuracy and area under the curve.
arXiv Detail & Related papers (2022-10-17T12:43:41Z)
- Cross-scale Attention Guided Multi-instance Learning for Crohn's Disease Diagnosis with Pathological Images [22.98849180654734]
Multi-instance learning (MIL) is widely used in the computer-aided interpretation of pathological Whole Slide Images (WSIs).
We propose a novel cross-scale attention mechanism to explicitly aggregate inter-scale interactions into a single MIL network for Crohn's Disease (CD) diagnosis.
Our approach achieved a superior Area under the Curve (AUC) score of 0.8924 compared with baseline models.
arXiv Detail & Related papers (2022-08-15T16:39:34Z)
- Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images [4.928363812223965]
We propose ZoomMIL, a method that learns to perform multi-level zooming in an end-to-end manner.
The proposed method outperforms the state-of-the-art MIL methods in WSI classification on two large datasets.
arXiv Detail & Related papers (2022-04-26T17:20:50Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- MICDIR: Multi-scale Inverse-consistent Deformable Image Registration using UNetMSS with Self-Constructing Graph Latent [0.0]
This paper extends the Voxelmorph approach in three different ways.
To improve performance for both small and large deformations, the model is supervised at multiple resolutions via a multi-scale UNet.
On the task of registration of brain MRIs, the proposed method achieved significant improvements over ANTs and VoxelMorph.
arXiv Detail & Related papers (2022-03-08T18:07:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.