MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of
Pancreatic Cancer
- URL: http://arxiv.org/abs/2112.13513v1
- Date: Mon, 27 Dec 2021 05:04:11 GMT
- Title: MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of
Pancreatic Cancer
- Authors: Tianyi Zhang, Yunlu Feng, Yu Zhao, Guangda Fan, Aiming Yang, Shangqin
Lyu, Peng Zhang, Fan Song, Chenbin Ma, Yangyang Sun, Youdan Feng, and
Guanglei Zhang
- Abstract summary: Pancreatic cancer is one of the most malignant cancers in the world; it deteriorates rapidly and has very high mortality.
We propose a hybrid high-performance deep learning model to enable the automated workflow.
A dataset of 4240 ROSE images is collected to evaluate the method in this unexplored field.
- Score: 5.604939010661757
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pancreatic cancer is one of the most malignant cancers in the world; it
deteriorates rapidly and has very high mortality. The rapid on-site evaluation
(ROSE) technique innovates the workflow by having on-site pathologists
immediately analyze the fast-stained cytopathological images, which enables
faster diagnosis in this time-pressured process. However, the wider adoption of ROSE
diagnosis has been hindered by the lack of experienced pathologists. To
overcome this problem, we propose a hybrid high-performance deep learning model
to enable an automated workflow, thus freeing up pathologists' valuable time. By
introducing the Transformer block into this field for the first time through our
multi-stage hybrid design, the spatial features generated by the convolutional
neural network (CNN) significantly enhance the global modeling of the
Transformer. Using multi-stage spatial features as global attention guidance,
this design combines the robustness of the CNN's inductive bias with the
sophisticated global modeling power of the Transformer. A dataset
of 4240 ROSE images is collected to evaluate the method in this unexplored
field. The proposed multi-stage hybrid Transformer (MSHT) achieves 95.68%
classification accuracy, distinctly higher than that of the state-of-the-art
models. Addressing the need for interpretability, MSHT outperforms
its counterparts with more accurate attention regions. The results demonstrate
that the MSHT can distinguish cancer samples accurately at an unprecedented
image scale, laying the foundation for deploying automatic decision systems and
enabling the expansion of ROSE in clinical practice. The code and records are
available at: https://github.com/sagizty/Multi-Stage-Hybrid-Transformer.
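The abstract describes the core MSHT design: multi-stage CNN feature maps are turned into guidance for the Transformer's global attention. The authors' actual implementation is in the repository linked above; the minimal PyTorch sketch below only illustrates one plausible reading of that idea, and every module and parameter name (StageGuidedBlock, MSHTSketch, embed_dim, the 2x2 guidance pooling, and so on) is an illustrative assumption rather than the published code.

```python
# Minimal sketch (assumed structure, not the authors' code): intermediate CNN
# stage features are projected into guidance tokens that bias the attention of
# subsequent Transformer blocks operating on patch tokens of the last stage.
import torch
import torch.nn as nn


class StageGuidedBlock(nn.Module):
    """Transformer encoder block whose attention is conditioned on one CNN stage."""

    def __init__(self, embed_dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim), nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )

    def forward(self, tokens: torch.Tensor, guidance: torch.Tensor) -> torch.Tensor:
        # Queries come from the running token sequence; keys/values are augmented
        # with guidance tokens from a CNN stage, so local spatial cues bias the
        # global attention map.
        kv = torch.cat([tokens, guidance], dim=1)
        attended, _ = self.attn(self.norm1(tokens), kv, kv)
        tokens = tokens + attended
        return tokens + self.mlp(self.norm2(tokens))


class MSHTSketch(nn.Module):
    """Toy CNN stem whose multi-stage features guide a small Transformer head."""

    def __init__(self, num_classes: int = 2, embed_dim: int = 256):
        super().__init__()
        chans = [32, 64, 128, 256]
        stages, in_ch = [], 3
        for out_ch in chans:
            stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)))
            in_ch = out_ch
        self.stages = nn.ModuleList(stages)
        # Project each stage's pooled feature map into a few guidance tokens.
        self.guide_proj = nn.ModuleList(nn.Linear(c, embed_dim) for c in chans)
        self.patch_proj = nn.Linear(chans[-1], embed_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.blocks = nn.ModuleList(StageGuidedBlock(embed_dim) for _ in chans)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        guides = []
        for stage, proj in zip(self.stages, self.guide_proj):
            x = stage(x)
            # Pool each stage map to a coarse 2x2 grid -> 4 guidance tokens.
            g = nn.functional.adaptive_avg_pool2d(x, 2).flatten(2).transpose(1, 2)
            guides.append(proj(g))
        # Patch tokens from the last stage, plus a [CLS] token for classification.
        tokens = self.patch_proj(x.flatten(2).transpose(1, 2))
        tokens = torch.cat([self.cls_token.expand(x.size(0), -1, -1), tokens], dim=1)
        for block, g in zip(self.blocks, guides):
            tokens = block(tokens, g)
        return self.head(tokens[:, 0])


if __name__ == "__main__":
    logits = MSHTSketch()(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 2])
```

Concatenating the guidance tokens into the keys/values, rather than replacing the patch tokens, keeps the token sequence intact while letting each CNN stage bias the attention map; this is one simple way to realize "multi-stage spatial features as global attention guidance", under the assumptions stated above.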
Related papers
- Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared to the state-of-the-art methods.
arXiv Detail & Related papers (2024-08-11T15:46:00Z) - Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data [3.8208601340697386]
Deep learning methods for classifying prostate cancer (PCa) in ultrasound images typically employ convolutional networks (CNNs) to detect cancer in small regions of interest (ROI) along a needle trace region.
Multi-scale approaches have sought to mitigate this issue by combining the awareness of transformers with a CNN feature extractor to detect cancer from multiple ROIs using multiple-instance learning (MIL)
We present a study of several image transformer architectures for both ROI-scale and multi-scale classification, and a comparison of the performance of CNNs and transformers for ultrasound-based prostate cancer classification.
arXiv Detail & Related papers (2024-03-27T03:39:57Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z) - Breast Ultrasound Tumor Classification Using a Hybrid Multitask
CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z) - On Sensitivity and Robustness of Normalization Schemes to Input
Distribution Shifts in Automatic MR Image Diagnosis [58.634791552376235]
Deep Learning (DL) models have achieved state-of-the-art performance in diagnosing multiple diseases using reconstructed images as input.
DL models are sensitive to varying artifacts, as these lead to changes in the input data distribution between the training and testing phases.
We propose to use other normalization techniques, such as Group Normalization and Layer Normalization, to inject robustness into model performance against varying image artifacts.
arXiv Detail & Related papers (2023-06-23T03:09:03Z) - MedViT: A Robust Vision Transformer for Generalized Medical Image
Classification [4.471084427623774]
We propose a robust yet efficient CNN-Transformer hybrid model which is equipped with the locality of CNNs and the global connectivity of vision Transformers.
Our proposed hybrid model demonstrates its high robustness and generalization ability compared to the state-of-the-art studies on a large-scale collection of standardized MedMNIST-2D datasets.
arXiv Detail & Related papers (2023-02-19T02:55:45Z) - Hierarchical Transformer for Survival Prediction Using Multimodality
Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z) - Shuffle Instances-based Vision Transformer for Pancreatic Cancer ROSE
Image Classification [5.960465634030524]
The rapid on-site evaluation (ROSE) technique can accelerate the diagnosis of pancreatic cancer.
The cancerous patterns vary significantly between different samples, making the computer diagnosis task extremely challenging.
We propose a shuffle instances-based Vision Transformer (SI-ViT) approach, which can reduce the perturbations and enhance the modeling.
arXiv Detail & Related papers (2022-08-14T11:37:04Z) - Texture Characterization of Histopathologic Images Using Ecological
Diversity Measures and Discrete Wavelet Transform [82.53597363161228]
This paper proposes a method for characterizing texture across histopathologic images with a considerable success rate.
It is possible to quantify the intrinsic properties of such images with promising accuracy on two HI datasets.
arXiv Detail & Related papers (2022-02-27T02:19:09Z) - Implementation of Convolutional Neural Network Architecture on 3D
Multiparametric Magnetic Resonance Imaging for Prostate Cancer Diagnosis [0.0]
We propose a novel deep learning approach for automatic classification of prostate lesions in magnetic resonance images.
Our framework achieved the classification performance with the area under a Receiver Operating Characteristic curve value of 0.87.
Our proposed framework reflects the potential of assisting medical image interpretation in prostate cancer and reducing unnecessary biopsies.
arXiv Detail & Related papers (2021-12-29T16:47:52Z) - Hybrid guiding: A multi-resolution refinement approach for semantic
segmentation of gigapixel histopathological images [0.7490318169877296]
We propose a cascaded convolutional neural network design, called H2G-Net, for semantic segmentation.
The design involves a detection stage using a patch-wise method, and a refinement stage using a convolutional autoencoder.
The best design achieved a Dice score of 0.933 on an independent test set of 90 WSIs.
arXiv Detail & Related papers (2021-12-07T02:31:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.