An Efficient Cervical Whole Slide Image Analysis Framework Based on
Multi-scale Semantic and Spatial Features using Deep Learning
- URL: http://arxiv.org/abs/2106.15113v1
- Date: Tue, 29 Jun 2021 06:24:55 GMT
- Title: An Efficient Cervical Whole Slide Image Analysis Framework Based on
Multi-scale Semantic and Spatial Features using Deep Learning
- Authors: Ziquan Wei, Shenghua Cheng, Xiuli Liu, Shaoqun Zeng
- Abstract summary: This study designs a novel inline connection network (InCNet) by enriching the multi-scale connectivity to build the lightweight model named You Only Look Cytopathology Once (YOLCO).
The proposed model allows the input size to be enlarged to the megapixel scale, so that the WSI can be stitched without any overlap and with fewer average repeats.
With a Transformer classifying the integrated multi-scale multi-task features, the experimental results show an AUC score of $0.872$, which is better than the best conventional method in WSI classification while running $2.51\times$ faster.
- Score: 2.7218168309244652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digital gigapixel whole slide images (WSIs) are widely used in
clinical diagnosis, and automated WSI analysis is key for computer-aided
diagnosis. Currently, analyzing the integrated descriptor of probabilities or
feature maps from massive local patches encoded by a ResNet classifier is the
main approach to WSI-level prediction. Feature representations of the sparse
and tiny lesion cells in cervical slides, however, remain challenging for the
under-promoted upstream encoders, while the unused spatial representations of
cervical cells are available features that could supplement the semantic
analysis. In addition, patch sampling with overlap and repetitive processing
incurs inefficiency and unpredictable side effects. This study designs a novel
inline connection network (InCNet) by enriching the multi-scale connectivity to
build the lightweight model named You Only Look Cytopathology Once (YOLCO) with
the additional supervision of spatial information. The proposed model allows
the input size to be enlarged to the megapixel scale, so that the WSI can be
stitched without any overlap, with the average repeats decreased from
$10^3\sim10^4$ to $10^1\sim10^2$ for collecting features and predictions at
two scales. With a Transformer classifying the integrated multi-scale
multi-task features, the experimental results show an AUC score of $0.872$,
which is better than the best conventional method in WSI classification while
running $2.51\times$ faster, on multicohort datasets of 2,019 slides from four
scanning devices.
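
The drop in repeats described in the abstract can be illustrated with simple tiling arithmetic. The sketch below counts how many tiles are needed to cover a slide, once with small overlapping patches and once with megapixel-scale tiles and no overlap. It assumes that "repeats" means the number of model forward passes per slide. The function name `tile_count`, the slide dimensions, and the tile and stride sizes are hypothetical choices for illustration, not values taken from the paper.

```python
import math


def tile_count(width, height, tile, stride=None):
    """Count the tile positions needed to cover a width x height image.

    stride defaults to the tile size, i.e. no overlap between tiles.
    The last tile in each direction may extend past the image border,
    which would require padding in practice.
    """
    stride = tile if stride is None else stride

    def along(extent):
        # One tile is enough if the image fits inside it.
        if extent <= tile:
            return 1
        # Otherwise: first tile, plus enough strides to reach the far edge.
        return math.ceil((extent - tile) / stride) + 1

    return along(width) * along(height)


# Hypothetical slide size in pixels (assumption, not from the paper).
SLIDE_W, SLIDE_H = 40_000, 40_000

# Small patches with 50% overlap (hypothetical conventional setting).
small_overlap = tile_count(SLIDE_W, SLIDE_H, tile=512, stride=256)

# Megapixel-scale tiles without overlap (hypothetical large-input setting).
large_no_overlap = tile_count(SLIDE_W, SLIDE_H, tile=4096)

print(small_overlap, large_no_overlap)
```

Under these assumed sizes the counts come out at roughly $10^4$ and $10^2$ respectively, the same orders of magnitude the abstract reports; the exact figures depend entirely on the slide and tile sizes chosen.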
Related papers
- A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
(2024-02-09T05:05:28Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
(2023-10-22T02:27:02Z)
- Evolutionary Computation in Action: Feature Selection for Deep Embedding Spaces of Gigapixel Pathology Images [0.6037276428689636]
We introduce a new evolutionary approach for WSI representation based on large-scale multi-objective optimization (LSMOP) of deep embeddings.
We validate the proposed schemes using The Cancer Genome Atlas (TCGA) images in terms of WSI representation, classification accuracy, and feature quality.
The proposed evolutionary algorithm finds a very compact feature vector to represent a WSI with 8% higher accuracy compared to the codes provided by the state-of-the-art methods.
(2023-03-02T03:36:15Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
(2022-11-29T23:47:56Z)
- Video-TransUNet: Temporally Blended Vision Transformer for CT VFSS Instance Segmentation [11.575821326313607]
We propose Video-TransUNet, a deep architecture for segmentation in medical CT videos constructed by integrating temporal feature blending into the TransUNet deep learning framework.
In particular, our approach amalgamates strong frame representation via a ResNet CNN backbone, multi-frame feature blending via a Temporal Context Module, and reconstructive capabilities for multiple targets via a UNet-based convolutional-deconvolutional architecture with multiple heads.
(2022-08-17T14:28:58Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
(2022-04-19T10:41:09Z)
- Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
(2022-03-25T01:24:24Z)
- Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images [8.416553728391309]
We propose a novel two-stage approach to analyze whole slide images (WSIs).
First, we extract a set of representative patches (called mosaic) from a WSI.
Each patch of a mosaic is encoded to a feature vector using a deep network.
In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability.
(2021-06-11T21:59:02Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
(2021-03-22T20:36:34Z)
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
(2020-08-19T13:13:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.