Instance-based Vision Transformer for Subtyping of Papillary Renal Cell
Carcinoma in Histopathological Image
- URL: http://arxiv.org/abs/2106.12265v1
- Date: Wed, 23 Jun 2021 09:42:49 GMT
- Title: Instance-based Vision Transformer for Subtyping of Papillary Renal Cell
Carcinoma in Histopathological Image
- Authors: Zeyu Gao, Bangyang Hong, Xianli Zhang, Yang Li, Chang Jia, Jialun Wu,
Chunbao Wang, Deyu Meng, Chen Li
- Abstract summary: Histological subtype of papillary (p) renal cell carcinoma (RCC), type 1 vs. type 2, is an essential prognostic factor.
This paper proposes a novel instance-based Vision Transformer (i-ViT) to learn robust representations of histological images for the pRCC subtyping task.
Experimental results show that the proposed method outperforms existing CNN-based models by a significant margin.
- Score: 31.00452985964065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Histological subtype of papillary (p) renal cell carcinoma (RCC), type 1 vs.
type 2, is an essential prognostic factor. The two subtypes of pRCC have a
similar pattern, i.e., the papillary architecture, yet some subtle differences,
including cellular and cell-layer level patterns. However, these cellular and
cell-layer level patterns can hardly be captured by existing CNN-based models
in large-size histopathological images, which makes it difficult to apply
such models directly to this fine-grained classification task. This
paper proposes a novel instance-based Vision Transformer (i-ViT) to learn
robust representations of histopathological images for the pRCC subtyping task
by extracting finer features from instance patches (by cropping around
segmented nuclei and assigning predicted grades). The proposed i-ViT takes
top-K instances as input and aggregates them for capturing both the cellular
and cell-layer level patterns by a position-embedding layer, a grade-embedding
layer, and a multi-head multi-layer self-attention module. To evaluate the
performance of the proposed framework, experienced pathologists were invited to
select 1162 regions of interest from 171 whole slide images of type 1 and
type 2 pRCC. Experimental results show that the proposed method outperforms
existing CNN-based models by a significant margin.
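The abstract describes the core architecture only at a high level, so the following is a minimal PyTorch sketch of how an instance-based transformer of this kind could be wired together: nucleus-centered instance patches are encoded into tokens, grade and position embeddings are added, and a multi-head multi-layer self-attention encoder aggregates the top-K instances into a single ROI-level prediction. The encoder design, embedding sizes, grade count, and top-K value are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of an instance-based ViT in the spirit of i-ViT.
# All module choices, dimensions, and the number of nuclear grades are
# illustrative assumptions; this is NOT the authors' implementation.
import torch
import torch.nn as nn


class InstanceViT(nn.Module):
    def __init__(self, embed_dim=256, num_heads=8, num_layers=4,
                 num_grades=4, top_k=128, num_classes=2):
        super().__init__()
        # Small CNN encoder mapping each nucleus-centered patch to a token.
        self.patch_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Learned embeddings for instance position and predicted grade.
        self.pos_embed = nn.Embedding(top_k + 1, embed_dim)  # +1 for the class token
        self.grade_embed = nn.Embedding(num_grades, embed_dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        # Multi-head, multi-layer self-attention aggregator.
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.head = nn.Linear(embed_dim, num_classes)  # type 1 vs. type 2

    def forward(self, patches, grades):
        # patches: (B, K, 3, H, W) top-K nucleus-centered crops from one ROI
        # grades:  (B, K) integer grade predicted for each instance
        # Assumes K equals the top_k used at construction time.
        b, k = patches.shape[:2]
        tokens = self.patch_encoder(patches.flatten(0, 1)).view(b, k, -1)
        tokens = tokens + self.grade_embed(grades)
        cls = self.cls_token.expand(b, -1, -1)
        x = torch.cat([cls, tokens], dim=1)
        x = x + self.pos_embed(torch.arange(k + 1, device=x.device))
        x = self.encoder(x)
        return self.head(x[:, 0])  # classify from the class token


# Usage example: a batch of 2 ROIs, each with 128 instance patches of 64x64 px.
model = InstanceViT()
patches = torch.randn(2, 128, 3, 64, 64)
grades = torch.randint(0, 4, (2, 128))
logits = model(patches, grades)  # shape (2, 2)
```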
Related papers
- Benchmarking Hierarchical Image Pyramid Transformer for the classification of colon biopsies and polyps in histopathology images [1.0007063839516088]
Recent advances in self-supervised learning have shown that highly descriptive image representations can be learned without the need for annotations.
We investigate the application of the recent Hierarchical Image Pyramid Transformer (HIPT) model for the specific task of classification of colorectal biopsies and polyps.
arXiv Detail & Related papers (2024-05-24T00:59:30Z)
- Dual-channel Prototype Network for few-shot Classification of Pathological Images [0.7562219957261347]
We introduce the Dual-channel Prototype Network (DCPN) to tackle the challenge of classifying pathological images with limited samples.
DCPN augments the Pyramid Vision Transformer framework for few-shot classification via self-supervised learning and integrates it with convolutional neural networks.
This combination forms a dual-channel architecture that extracts multi-scale, highly precise pathological features.
arXiv Detail & Related papers (2023-11-14T03:03:21Z)
- Multi-stream Cell Segmentation with Low-level Cues for Multi-modality Images [66.79688768141814]
We develop an automatic cell classification pipeline to label microscopy images.
We then train a classification model based on the category labels.
We deploy two types of segmentation models to segment cells with roundish and irregular shapes.
arXiv Detail & Related papers (2023-10-22T08:11:08Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- A Novel Vision Transformer with Residual in Self-attention for Biomedical Image Classification [8.92307560991779]
This article presents a novel multi-head self-attention framework for the vision transformer (ViT).
The proposed method uses residual connections to accumulate the best attention output in each multi-head attention block.
The results show a significant improvement over the traditional ViT and other convolution-based state-of-the-art classification models.
arXiv Detail & Related papers (2023-06-02T15:06:14Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, to the extent that it achieves the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- Dual Attention Model with Reinforcement Learning for Classification of Histology Whole-Slide Images [8.404881822414898]
Digital whole slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data.
We propose a novel dual attention approach, consisting of two main components, both inspired by the visual examination process of a pathologist.
We show that the proposed model achieves performance better than or comparable to the state-of-the-art methods while processing less than 10% of the WSI at the highest magnification.
arXiv Detail & Related papers (2023-02-19T22:26:25Z)
- Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders [72.15087604017441]
Category-selectivity describes the observation that certain spatially localized areas of the cerebral cortex tend to respond robustly and selectively to stimuli from specific limited categories.
We leverage the newly introduced Topographic Variational Autoencoder to model the emergence of such localized category-selectivity in an unsupervised manner.
We show preliminary results suggesting that our model yields a nested spatial hierarchy of increasingly abstract categories, analogous to observations from the human ventral temporal cortex.
arXiv Detail & Related papers (2021-10-25T11:37:41Z)
- Multi-Scale Input Strategies for Medulloblastoma Tumor Classification using Deep Transfer Learning [59.30734371401316]
Medulloblastoma is the most common malignant brain cancer among children.
CNNs have shown promising results for MB subtype classification.
We study the impact of tile size and input strategy.
arXiv Detail & Related papers (2021-09-14T09:42:37Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.