BROW: Better featuRes fOr Whole slide image based on self-distillation
- URL: http://arxiv.org/abs/2309.08259v1
- Date: Fri, 15 Sep 2023 09:11:09 GMT
- Title: BROW: Better featuRes fOr Whole slide image based on self-distillation
- Authors: Yuanfeng Wu, Shaojie Li, Zhiqiang Du, Wentao Zhu
- Abstract summary: Whole slide image (WSI) processing is becoming a key component of standard clinical diagnosis for various diseases.
The performance of most WSI-related tasks relies on the efficacy of the backbone that extracts WSI patch feature representations.
We propose BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks with little or no fine-tuning.
- Score: 19.295596638166536
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Whole slide image (WSI) processing is becoming a key component of
standard clinical diagnosis for various diseases. However, the direct
application of conventional image processing algorithms to WSIs faces
obstacles because of a distinct property of WSIs: their super-high resolution. The
performance of most WSI-related tasks relies on the efficacy of the backbone
that extracts WSI patch feature representations. Hence, we propose BROW, a
foundation model for extracting better feature representations for WSIs, which
can be conveniently adapted to downstream tasks with little or no
fine-tuning. The model uses a transformer architecture and is pretrained with a
self-distillation framework. To improve the model's robustness, techniques such as
patch shuffling have been employed. Additionally, the model leverages the
unique properties of WSIs, utilizing WSI's multi-scale pyramid to incorporate
an additional global view, thereby further enhancing its performance. We used
both private and public data to make up a large pretraining dataset, containing
more than 11000 slides, over 180M extracted patches, encompassing WSIs related
to various organs and tissues. To assess the effectiveness of BROW, we ran
a wide range of downstream tasks, including slide-level subtyping, patch-level
classification and nuclei instance segmentation. The results confirmed the
efficacy, robustness and good generalization ability of the proposed model.
This substantiates its potential as a foundation model for WSI feature extraction
and highlights promising prospects for its application in WSI processing.
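The abstract names two ingredients, self-distillation pretraining and patch shuffling, without giving details. The sketch below is a minimal illustration of both ideas, not the authors' implementation. It assumes a DINO-style student/teacher setup; the function names (`shuffle_patches`, `distillation_loss`, `ema_update`), the temperatures, and the momentum value are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def shuffle_patches(image, grid, rng):
    """Split a 2D image (list of rows) into grid x grid tiles and permute the tiles.

    Assumes the image height and width are exact multiples of `grid`.
    """
    h, w = len(image), len(image[0])
    th, tw = h // grid, w // grid  # tile height and width
    tiles = [
        [row[c * tw:(c + 1) * tw] for row in image[r * th:(r + 1) * th]]
        for r in range(grid)
        for c in range(grid)
    ]
    rng.shuffle(tiles)
    out = []
    for r in range(grid):
        row_tiles = tiles[r * grid:(r + 1) * grid]
        for y in range(th):
            # Stitch line y of every tile in this tile-row back into one image row.
            out.append([px for tile in row_tiles for px in tile[y]])
    return out


def softmax(logits, temperature):
    """Numerically stable softmax with a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, t_student=0.1, t_teacher=0.04):
    """Cross-entropy of the student output against the sharpened teacher output.

    The temperatures are assumed values in the style of DINO, not from the paper.
    """
    p_teacher = softmax(teacher_logits, t_teacher)
    p_student = softmax(student_logits, t_student)
    # A small epsilon guards against log(0) when a probability underflows.
    return -sum(pt * math.log(ps + 1e-12) for pt, ps in zip(p_teacher, p_student))


def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Move the teacher weights toward the student by an exponential moving average."""
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_weights, student_weights)
    ]
```

In a training loop under these assumptions, the student would see a shuffled view such as `shuffle_patches(img, 2, random.Random(0))`, the loss would be computed against the teacher's output on another view, and `ema_update` would be called after each student optimizer step so that the teacher receives no gradients.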
Related papers
- EXAONEPath 1.0 Patch-level Foundation Model for Pathology [12.179645627327428]
Features extracted from self-supervised models tend to cluster by individual whole slide images (WSIs).
We introduce EXAONEPath, a novel foundational model trained on patches that have undergone stain normalization.
We show that EXAONEPath achieves superior performance relative to the number of WSIs used and the model's parameter count.
arXiv Detail & Related papers (2024-08-01T08:41:13Z)
- FALFormer: Feature-aware Landmarks self-attention for Whole-slide Image Classification [4.064178811354613]
Slide-level classification for whole-slide images (WSIs) has been widely recognized as a crucial problem in digital and computational pathology.
We propose an efficient and effective slide-level classification model, named as FALFormer, that can process a WSI as a whole.
arXiv Detail & Related papers (2024-07-10T03:24:40Z)
- TSI-Bench: Benchmarking Time Series Imputation [52.27004336123575]
TSI-Bench is a comprehensive benchmark suite for time series imputation utilizing deep learning techniques.
The TSI-Bench pipeline standardizes experimental settings to enable fair evaluation of imputation algorithms.
TSI-Bench innovatively provides a systematic paradigm to tailor time series forecasting algorithms for imputation purposes.
arXiv Detail & Related papers (2024-06-18T16:07:33Z)
- MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis.
We represent each WSI as an undirected graph.
To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z)
- A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- ConSlide: Asynchronous Hierarchical Interaction Transformer with Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis [24.078490055421852]
Whole slide image (WSI) analysis has become increasingly important in the medical imaging community.
In this paper, we propose the first continual learning framework for WSI analysis, named ConSlide.
arXiv Detail & Related papers (2023-08-25T11:58:25Z)
- Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification [10.243293283318415]
Multiple Instance Learning (MIL) has shown promising results in digital Pathology Whole Slide Image (WSI) classification.
We propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory.
Our framework is evaluated on five pathology WSI datasets on various WSI heads.
arXiv Detail & Related papers (2023-03-15T08:41:57Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images [8.416553728391309]
We propose a novel two-stage approach to analyze whole slide images (WSIs).
First, we extract a set of representative patches (called mosaic) from a WSI.
Each patch of a mosaic is encoded to a feature vector using a deep network.
In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability.
arXiv Detail & Related papers (2021-06-11T21:59:02Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.