Related papers: BROW: Better featuRes fOr Whole slide image based on self-distillation

URL: http://arxiv.org/abs/2309.08259v1
Date: Fri, 15 Sep 2023 09:11:09 GMT
Title: BROW: Better featuRes fOr Whole slide image based on self-distillation
Authors: Yuanfeng Wu, Shaojie Li, Zhiqiang Du, Wentao Zhu
Abstract summary: Whole slide image (WSI) processing is becoming a key component of standard clinical diagnosis for various diseases. The performance of most WSI-related tasks relies on the efficacy of the backbone that extracts WSI patch feature representations. We propose BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks with little or no fine-tuning.
Score: 19.295596638166536
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Whole slide image (WSI) processing is becoming a key component of standard clinical diagnosis for various diseases. However, directly applying conventional image processing algorithms to WSIs is difficult because of their distinctive property: extremely high resolution. The performance of most WSI-related tasks relies on the efficacy of the backbone that extracts WSI patch feature representations. Hence, we propose BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks with little or no fine-tuning. The model adopts a transformer architecture and is pretrained using a self-distillation framework. To improve the model's robustness, techniques such as patch shuffling are employed. Additionally, the model leverages the unique properties of WSIs, using the WSI's multi-scale pyramid to incorporate an additional global view, thereby further enhancing its performance. We used both private and public data to build a large pretraining dataset containing more than 11,000 slides and over 180M extracted patches, encompassing WSIs of various organs and tissues. To assess the effectiveness of BROW, we ran a wide range of downstream tasks, including slide-level subtyping, patch-level classification, and nuclei instance segmentation. The results confirm the efficacy, robustness, and good generalization ability of the proposed model. This substantiates its potential as a foundation model for WSI feature extraction and highlights promising prospects for its application in WSI processing.

Related papers

Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology [2.0088541799100392]
A crucial step to efficiently integrate Whole Slide Images (WSIs) in computational pathology is assigning a single high-quality feature vector, i.e., one embedding, to each WSI. In this paper, we evaluate the WSI search performance of multiple recently developed aggregation techniques.
arXiv Detail & Related papers (2025-01-29T18:14:51Z)
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z)
Promptable Representation Distribution Learning and Data Augmentation for Gigapixel Histopathology WSI Analysis [7.823674912857107]
We propose a Promptable Representation Distribution Learning framework (PRDL) for both patch-level representation learning and WSI-level data augmentation. The proposed method stably outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-12-19T02:47:17Z)
EXAONEPath 1.0 Patch-level Foundation Model for Pathology [12.179645627327428]
Features extracted from self-supervised models tend to cluster by individual whole slide images (WSIs). We introduce EXAONEPath, a novel foundational model trained on patches that have undergone stain normalization. We show that EXAONEPath achieves superior performance relative to the number of WSIs used and the model's parameter count.
arXiv Detail & Related papers (2024-08-01T08:41:13Z)
FALFormer: Feature-aware Landmarks self-attention for Whole-slide Image Classification [4.064178811354613]
Slide-level classification for whole-slide images (WSIs) has been widely recognized as a crucial problem in digital and computational pathology. We propose an efficient and effective slide-level classification model, named FALFormer, that can process a WSI as a whole.
arXiv Detail & Related papers (2024-07-10T03:24:40Z)
TSI-Bench: Benchmarking Time Series Imputation [52.27004336123575]
TSI-Bench is a comprehensive benchmark suite for time series imputation utilizing deep learning techniques. The TSI-Bench pipeline standardizes experimental settings to enable fair evaluation of imputation algorithms. TSI-Bench innovatively provides a systematic paradigm to tailor time series forecasting algorithms for imputation purposes.
arXiv Detail & Related papers (2024-06-18T16:07:33Z)
MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [56.37780601189795]
We propose a framework named MamMIL for WSI analysis. We represent each WSI as an undirected graph. To address the problem that Mamba can only process 1D sequences, we propose a topology-aware scanning mechanism.
arXiv Detail & Related papers (2024-03-08T09:02:13Z)
A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images. We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z)
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing. Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery. We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
ConSlide: Asynchronous Hierarchical Interaction Transformer with Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis [24.078490055421852]
Whole slide image (WSI) analysis has become increasingly important in the medical imaging community. In this paper, we propose the FIRST continual learning framework for WSI analysis, named ConSlide.
arXiv Detail & Related papers (2023-08-25T11:58:25Z)
Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification [10.243293283318415]
Multiple Instance Learning (MIL) has shown promising results in digital Pathology Whole Slide Image (WSI) classification. We propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory. Our framework is evaluated on five pathology WSI datasets on various WSI heads.
arXiv Detail & Related papers (2023-03-15T08:41:57Z)
Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representations of gigapixel-level whole slide pathology images (WSIs) for downstream tasks is critical. This paper proposes a hierarchical multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes. Our architecture requires fewer GPU resources than benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images [8.416553728391309]
We propose a novel two-stage approach to analyze whole slide images (WSIs). First, we extract a set of representative patches (called a mosaic) from a WSI. Each patch of a mosaic is encoded to a feature vector using a deep network. In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability.
arXiv Detail & Related papers (2021-06-11T21:59:02Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.