Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology
- URL: http://arxiv.org/abs/2501.17822v1
- Date: Wed, 29 Jan 2025 18:14:51 GMT
- Title: Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology
- Authors: Sobhan Hemati, Ghazal Alabtah, Saghir Alfasly, H. R. Tizhoosh
- Abstract summary: A crucial step to efficiently integrate Whole Slide Images (WSIs) in computational pathology is assigning a single high-quality feature vector, i.e., one embedding, to each WSI.
In this paper, we evaluate the WSI search performance of multiple recently developed aggregation techniques.
- Score: 2.0088541799100392
- Abstract: A crucial step to efficiently integrate Whole Slide Images (WSIs) in computational pathology is assigning a single high-quality feature vector, i.e., one embedding, to each WSI. With the existence of many pre-trained deep neural networks and the emergence of foundation models, extracting embeddings for sub-images (i.e., tiles or patches) is straightforward. However, given the high resolution and gigapixel nature of WSIs, feeding one into existing GPUs as a single image is not feasible. As a result, WSIs are usually split into many patches. Feeding each patch to a pre-trained model yields one embedding per patch, so each WSI can be represented by a set of patches and hence by a set of embeddings. In such a setup, WSI representation learning reduces to set representation learning, where for each WSI we have access to a set of patch embeddings. To obtain a single embedding from a set of patch embeddings for each WSI, multiple set-based learning schemes have been proposed in the literature. In this paper, we evaluate the WSI search performance of multiple recently developed aggregation techniques (mainly set representation learning techniques), including simple average or max pooling operations, Deep Sets, Memory networks, Focal attention, Gaussian Mixture Model (GMM) Fisher Vector, and deep sparse and binary Fisher Vector, on four different primary sites from TCGA: bladder, breast, kidney, and colon. Further, we benchmark the search performance of these methods against the median of minimum distances of patch embeddings, a non-aggregating approach used for WSI retrieval.
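As a rough illustration of the simplest schemes named in the abstract, the sketch below applies average pooling and max pooling to a set of patch embeddings and computes a median-of-minimum-distances score between two WSIs. It is not the authors' code: the function names, the use of NumPy, the embedding sizes, and the choice of Euclidean distance are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def mean_pool(patches: np.ndarray) -> np.ndarray:
    """Average pooling: one vector per WSI from an (n_patches, dim) array."""
    return patches.mean(axis=0)


def max_pool(patches: np.ndarray) -> np.ndarray:
    """Max pooling: element-wise maximum over the patch axis."""
    return patches.max(axis=0)


def median_of_min_distances(query: np.ndarray, reference: np.ndarray) -> float:
    """Non-aggregating WSI-to-WSI score.

    For every query patch, take the distance to its closest reference patch,
    then return the median of those minima. Euclidean distance is an
    assumption here; other metrics could be substituted.
    """
    # Pairwise differences via broadcasting: shape (n_query, n_reference, dim).
    diffs = query[:, None, :] - reference[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return float(np.median(dists.min(axis=1)))


if __name__ == "__main__":
    # Hypothetical data: two WSIs with different patch counts, 8-dim embeddings.
    rng = np.random.default_rng(0)
    wsi_a = rng.normal(size=(50, 8))
    wsi_b = rng.normal(size=(70, 8))

    print("mean-pooled shape:", mean_pool(wsi_a).shape)
    print("max-pooled shape:", max_pool(wsi_a).shape)
    print("median of min distances:", median_of_min_distances(wsi_a, wsi_b))
```

Both pooling functions are permutation-invariant and return one fixed-length vector regardless of how many patches a WSI has, which is what makes single-vector search possible. The median-of-minimum score instead compares the two patch sets directly, so its cost grows with the product of the patch counts.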
Related papers
- Scalable Whole Slide Image Representation Using K-Mean Clustering and Fisher Vector Aggregation [2.822194296769473]
Whole slide images (WSIs) are high-resolution, gigapixel-sized images that pose significant computational challenges.
We present a scalable and efficient methodology for WSI classification by leveraging patch-based feature extraction, clustering, and Fisher encoding.
Our method captures local and global tissue structures and yields robust performance for large-scale WSI classification.
arXiv Detail & Related papers (2025-01-21T12:22:15Z)
- A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z)
- BROW: Better featuRes fOr Whole slide image based on self-distillation [19.295596638166536]
Whole slide image (WSI) processing is becoming a key component of standard clinical diagnosis for various diseases.
The performance of most WSI-related tasks relies on the efficacy of the backbone which extracts WSI patch feature representations.
We propose BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks with little or no fine-tuning.
arXiv Detail & Related papers (2023-09-15T09:11:09Z)
- ProtoDiv: Prototype-guided Division of Consistent Pseudo-bags for Whole-slide Image Classification [5.836559246348487]
The pseudo-bag dividing scheme, often crucial for classification performance, is still an open topic worth exploring.
This paper proposes a novel scheme, ProtoDiv, using a bag prototype to guide the division of WSI pseudo-bags.
arXiv Detail & Related papers (2023-04-13T16:27:08Z)
- Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representations of gigapixel-level whole slide pathology images (WSIs) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z)
- Learning Binary and Sparse Permutation-Invariant Representations for Fast and Memory Efficient Whole Slide Image Search [3.2580463372881234]
We propose a novel framework for learning binary and sparse WSI representations utilizing deep generative modelling and the Fisher Vector.
We introduce new loss functions for learning sparse and binary permutation-invariant WSI representations that employ instance-based training.
The proposed method outperforms Yottixel (a recent search engine for histopathology images) both in terms of retrieval accuracy and speed.
arXiv Detail & Related papers (2022-08-29T14:56:36Z)
- Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction [138.04956118993934]
We propose a novel Transformer-based method, coarse-to-fine sparse Transformer (CST).
CST embeds HSI sparsity into deep learning for HSI reconstruction.
In particular, CST uses our proposed spectra-aware screening mechanism (SASM) for coarse patch selecting. Then the selected patches are fed into our customized spectra-aggregation hashing multi-head self-attention (SAH-MSA) for fine pixel clustering and self-similarity capturing.
arXiv Detail & Related papers (2022-03-09T16:17:47Z)
- Learning Prototype-oriented Set Representations for Meta-Learning [85.19407183975802]
Learning from set-structured data is a fundamental problem that has recently attracted increasing attention.
This paper provides a novel optimal transport based way to improve existing summary networks.
We further instantiate it to the cases of few-shot classification and implicit meta generative modeling.
arXiv Detail & Related papers (2021-10-18T09:49:05Z)
- DSNet: A Dual-Stream Framework for Weakly-Supervised Gigapixel Pathology Image Analysis [78.78181964748144]
We present a novel weakly-supervised framework for classifying whole slide images (WSIs).
WSIs are commonly processed by patch-wise classification with patch-level labels.
With image-level labels only, patch-wise classification would be sub-optimal due to inconsistency between the patch appearance and image-level label.
arXiv Detail & Related papers (2021-09-13T09:10:43Z)
- Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images [8.416553728391309]
We propose a novel two-stage approach to analyze whole slide images (WSIs).
First, we extract a set of representative patches (called mosaic) from a WSI.
Each patch of a mosaic is encoded to a feature vector using a deep network.
In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability.
arXiv Detail & Related papers (2021-06-11T21:59:02Z)
- CARAFE++: Unified Content-Aware ReAssembly of FEatures [132.49582482421246]
We propose unified Content-Aware ReAssembly of FEatures (CARAFE++), a universal, lightweight and highly effective operator to fulfill this goal.
CARAFE++ generates adaptive kernels on-the-fly to enable instance-specific content-aware handling.
It shows consistent and substantial gains across all the tasks with negligible computational overhead.
arXiv Detail & Related papers (2020-12-07T07:34:57Z)