HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization
- URL: http://arxiv.org/abs/2407.03307v1
- Date: Wed, 3 Jul 2024 17:49:31 GMT
- Title: HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization
- Authors: Yucheng Tang, Yufan He, Vishwesh Nath, Pengfeig Guo, Ruining Deng, Tianyuan Yao, Quan Liu, Can Cui, Mengmeng Yin, Ziyue Xu, Holger Roth, Daguang Xu, Haichun Yang, Yuankai Huo,
- Abstract summary: In digital pathology, the traditional method for deep learning-based image segmentation typically involves a two-stage process.
We propose the holistic histopathology (HoloHisto) segmentation method to achieve end-to-end segmentation on gigapixel WSIs.
Under the HoloHisto platform, we unveil a random 4K sampler that transcends ultra-high resolution, delivering 31 and 10 times more pixels than standard 2D and 3D patches.
- Score: 21.1691961979094
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In digital pathology, the traditional method for deep learning-based image segmentation typically involves a two-stage process: initially segmenting high-resolution whole slide images (WSI) into smaller patches (e.g., 256x256, 512x512, 1024x1024) and subsequently reconstructing them to their original scale. This method often struggles to capture the complex details and vast scope of WSIs. In this paper, we propose the holistic histopathology (HoloHisto) segmentation method to achieve end-to-end segmentation on gigapixel WSIs, whose maximum resolution is above 80,000$\times$70,000 pixels. HoloHisto fundamentally shifts the paradigm of WSI segmentation to an end-to-end learning fashion with 1) a large (4K) resolution base patch for elevated visual information inclusion and efficient processing, and 2) a novel sequential tokenization mechanism to properly model the contextual relationships and efficiently model the rich information from the 4K input. To our best knowledge, HoloHisto presents the first holistic approach for gigapixel resolution WSI segmentation, supporting direct I/O of complete WSI and their corresponding gigapixel masks. Under the HoloHisto platform, we unveil a random 4K sampler that transcends ultra-high resolution, delivering 31 and 10 times more pixels than standard 2D and 3D patches, respectively, for advancing computational capabilities. To facilitate efficient 4K resolution dense prediction, we leverage sequential tokenization, utilizing a pre-trained image tokenizer to group image features into a discrete token grid. To assess the performance, our team curated a new kidney pathology image segmentation (KPIs) dataset with WSI-level glomeruli segmentation from whole mouse kidneys. From the results, HoloHisto-4K delivers remarkable performance gains over previous state-of-the-art models.
Related papers
- xT: Nested Tokenization for Larger Context in Large Images [79.37673340393475]
xT is a framework for vision transformers which aggregates global context with local details.
We are able to increase accuracy by up to 8.6% on challenging classification tasks.
arXiv Detail & Related papers (2024-03-04T10:29:58Z) - ClusterFormer: Clustering As A Universal Visual Learner [80.79669078819562]
CLUSTERFORMER is a universal vision model based on the CLUSTERing paradigm with TransFORMER.
It is capable of tackling heterogeneous vision tasks with varying levels of clustering granularity.
For its efficacy, we hope our work can catalyze a paradigm shift in universal models in computer vision.
arXiv Detail & Related papers (2023-09-22T22:12:30Z) - Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z) - CUF: Continuous Upsampling Filters [25.584630142930123]
In this paper, we consider one of the most important operations in image processing: upsampling.
We propose to parameterize upsampling kernels as neural fields.
This parameterization leads to a compact architecture that obtains a 40-fold reduction in the number of parameters when compared with competing arbitrary-scale super-resolution architectures.
arXiv Detail & Related papers (2022-10-13T12:45:51Z) - One-shot Ultra-high-Resolution Generative Adversarial Network That
Synthesizes 16K Images On A Single GPU [1.9060575156739825]
OUR-GAN is a one-shot generative adversarial network framework that generates non-repetitive 16K images from a single training image.
OUR-GAN can synthesize high-quality 16K images with 12.5 GB of GPU memory and 4K images with only 4.29 GB.
OUR-GAN is the first one-shot image synthesizer that generates non-repetitive UHR images on a single consumer GPU.
arXiv Detail & Related papers (2022-02-28T13:48:41Z) - Superpixel Pre-Segmentation of HER2 Slides for Efficient Annotation [5.6032514243845455]
We use superpixel approaches to compute a pre-segmentation of HER2 stained images for breast cancer diagnosis.
These evaluations show encouraging first results for a pre-segmentation for efficient manual refinement.
arXiv Detail & Related papers (2022-01-19T12:51:33Z) - Dense Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
We propose a few-shot segmentation method based on dense Gaussian process (GP) regression.
We exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP.
Our approach sets a new state-of-the-art for both 1-shot and 5-shot FSS on the PASCAL-5$i$ and COCO-20$i$ benchmarks.
arXiv Detail & Related papers (2021-10-07T17:57:54Z) - Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z) - PSGR: Pixel-wise Sparse Graph Reasoning for COVID-19 Pneumonia
Segmentation in CT Images [83.26057031236965]
We propose a pixel-wise sparse graph reasoning (PSGR) module to enhance the modeling of long-range dependencies for COVID-19 infected region segmentation in CT images.
The PSGR module avoids imprecise pixel-to-node projections and preserves the inherent information of each pixel for global reasoning.
The solution has been evaluated against four widely-used segmentation models on three public datasets.
arXiv Detail & Related papers (2021-08-09T04:58:23Z) - An Efficient Cervical Whole Slide Image Analysis Framework Based on
Multi-scale Semantic and Spatial Features using Deep Learning [2.7218168309244652]
This study designs a novel inline connection network (InCNet) by enriching the multi-scale connectivity to build the lightweight model named You Only Look Cytopathology Once (YOLCO)
The proposed model allows the input size enlarged to megapixel that can stitch the WSI without any overlap by the average repeats.
Based on Transformer for classifying the integrated multi-scale multi-task features, the experimental results appear $0.872$ AUC score better and $2.51times$ faster than the best conventional method in WSI classification.
arXiv Detail & Related papers (2021-06-29T06:24:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.