ConSlide: Asynchronous Hierarchical Interaction Transformer with
Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis
- URL: http://arxiv.org/abs/2308.13324v1
- Date: Fri, 25 Aug 2023 11:58:25 GMT
- Title: ConSlide: Asynchronous Hierarchical Interaction Transformer with
Breakup-Reorganize Rehearsal for Continual Whole Slide Image Analysis
- Authors: Yanyan Huang, Weiqin Zhao, Shujun Wang, Yu Fu, Yuming Jiang, Lequan Yu
- Abstract summary: Whole slide image (WSI) analysis has become increasingly important in the medical imaging community.
In this paper, we propose the FIRST continual learning framework for WSI analysis, named ConSlide.
- Score: 24.078490055421852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Whole slide image (WSI) analysis has become increasingly important in the
medical imaging community, enabling automated and objective diagnosis,
prognosis, and therapeutic-response prediction. However, in clinical practice,
the ever-evolving environment hampers the utility of WSI analysis models. In
this paper, we propose the FIRST continual learning framework for WSI analysis,
named ConSlide, to tackle the challenges of enormous image size, utilization of
hierarchical structure, and catastrophic forgetting by progressive model
updating on multiple sequential datasets. Our framework contains three key
components. The Hierarchical Interaction Transformer (HIT) is proposed to model
and utilize the hierarchical structural knowledge of WSI. The
Breakup-Reorganize (BuRo) rehearsal method is developed for WSI data replay
with efficient region storing buffer and WSI reorganizing operation. The
asynchronous updating mechanism is devised to encourage the network to learn
generic and specific knowledge respectively during the replay stage, based on a
nested cross-scale similarity learning (CSSL) module. We evaluated the proposed
ConSlide on four public WSI datasets from TCGA projects. It outperforms other
state-of-the-art methods under a fair WSI-based continual learning setting and
achieves a better trade-off between overall performance and forgetting on
previous tasks.
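To make the Breakup-Reorganize (BuRo) rehearsal idea more concrete, below is a minimal sketch of a region-level replay buffer, assuming regions arrive as pre-extracted patch-feature arrays. The class name BuRoBuffer, the per-class memory budget, and the random sampling policy are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict
from typing import List

import numpy as np


class BuRoBuffer:
    """Toy region-level rehearsal buffer in the spirit of BuRo (illustrative only)."""

    def __init__(self, regions_per_slide: int = 8, max_slides_per_class: int = 50):
        self.regions_per_slide = regions_per_slide
        self.max_slides_per_class = max_slides_per_class
        # class label -> list of region feature arrays, each of shape (n_patches, feat_dim)
        self.regions = defaultdict(list)

    def breakup_and_store(self, wsi_regions: List[np.ndarray], label: int) -> None:
        # "Breakup": keep only a random subset of a slide's regions instead of the whole WSI.
        kept = random.sample(wsi_regions, min(self.regions_per_slide, len(wsi_regions)))
        self.regions[label].extend(kept)
        # Enforce a simple per-class memory budget by dropping the oldest regions.
        budget = self.max_slides_per_class * self.regions_per_slide
        self.regions[label] = self.regions[label][-budget:]

    def reorganize(self, label: int, n_regions: int = 8) -> np.ndarray:
        # "Reorganize": assemble a pseudo-WSI bag from buffered regions of the same class.
        pool = self.regions[label]
        picked = random.sample(pool, min(n_regions, len(pool)))
        return np.concatenate(picked, axis=0)  # (total_patches, feat_dim) bag for replay

    def sample_replay_batch(self, batch_size: int):
        # Draw (pseudo-WSI, label) pairs to mix with current-task data during replay.
        labels = [lbl for lbl, pool in self.regions.items() if pool]
        return [(self.reorganize(lbl), lbl) for lbl in random.choices(labels, k=batch_size)]
```

In this sketch, a replay step would mix the output of sample_replay_batch with the current task's WSIs before passing both through the backbone, which is where the asynchronous updating and cross-scale similarity learning described above would apply.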
Related papers
- Meta-Exploiting Frequency Prior for Cross-Domain Few-Shot Learning [86.99944014645322]
We introduce a novel framework, Meta-Exploiting Frequency Prior for Cross-Domain Few-Shot Learning.
We decompose each query image into its high-frequency and low-frequency components and incorporate them into the feature embedding network in parallel.
Our framework establishes new state-of-the-art results on multiple cross-domain few-shot learning benchmarks.
arXiv Detail & Related papers (2024-11-03T04:02:35Z) - Dynamic Graph Representation with Knowledge-aware Attention for
Histopathology Whole Slide Image Analysis [11.353826466710398]
We propose a novel dynamic graph representation algorithm that conceptualizes WSIs as a knowledge graph structure.
Specifically, we dynamically construct neighbors and directed edge embeddings based on the head and tail relationships between instances.
Our end-to-end graph representation learning approach has outperformed the state-of-the-art WSI analysis methods on three TCGA benchmark datasets and in-house test sets.
arXiv Detail & Related papers (2024-03-12T14:58:51Z) - A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z) - BROW: Better featuRes fOr Whole slide image based on self-distillation [19.295596638166536]
Whole slide image (WSI) processing is becoming part of the key components of standard clinical diagnosis for various diseases.
The performance of most WSI-related tasks relies on the efficacy of the backbone which extracts WSI patch feature representations.
We proposed BROW, a foundation model for extracting better feature representations for WSIs, which can be conveniently adapted to downstream tasks with little or no fine-tuning.
arXiv Detail & Related papers (2023-09-15T09:11:09Z) - Revisiting the Encoding of Satellite Image Time Series [2.5874041837241304]
Satellite Image Time Series (SITS) temporal learning is complex due to high temporal resolutions and irregular acquisition times.
We develop a novel perspective of SITS processing as a direct set prediction problem, inspired by the recent trend in adopting query-based transformer decoders.
We attain new state-of-the-art (SOTA) results on the Satellite PASTIS benchmark dataset.
arXiv Detail & Related papers (2023-05-03T12:44:20Z) - Task-specific Fine-tuning via Variational Information Bottleneck for
Weakly-supervised Pathology Whole Slide Image Classification [10.243293283318415]
Multiple Instance Learning (MIL) has shown promising results in digital Pathology Whole Slide Image (WSI) classification.
We propose an efficient WSI fine-tuning framework motivated by the Information Bottleneck theory.
Our framework is evaluated on five pathology WSI datasets on various WSI heads.
arXiv Detail & Related papers (2023-03-15T08:41:57Z) - Learning Binary and Sparse Permutation-Invariant Representations for
Fast and Memory Efficient Whole Slide Image Search [3.2580463372881234]
We propose a novel framework for learning binary and sparse WSI representations utilizing deep generative modelling and the Fisher Vector.
We introduce new loss functions for learning sparse and binary permutation-invariant WSI representations that employ instance-based training.
The proposed method outperforms Yottixel (a recent search engine for histopathology images) both in terms of retrieval accuracy and speed.
arXiv Detail & Related papers (2022-08-29T14:56:36Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are available only on the source dataset and unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Auto-Panoptic: Cooperative Multi-Component Architecture Search for
Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z) - Meta-learning framework with applications to zero-shot time-series
forecasting [82.61728230984099]
This work provides positive evidence for zero-shot time-series forecasting using a broad meta-learning framework.
Residual connections act as a meta-learning adaptation mechanism.
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
arXiv Detail & Related papers (2020-02-07T16:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.