Towards Hierarchical Regional Transformer-based Multiple Instance
Learning
- URL: http://arxiv.org/abs/2308.12634v2
- Date: Mon, 20 Nov 2023 10:06:03 GMT
- Title: Towards Hierarchical Regional Transformer-based Multiple Instance
Learning
- Authors: Josef Cersovsky, Sadegh Mohammadi, Dagmar Kainmueller and Johannes
Hoehne
- Abstract summary: We propose a Transformer-based multiple instance learning approach that replaces the traditional learned attention mechanism with a regional, Vision Transformer-inspired self-attention mechanism.
We present a method that fuses regional patch information to derive slide-level predictions and show how this regional aggregation can be stacked to hierarchically process features on different distance levels.
Our approach significantly improves performance over the baseline on two histopathology datasets and points towards promising directions for further research.
- Score: 2.16656895298847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The classification of gigapixel histopathology images with deep multiple
instance learning models has become a critical task in digital pathology and
precision medicine. In this work, we propose a Transformer-based multiple
instance learning approach that replaces the traditional learned attention
mechanism with a regional, Vision Transformer-inspired self-attention
mechanism. We present a method that fuses regional patch information to derive
slide-level predictions and show how this regional aggregation can be stacked
to hierarchically process features on different distance levels. To increase
predictive accuracy, especially for datasets with small, local morphological
features, we introduce a method to focus the image processing on high attention
regions during inference. Our approach significantly improves performance over
the baseline on two histopathology datasets and points towards promising
directions for further research.
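The abstract describes grouping patch embeddings into regions, applying self-attention within each region, and stacking this regional aggregation hierarchically to reach a slide-level representation. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the region size, mean pooling, single-head attention, and all weight shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    # single-head scaled dot-product self-attention over a set of tokens
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v

def regional_aggregate(tokens, region_size, wq, wk, wv):
    # group consecutive tokens into regions, attend within each region,
    # then mean-pool each region into a single token
    regions = [tokens[i:i + region_size]
               for i in range(0, len(tokens), region_size)]
    return np.stack([self_attention(r, wq, wk, wv).mean(axis=0)
                     for r in regions])

rng = np.random.default_rng(0)
d = 16
patches = rng.normal(size=(64, d))            # 64 patch embeddings of one slide
wq, wk, wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# stacking the same aggregation step processes features at growing distance levels
level1 = regional_aggregate(patches, region_size=8, wq=wq, wk=wk, wv=wv)  # (8, d)
level2 = regional_aggregate(level1, region_size=8, wq=wq, wk=wk, wv=wv)   # (1, d)
slide_embedding = level2[0]                   # slide-level representation
```

In a real model each level would have its own learned weights and a classification head on the slide embedding; sharing one random weight set here only keeps the sketch short.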
Related papers
- CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation [60.08541107831459]
This paper proposes a CNN-Transformer rectified collaborative learning framework to learn stronger CNN-based and Transformer-based models for medical image segmentation.
Specifically, we propose a rectified logit-wise collaborative learning (RLCL) strategy which introduces the ground truth to adaptively select and rectify the wrong regions in student soft labels.
We also propose a class-aware feature-wise collaborative learning (CFCL) strategy to achieve effective knowledge transfer between CNN-based and Transformer-based models in the feature space.
arXiv Detail & Related papers (2024-08-25T01:27:35Z) - Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z) - SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical
Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but also captures long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z) - Analyzing the Domain Shift Immunity of Deep Homography Estimation [1.4607247979144045]
CNN-driven homography estimation models show a distinctive immunity to domain shifts.
This study explores the resilience of a variety of deep homography estimation models to domain shifts.
arXiv Detail & Related papers (2023-04-19T21:28:31Z) - Unsupervised Domain Transfer with Conditional Invertible Neural Networks [83.90291882730925]
We propose a domain transfer approach based on conditional invertible neural networks (cINNs).
Our method inherently guarantees cycle consistency through its invertible architecture, and network training can efficiently be conducted with maximum likelihood.
Our method enables the generation of realistic spectral data and outperforms the state of the art on two downstream classification tasks.
arXiv Detail & Related papers (2023-03-17T18:00:27Z) - Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z) - AbHE: All Attention-based Homography Estimation [0.0]
We propose a strong-baseline model based on the Swin Transformer, which combines a convolutional neural network for local features with a Transformer module for global features.
In the homography regression stage, we adopt an attention layer over the channels of the correlation volume, which can discard weakly correlated feature points.
Experiments show that in 8-degree-of-freedom (8-DOF) homography estimation, our method outperforms the state-of-the-art method.
arXiv Detail & Related papers (2022-12-06T15:00:00Z) - ScoreNet: Learning Non-Uniform Attention and Augmentation for
Transformer-Based Histopathological Image Classification [11.680355561258427]
High-resolution images hinder progress in digital pathology; patch-based processing often incorporates multiple instance learning (MIL) to aggregate local patch-level representations into an image-level prediction.
This paper proposes a transformer-based architecture specifically tailored for histological image classification.
It combines fine-grained local attention with a coarse global attention mechanism to learn meaningful representations of high-resolution images at an efficient computational cost.
arXiv Detail & Related papers (2022-02-15T16:55:09Z) - Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z) - Unsupervised Bidirectional Cross-Modality Adaptation via Deeply
Synergistic Image and Feature Alignment for Medical Image Segmentation [73.84166499988443]
We present a novel unsupervised domain adaptation framework, named Synergistic Image and Feature Alignment (SIFA).
Our proposed SIFA conducts synergistic alignment of domains from both image and feature perspectives.
Experimental results on two different tasks demonstrate that our SIFA method is effective in improving segmentation performance on unlabeled target images.
arXiv Detail & Related papers (2020-02-06T13:49:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.