Factorizer: A Scalable Interpretable Approach to Context Modeling for
Medical Image Segmentation
- URL: http://arxiv.org/abs/2202.12295v2
- Date: Mon, 28 Feb 2022 17:23:16 GMT
- Title: Factorizer: A Scalable Interpretable Approach to Context Modeling for
Medical Image Segmentation
- Authors: Pooya Ashtari, Diana Sima, Lieven De Lathauwer, Dominique
Sappey-Marinier, Frederik Maes, and Sabine Van Huffel
- Abstract summary: This work introduces a family of models, dubbed Factorizer, which leverages the power of low-rank matrix factorization for constructing an end-to-end segmentation model.
Specifically, we propose a linearly scalable approach to context modeling, formulating Nonnegative Matrix Factorization (NMF) as a differentiable layer integrated into a U-shaped architecture.
Factorizers compete favorably with CNNs and Transformers in terms of accuracy, scalability, and interpretability.
- Score: 6.030648996110607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) with U-shaped architectures have
dominated medical image segmentation, which is crucial for various clinical
purposes. However, the inherent locality of convolution makes CNNs fail to
fully exploit global context, essential for better recognition of some
structures, e.g., brain lesions. Transformers have recently shown promising
performance on vision tasks, including semantic segmentation, mainly due to
their capability of modeling long-range dependencies. Nevertheless, the
quadratic complexity of attention makes existing Transformer-based models use
self-attention layers only after somehow reducing the image resolution, which
limits the ability to capture global contexts present at higher resolutions.
Therefore, this work introduces a family of models, dubbed Factorizer, which
leverages the power of low-rank matrix factorization for constructing an
end-to-end segmentation model. Specifically, we propose a linearly scalable
approach to context modeling, formulating Nonnegative Matrix Factorization
(NMF) as a differentiable layer integrated into a U-shaped architecture. The
shifted window technique is also utilized in combination with NMF to
effectively aggregate local information. Factorizers compete favorably with
CNNs and Transformers in terms of accuracy, scalability, and interpretability,
achieving state-of-the-art results on the BraTS dataset for brain tumor
segmentation, with Dice scores of 79.33%, 83.14%, and 90.16% for enhancing
tumor, tumor core, and whole tumor, respectively. Highly meaningful NMF
components give an additional interpretability advantage to Factorizers over
CNNs and Transformers. Moreover, our ablation studies reveal a distinctive
feature of Factorizers that enables a significant speed-up in inference for a
trained Factorizer without any extra steps and without sacrificing much
accuracy.
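As a rough illustration of the factorization the abstract refers to, the sketch below approximates a nonnegative matrix X as W @ H using the classic multiplicative-update rule for the Frobenius objective. The abstract does not state which NMF solver the paper uses or how the layer is wired into the network, so the update rule, the function name nmf_multiplicative, and all parameters here are assumptions chosen for illustration only.

```python
import numpy as np


def nmf_multiplicative(X, rank, n_iters=200, eps=1e-8, seed=0):
    """Approximate a nonnegative matrix X (m x n) as W @ H with W, H >= 0.

    Hypothetical helper: uses multiplicative updates, which is an assumption
    and not necessarily the solver used in the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Strictly positive initialization so the updates cannot get stuck at zero.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iters):
        # Each update multiplies by a nonnegative ratio, so W and H stay >= 0.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


# Synthetic nonnegative matrix of exact rank 3 (illustrative data only).
rng = np.random.default_rng(1)
X = rng.random((64, 3)) @ rng.random((3, 32))

W, H = nmf_multiplicative(X, rank=3)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Each iteration costs O(m n r) for an m x n input and rank r, so for a fixed m and r the work grows linearly with the number of columns n. Because every step is built from matrix products and elementwise operations, a fixed number of such iterations could in principle be unrolled and differentiated through in an autodiff framework.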
Related papers
- MambaClinix: Hierarchical Gated Convolution and Mamba-Based U-Net for Enhanced 3D Medical Image Segmentation [6.673169053236727]
We propose MambaClinix, a novel U-shaped architecture for medical image segmentation.
MambaClinix integrates a hierarchical gated convolutional network with Mamba in an adaptive stage-wise framework.
Our results show that MambaClinix achieves high segmentation accuracy while maintaining low model complexity.
arXiv Detail & Related papers (2024-09-19T07:51:14Z)
- CNN-Transformer Rectified Collaborative Learning for Medical Image Segmentation [60.08541107831459]
This paper proposes a CNN-Transformer rectified collaborative learning framework to learn stronger CNN-based and Transformer-based models for medical image segmentation.
Specifically, we propose a rectified logit-wise collaborative learning (RLCL) strategy which introduces the ground truth to adaptively select and rectify the wrong regions in student soft labels.
We also propose a class-aware feature-wise collaborative learning (CFCL) strategy to achieve effective knowledge transfer between CNN-based and Transformer-based models in the feature space.
arXiv Detail & Related papers (2024-08-25T01:27:35Z)
- CSWin-UNet: Transformer UNet with Cross-Shaped Windows for Medical Image Segmentation [22.645013853519]
CSWin-UNet is a novel U-shaped segmentation method that incorporates the CSWin self-attention mechanism into the UNet.
Our empirical evaluations on diverse datasets, including synapse multi-organ CT, cardiac MRI, and skin lesions, demonstrate that CSWin-UNet maintains low model complexity while delivering high segmentation accuracy.
arXiv Detail & Related papers (2024-07-25T14:25:17Z)
- Flattening Singular Values of Factorized Convolution for Medical Images [2.41019965808244]
Convolutional neural networks (CNNs) have long been the paradigm of choice for robust medical image processing (MIP).
Many methods employ factorized convolutional layers to alleviate the burden of limited computational resources.
We propose a Singular value equalization generalizer-induced Factorized Convolution (SFConv) to improve the expressive power of factorized convolutions in MIP models.
arXiv Detail & Related papers (2024-03-01T15:30:50Z)
- SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation [0.0]
We propose a simple yet effective UNet-Transformer (seUNet-Trans) model for medical image segmentation.
In our approach, the UNet model is designed as a feature extractor to generate multiple feature maps from the input images.
By leveraging the UNet architecture and the self-attention mechanism, our model not only preserves both local and global context information but is also capable of capturing long-range dependencies between input elements.
arXiv Detail & Related papers (2023-10-16T01:13:38Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information, to the extent that it can achieve the same performance with as little as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
- Cross-receptive Focused Inference Network for Lightweight Image Super-Resolution [64.25751738088015]
Transformer-based methods have shown impressive performance in single image super-resolution (SISR) tasks.
Transformers that need to incorporate contextual information to extract features dynamically are neglected.
We propose a lightweight Cross-receptive Focused Inference Network (CFIN) that consists of a cascade of CT Blocks mixed with CNN and Transformer.
arXiv Detail & Related papers (2022-07-06T16:32:29Z)
- MISSU: 3D Medical Image Segmentation via Self-distilling TransUNet [55.16833099336073]
We propose to self-distill a Transformer-based UNet for medical image segmentation.
It simultaneously learns global semantic information and local spatial-detailed features.
Our MISSU achieves the best performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T07:38:53Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.