MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network
Architecture for Medical Image Analysis
- URL: http://arxiv.org/abs/2110.06063v1
- Date: Tue, 12 Oct 2021 15:05:15 GMT
- Title: MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network
Architecture for Medical Image Analysis
- Authors: Hossein Aboutalebi, Maya Pavlova, Hayden Gunraj, Mohammad Javad
Shafiee, Ali Sabri, Amer Alaref, Alexander Wong
- Abstract summary: We introduce MEDUSA, a multi-scale encoder-decoder self-attention mechanism tailored for medical image analysis.
We obtain state-of-the-art performance on challenging medical image analysis benchmarks including COVIDx, RSNA RICORD, and RSNA Pneumonia Challenge.
- Score: 71.2022403915147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image analysis continues to pose interesting challenges given the subtle characteristics of certain diseases and the significant overlap in appearance between diseases. In this work, we explore the concept of self-attention for tackling such subtleties within and between diseases. To this end, we introduce MEDUSA, a multi-scale encoder-decoder self-attention mechanism tailored for medical image analysis. Self-attention deep convolutional neural network architectures in the existing literature center on multiple isolated, lightweight attention mechanisms with limited individual capacity, incorporated at different points in the network architecture. MEDUSA departs significantly from this notion: it possesses a single, unified self-attention mechanism of considerably higher capacity, with multiple attention heads feeding into different scales of the network architecture. To the best of the authors' knowledge, this is the first "single body, multi-scale heads" realization of self-attention; it enables explicit global context to be shared across selective attention at different levels of representational abstraction, while still allowing distinct local attention context at each individual level of abstraction. With MEDUSA, we obtain state-of-the-art performance on multiple challenging medical image analysis benchmarks, including COVIDx, RSNA RICORD, and the RSNA Pneumonia Challenge, compared to previous work. Our MEDUSA model is publicly available.
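The abstract's "single body, multi-scale heads" design suggests one shared, high-capacity attention trunk whose lightweight heads feed attention into the backbone at several scales. Below is a minimal PyTorch sketch of that reading; the layer choices, channel counts, and sigmoid gating are our own illustrative assumptions, not the released MEDUSA implementation.

```python
# Sketch of a "single body, multi-scale heads" self-attention module:
# one shared high-capacity attention body, computed once, with
# lightweight per-scale heads gating the backbone features.
# All names and shapes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSelfAttention(nn.Module):
    def __init__(self, in_channels, scale_channels, body_channels=64):
        super().__init__()
        # Single unified "body": shared global attention representation.
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, body_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(body_channels, body_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # One lightweight head per backbone scale.
        self.heads = nn.ModuleList(
            nn.Conv2d(body_channels, c, 1) for c in scale_channels)

    def forward(self, stem, scale_feats):
        shared = self.body(stem)  # global context, computed once
        gated = []
        for head, feat in zip(self.heads, scale_feats):
            # Resample shared context to this scale's resolution, then
            # gate the scale's features with a sigmoid attention map.
            ctx = F.interpolate(shared, size=feat.shape[-2:],
                                mode="bilinear", align_corners=False)
            gated.append(feat * torch.sigmoid(head(ctx)))
        return gated

# Usage sketch with three hypothetical encoder scales.
msa = MultiScaleSelfAttention(32, scale_channels=[64, 128, 256])
stem = torch.randn(1, 32, 112, 112)
feats = [torch.randn(1, 64, 56, 56), torch.randn(1, 128, 28, 28),
         torch.randn(1, 256, 14, 14)]
print([t.shape for t in msa(stem, feats)])
```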
Related papers
- Multi-modal Cross-domain Self-supervised Pre-training for fMRI and EEG Fusion [3.8153469790341084]
We propose a novel approach that leverages self-supervised learning to synergize multi-modal information across domains.
We construct a large-scale pre-training dataset and pre-train the MCSP model using the proposed self-supervised paradigms.
Our study contributes a significant advancement in the fusion of fMRI and EEG, marking a novel integration of cross-domain features.
arXiv Detail & Related papers (2024-09-27T20:25:17Z)
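As a rough illustration of the cross-domain self-supervised pre-training described above, the sketch below aligns fMRI and EEG embeddings with a symmetric InfoNCE loss; the encoders, dimensions, and loss are generic assumptions, not the MCSP paper's exact objective.

```python
# Hedged sketch of cross-domain self-supervised pre-training: a symmetric
# InfoNCE loss aligning fMRI and EEG embeddings of the same subject/window.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Toy encoder mapping a flat modality vector to a shared space."""
    def __init__(self, in_dim, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, embed_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def cross_modal_infonce(z_fmri, z_eeg, temperature=0.07):
    # Similarity matrix between all fMRI/EEG pairs in the batch;
    # matching pairs sit on the diagonal.
    logits = z_fmri @ z_eeg.t() / temperature
    targets = torch.arange(z_fmri.size(0))
    # Symmetric loss: fMRI->EEG and EEG->fMRI retrieval directions.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

fmri_enc, eeg_enc = ModalityEncoder(400), ModalityEncoder(64)
loss = cross_modal_infonce(fmri_enc(torch.randn(8, 400)),
                           eeg_enc(torch.randn(8, 64)))
print(loss.item())
```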
- Lagrange Duality and Compound Multi-Attention Transformer for Semi-Supervised Medical Image Segmentation [27.758157788769253]
We propose a Lagrange Duality Consistency (LDC) Loss, integrated with Boundary-Aware Contrastive Loss, as the overall training objective for semi-supervised learning.
We also introduce CMAformer, a novel network that synergizes the strengths of ResUNet and Transformer.
Overall, our results indicate that CMAformer, combined with the feature fusion framework and the new consistency loss, demonstrates strong complementarity in semi-supervised learning ensembles.
arXiv Detail & Related papers (2024-09-12T06:52:46Z)
- MindFormer: Semantic Alignment of Multi-Subject fMRI for Brain Decoding [50.55024115943266]
We introduce MindFormer, a novel method for semantically aligning multi-subject fMRI signals.
The model is specifically designed to generate fMRI-conditioned feature vectors that can condition a Stable Diffusion model for fMRI-to-image generation or a large language model (LLM) for fMRI-to-text generation.
Our experimental results demonstrate that MindFormer generates semantically consistent images and text across different subjects.
arXiv Detail & Related papers (2024-05-28T00:36:25Z)
- Multi-modal Graph Neural Network for Early Diagnosis of Alzheimer's Disease from sMRI and PET Scans [11.420077093805382]
We propose to use graph neural networks (GNNs), which are designed to handle problems in non-Euclidean domains.
In this study, we demonstrate how brain networks can be created from sMRI or PET images.
We then present a multi-modal GNN framework in which each modality has its own GNN branch, and we propose a technique to combine the multi-modal data.
arXiv Detail & Related papers (2023-07-31T02:04:05Z)
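A minimal sketch of the per-modality GNN branches with late fusion described in the entry above; the dense GCN layer, mean readout, and concatenation fusion are our assumptions, not the paper's architecture.

```python
# Hedged sketch of a two-branch multi-modal GNN (one branch per imaging
# modality) with late fusion by concatenation.
import torch
import torch.nn as nn

class DenseGCNLayer(nn.Module):
    """One graph-convolution step on a dense, normalized adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Aggregate neighbor features, then transform.
        return torch.relu(self.lin(adj @ x))

class MultiModalGNN(nn.Module):
    def __init__(self, feat_dim, hidden=32, n_classes=2):
        super().__init__()
        self.smri_branch = DenseGCNLayer(feat_dim, hidden)
        self.pet_branch = DenseGCNLayer(feat_dim, hidden)
        self.classifier = nn.Linear(2 * hidden, n_classes)

    def forward(self, x_smri, adj_smri, x_pet, adj_pet):
        # Each modality keeps its own brain graph and GNN branch.
        h1 = self.smri_branch(x_smri, adj_smri).mean(dim=0)  # graph readout
        h2 = self.pet_branch(x_pet, adj_pet).mean(dim=0)
        return self.classifier(torch.cat([h1, h2]))  # fused prediction

n = 90  # e.g., number of brain regions
model = MultiModalGNN(feat_dim=16)
logits = model(torch.randn(n, 16), torch.eye(n),
               torch.randn(n, 16), torch.eye(n))
```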
- K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment [71.27193056354741]
The problem of how to assess cross-modality medical image synthesis has been largely unexplored.
We propose a new metric K-CROSS to spur progress on this challenging problem.
K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location.
arXiv Detail & Related papers (2023-07-10T01:26:48Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution [56.622832383316215]
We propose a novel multi-head convolutional attention module to super-resolve CT and MRI scans.
Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple input tensors.
We introduce multiple attention heads, each head having a distinct receptive field size corresponding to a particular reduction rate for the spatial attention.
arXiv Detail & Related papers (2022-04-08T07:56:55Z)
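A minimal sketch of the multi-head convolutional attention idea above, with one head per kernel size; the head design, reduction rate, and concatenation fusion are assumptions rather than the paper's released module.

```python
# Hedged sketch of multi-head convolutional attention where each head
# uses a different kernel size (receptive field).
import torch
import torch.nn as nn

class ConvAttentionHead(nn.Module):
    """Joint spatial-channel attention via convolution at one kernel size."""
    def __init__(self, channels, kernel_size, reduction=4):
        super().__init__()
        pad = kernel_size // 2
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size, padding=pad),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size, padding=pad),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.attn(x)  # gate input by the attention map

class MultiHeadConvAttention(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.heads = nn.ModuleList(
            ConvAttentionHead(channels, k) for k in kernel_sizes)
        self.fuse = nn.Conv2d(len(kernel_sizes) * channels, channels, 1)

    def forward(self, x):
        # Each head sees a different receptive field; concatenate and fuse.
        return self.fuse(torch.cat([h(x) for h in self.heads], dim=1))

x = torch.randn(1, 32, 64, 64)  # e.g., features of a CT/MRI patch
print(MultiHeadConvAttention(32)(x).shape)
```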
- MHATC: Autism Spectrum Disorder identification utilizing multi-head attention encoder along with temporal consolidation modules [11.344829880346353]
Resting-state fMRI is commonly used to diagnose Autism Spectrum Disorder (ASD) via network-based functional connectivity.
We propose a novel deep learning architecture (MHATC), consisting of multi-head attention and temporal consolidation modules, for classifying individuals with ASD.
arXiv Detail & Related papers (2021-12-27T07:50:16Z)
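A hedged sketch of the MHATC summary above: multi-head self-attention over ROI time series followed by a simple temporal pooling as the "consolidation" step; all layer choices here are assumptions, not the authors' model.

```python
# Hedged sketch: multi-head attention over fMRI ROI time series, mean
# temporal pooling ("consolidation"), and a binary ASD classifier.
import torch
import torch.nn as nn

class MHATCSketch(nn.Module):
    def __init__(self, n_rois=200, d_model=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(n_rois, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classify = nn.Linear(d_model, 2)  # ASD vs. control

    def forward(self, ts):                    # ts: (batch, time, n_rois)
        h = self.embed(ts)                    # per-timepoint embedding
        h, _ = self.attn(h, h, h)             # self-attention across time
        return self.classify(h.mean(dim=1))   # temporal consolidation (mean)

logits = MHATCSketch()(torch.randn(2, 120, 200))  # 120 fMRI timepoints
```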
- Studying the Effects of Self-Attention for Medical Image Analysis [42.12044020360494]
We compare various state-of-the-art self-attention mechanisms across multiple medical image analysis tasks.
We aim to provide a deeper understanding of the effects of self-attention in medical computer vision tasks.
arXiv Detail & Related papers (2021-09-02T07:07:16Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance the discriminability of the deep embedding to encourage clustering of features belonging to the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
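To make the episodic few-shot setup above concrete, here is a common prototypical baseline for one segmentation episode (masked average pooling plus cosine matching); note that the paper's actual method uses a global correlation network, which this sketch does not implement.

```python
# Hedged sketch of one few-shot segmentation episode using masked average
# pooling to build a foreground prototype, a standard baseline.
import torch
import torch.nn.functional as F

def episode_predict(support_feat, support_mask, query_feat):
    """support_feat/query_feat: (C, H, W); support_mask: (H, W) in {0, 1}."""
    # Foreground prototype: average of support features under the mask.
    proto = (support_feat * support_mask).sum(dim=(1, 2)) / \
            support_mask.sum().clamp(min=1)
    # Cosine similarity between the prototype and every query location.
    sim = F.cosine_similarity(
        query_feat, proto[:, None, None].expand_as(query_feat), dim=0)
    return sim > 0.5  # crude foreground prediction for the query image

pred = episode_predict(torch.randn(64, 32, 32),
                       (torch.rand(32, 32) > 0.7).float(),
                       torch.randn(64, 32, 32))
```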