MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network
Architecture for Medical Image Analysis
- URL: http://arxiv.org/abs/2110.06063v1
- Date: Tue, 12 Oct 2021 15:05:15 GMT
- Authors: Hossein Aboutalebi, Maya Pavlova, Hayden Gunraj, Mohammad Javad
Shafiee, Ali Sabri, Amer Alaref, Alexander Wong
- Abstract summary: We introduce MEDUSA, a multi-scale encoder-decoder self-attention mechanism tailored for medical image analysis.
We obtain state-of-the-art performance on challenging medical image analysis benchmarks including COVIDx, RSNA RICORD, and RSNA Pneumonia Challenge.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical image analysis continues to hold interesting challenges given the
subtle characteristics of certain diseases and the significant overlap in
appearance between diseases. In this work, we explore the concept of
self-attention for tackling such subtleties in and between diseases. To this
end, we introduce MEDUSA, a multi-scale encoder-decoder self-attention
mechanism tailored for medical image analysis. While self-attention deep
convolutional neural network architectures in the existing literature center
around multiple isolated, lightweight attention mechanisms with limited
individual capacity incorporated at different points in the network, MEDUSA
departs significantly from this notion: it possesses a single, unified
self-attention mechanism of considerably higher capacity, with multiple
attention heads feeding into different scales of the network architecture. To
the best of the authors' knowledge, this is the first "single body,
multi-scale heads" realization of self-attention; it enables explicit global
context to be shared across selective attention at different levels of
representational abstraction while still allowing distinct local attention
context at each level. With MEDUSA, we obtain
state-of-the-art performance on multiple challenging medical image analysis
benchmarks including COVIDx, RSNA RICORD, and RSNA Pneumonia Challenge when
compared to previous work. Our MEDUSA model is publicly available.
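The "single body, multi-scale heads" design described in the abstract can be
sketched in a minimal, hypothetical form: one shared attention body computes a
global attention map once, and lightweight per-scale heads reuse that map to
mix scale-specific value features. All dimensions, weight shapes, and the
NumPy formulation below are illustrative assumptions, not the authors'
implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def shared_attention_body(tokens, Wq, Wk):
    """The single shared 'body': global tokens (n, d) -> one (n, n)
    self-attention map used by every scale head."""
    q, k = tokens @ Wq, tokens @ Wk
    return softmax(q @ k.T / np.sqrt(tokens.shape[1]))

def scale_head(attn, values, Wv):
    """A per-scale 'head': reuses the shared attention map to mix
    scale-specific value features (n, c_in) -> (n, c_out)."""
    return attn @ (values @ Wv)

rng = np.random.default_rng(0)
n, d = 16, 8                              # global context tokens (assumed sizes)
tokens = rng.standard_normal((n, d))
Wq = rng.standard_normal((d, d)) * 0.1
Wk = rng.standard_normal((d, d)) * 0.1
attn = shared_attention_body(tokens, Wq, Wk)

# Two illustrative scale heads sharing one attention map; channel widths
# differ per scale. (In the real architecture each scale would also have its
# own spatial resolution; a fixed token count is used here for brevity.)
outs = {}
for c_in, c_out in [(8, 8), (8, 16)]:
    Wv = rng.standard_normal((c_in, c_out)) * 0.1
    values = rng.standard_normal((n, c_in))
    outs[c_out] = scale_head(attn, values, Wv)

assert attn.shape == (n, n)
assert np.allclose(attn.sum(axis=1), 1.0)  # each row is a distribution
```

Computing the attention map once and sharing it across heads is what gives
the single body its higher capacity relative to many isolated small attention
modules.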
Related papers
- MindFormer: A Transformer Architecture for Multi-Subject Brain Decoding via fMRI [50.55024115943266]
We introduce a new Transformer architecture called MindFormer to generate fMRI-conditioned feature vectors.
MindFormer incorporates two key innovations: 1) a novel training strategy based on the IP-Adapter to extract semantically meaningful features from fMRI signals, and 2) a subject specific token and linear layer that effectively capture individual differences in fMRI signals.
arXiv Detail & Related papers (2024-05-28T00:36:25Z)
- Multi-modal Graph Neural Network for Early Diagnosis of Alzheimer's Disease from sMRI and PET Scans [11.420077093805382]
We propose to use graph neural networks (GNN) that are designed to deal with problems in non-Euclidean domains.
In this study, we demonstrate how brain networks can be created from sMRI or PET images.
We then present a multi-modal GNN framework where each modality has its own branch of GNN and a technique is proposed to combine the multi-modal data.
arXiv Detail & Related papers (2023-07-31T02:04:05Z)
- K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment [71.27193056354741]
The problem of how to assess cross-modality medical image synthesis has been largely unexplored.
We propose a new metric K-CROSS to spur progress on this challenging problem.
K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location.
arXiv Detail & Related papers (2023-07-10T01:26:48Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Self-supervised multimodal neuroimaging yields predictive representations for a spectrum of Alzheimer's phenotypes [27.331511924585023]
This work presents a novel multi-scale coordinated framework for learning multiple representations from multimodal neuroimaging data.
We propose a general taxonomy of informative inductive biases to capture unique and joint information in multimodal self-supervised fusion.
We show that self-supervised models reveal disorder-relevant brain regions and multimodal links without access to the labels during pre-training.
arXiv Detail & Related papers (2022-09-07T01:37:19Z)
- Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution [56.622832383316215]
We propose a novel multi-head convolutional attention module to super-resolve CT and MRI scans.
Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple input tensors.
We introduce multiple attention heads, each head having a distinct receptive field size corresponding to a particular reduction rate for the spatial attention.
arXiv Detail & Related papers (2022-04-08T07:56:55Z)
- MHATC: Autism Spectrum Disorder identification utilizing multi-head attention encoder along with temporal consolidation modules [11.344829880346353]
Resting-state fMRI is commonly used for diagnosing Autism Spectrum Disorder (ASD) by using network-based functional connectivity.
We propose a novel deep learning architecture (MHATC) consisting of multi-head attention and temporal consolidation modules for classifying an individual as a patient of ASD.
arXiv Detail & Related papers (2021-12-27T07:50:16Z)
- Studying the Effects of Self-Attention for Medical Image Analysis [42.12044020360494]
We compare various state-of-the-art self-attention mechanisms across multiple medical image analysis tasks.
We aim to provide a deeper understanding of the effects of self-attention in medical computer vision tasks.
arXiv Detail & Related papers (2021-09-02T07:07:16Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.