Mutual Attention-based Hybrid Dimensional Network for Multimodal Imaging Computer-aided Diagnosis
- URL: http://arxiv.org/abs/2201.09421v1
- Date: Mon, 24 Jan 2022 02:31:25 GMT
- Title: Mutual Attention-based Hybrid Dimensional Network for Multimodal Imaging Computer-aided Diagnosis
- Authors: Yin Dai, Yifan Gao, Fayu Liu and Jun Fu
- Abstract summary: We propose a novel mutual attention-based hybrid dimensional network for MultiModal 3D medical image classification (MMNet).
The hybrid dimensional network integrates 2D CNN with 3D convolution modules to generate deeper and more informative feature maps.
We further design a mutual attention framework in the network to build the region-wise consistency in similar stereoscopic regions of different image modalities.
- Score: 4.657804635843888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on multimodal 3D computer-aided diagnosis has demonstrated that obtaining a competitive automatic diagnosis model remains nontrivial and challenging when a 3D convolutional neural network (CNN) brings more parameters while medical images are scarce. Considering both the consistency of regions of interest across multimodal images and diagnostic accuracy, we propose a novel mutual attention-based hybrid dimensional network for MultiModal 3D medical image classification (MMNet). The hybrid dimensional network integrates a 2D CNN with 3D convolution modules to generate deeper and more informative feature maps while reducing the training complexity of 3D fusion. Moreover, the 2D CNN can be initialized with ImageNet-pretrained weights, which improves the performance of the model. Stereoscopic attention focuses on building rich contextual interdependencies over regions in 3D medical images. To improve the regional correlation of pathological tissues across multimodal medical images, we further design a mutual attention framework in the network that builds region-wise consistency between similar stereoscopic regions of different image modalities, implicitly instructing the network to focus on pathological tissues. MMNet outperforms many previous solutions and achieves results competitive with the state of the art on three multimodal imaging datasets, i.e., the Parotid Gland Tumor (PGT), MRNet, and PROSTATEx datasets; its advantages are validated by extensive experiments.
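As a rough illustration of the two ideas the abstract names (a per-slice 2D backbone whose features are fused by 3D convolution modules, and a mutual attention coupling two modality branches), here is a minimal PyTorch sketch. All module names, shapes, and the exact attention form are assumptions, not the authors' implementation.

```python
# Minimal sketch of a hybrid 2D/3D feature extractor plus a mutual
# attention between two modality branches. Shapes and module names
# are illustrative assumptions, not the paper's implementation.
import torch
import torch.nn as nn

class Hybrid2D3DBlock(nn.Module):
    """Run a 2D CNN over each slice, then fuse slices with a 3D conv."""
    def __init__(self, in_ch, feat_ch):
        super().__init__()
        self.backbone2d = nn.Sequential(              # stand-in for an
            nn.Conv2d(in_ch, feat_ch, 3, padding=1),  # ImageNet-pretrained
            nn.ReLU(inplace=True),                    # 2D backbone
        )
        self.fuse3d = nn.Conv3d(feat_ch, feat_ch, 3, padding=1)

    def forward(self, x):                  # x: (B, C, D, H, W)
        b, c, d, h, w = x.shape
        slices = x.permute(0, 2, 1, 3, 4).reshape(b * d, c, h, w)
        f = self.backbone2d(slices)        # 2D features per slice
        f = f.reshape(b, d, -1, h, w).permute(0, 2, 1, 3, 4)
        return self.fuse3d(f)              # 3D fusion across slices

class MutualAttention(nn.Module):
    """Re-weight each modality by an attention map computed jointly
    with the other, encouraging region-wise consistency."""
    def __init__(self, ch):
        super().__init__()
        self.q_a = nn.Conv3d(ch, ch, 1)
        self.k_b = nn.Conv3d(ch, ch, 1)

    def forward(self, feat_a, feat_b):
        # Voxel-wise affinity between the two modalities.
        attn = torch.sigmoid(self.q_a(feat_a) * self.k_b(feat_b))
        return feat_a * attn + feat_a, feat_b * attn + feat_b

# Toy usage: two MRI modalities of shape (batch, 1, depth, H, W).
enc = Hybrid2D3DBlock(in_ch=1, feat_ch=8)
att = MutualAttention(ch=8)
t1, t2 = torch.randn(2, 1, 4, 32, 32), torch.randn(2, 1, 4, 32, 32)
f1, f2 = att(enc(t1), enc(t2))
print(f1.shape)  # torch.Size([2, 8, 4, 32, 32])
```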
Related papers
- QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge [93.61262892578067]
Uncertainty in medical image segmentation tasks, especially inter-rater variability, presents a significant challenge.
This variability directly impacts the development and evaluation of automated segmentation algorithms.
We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ).
arXiv Detail & Related papers (2024-03-19T17:57:24Z)
- SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging [59.78761085714715]
This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework for liver lesion classification.
The proposed framework has been validated through comprehensive experiments on two clinical datasets.
To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public.
arXiv Detail & Related papers (2024-02-27T06:32:56Z)
- Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework [16.864720020158906]
We propose a versatile multi-task neural network framework, based on an enhanced Transformer U-Net architecture.
We decompose the traditional problem of synthesizing CT images into distinct subtasks.
To enhance the framework's versatility in handling multi-modal data, we expand the model with multiple image channels.
arXiv Detail & Related papers (2023-12-13T18:22:38Z)
- Three-Dimensional Medical Image Fusion with Deformable Cross-Attention [10.26573411162757]
Multimodal medical image fusion plays an instrumental role in several areas of medical image processing.
Traditional fusion methods tend to process each modality independently before combining the features and reconstructing the fusion image.
In this study, we introduce an innovative unsupervised feature mutual learning fusion network designed to rectify these limitations.
arXiv Detail & Related papers (2023-10-10T04:10:56Z)
- LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z)
- Decomposing 3D Neuroimaging into 2+1D Processing for Schizophrenia Recognition [25.80846093248797]
We propose to process the 3D data with a 2+1D framework so that we can exploit powerful deep 2D convolutional neural networks (CNNs) pre-trained on the huge ImageNet dataset for 3D neuroimaging recognition.
Specifically, 3D volumes of Magnetic Resonance Imaging (MRI) metrics are decomposed into 2D slices according to neighboring voxel positions.
Global pooling is applied to remove redundant information as the activation patterns are sparsely distributed over feature maps.
Channel-wise and slice-wise convolutions are proposed to aggregate the contextual information in the third dimension unprocessed by the 2D CNN model.
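A minimal sketch of this 2+1D recipe, assuming a toy stand-in for the pretrained 2D backbone and illustrative 1D convolutions for the slice-wise and channel-wise aggregation (not the paper's exact modules):

```python
# Hypothetical sketch of 2+1D processing: each 2D slice goes through
# a (normally ImageNet-pretrained) 2D CNN, then 1D convolutions
# aggregate context along the slice axis the 2D model never saw.
import torch
import torch.nn as nn

class TwoPlusOneD(nn.Module):
    def __init__(self, feat_ch=16, n_classes=2):
        super().__init__()
        self.cnn2d = nn.Sequential(                # stand-in for a
            nn.Conv2d(1, feat_ch, 3, padding=1),   # pretrained 2D backbone
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),   # global pooling drops sparse,
        )                              # redundant spatial activations
        # Aggregate the third (slice) dimension left unprocessed by 2D.
        self.slice_conv = nn.Conv1d(feat_ch, feat_ch, 3, padding=1)
        self.chan_mix = nn.Conv1d(feat_ch, feat_ch, 1)  # channel-wise
        self.head = nn.Linear(feat_ch, n_classes)

    def forward(self, vol):                       # vol: (B, D, H, W)
        b, d, h, w = vol.shape
        f = self.cnn2d(vol.reshape(b * d, 1, h, w))   # (B*D, C, 1, 1)
        f = f.reshape(b, d, -1).transpose(1, 2)       # (B, C, D)
        f = torch.relu(self.slice_conv(f))            # slice-wise context
        f = torch.relu(self.chan_mix(f))              # channel mixing
        return self.head(f.mean(dim=2))               # pool over slices

model = TwoPlusOneD()
logits = model(torch.randn(2, 16, 64, 64))        # 16 MRI slices
print(logits.shape)                               # torch.Size([2, 2])
```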
arXiv Detail & Related papers (2022-11-21T15:22:59Z)
- Dual Multi-scale Mean Teacher Network for Semi-supervised Infection Segmentation in Chest CT Volume for COVID-19 [76.51091445670596]
Automated detection of lung infections from computed tomography (CT) data plays an important role in combating COVID-19.
Most current COVID-19 infection segmentation methods rely mainly on 2D CT images, which lack a 3D sequential constraint.
Existing 3D CT segmentation methods focus on single-scale representations, which do not achieve multiple levels of receptive field size on the 3D volume.
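The mean-teacher component named in the title is a standard semi-supervised pattern. A generic sketch of the EMA teacher update and consistency loss, with a placeholder network rather than the paper's dual multi-scale architecture:

```python
# Generic mean-teacher pattern (not the paper's dual multi-scale
# variant): the teacher is an exponential moving average (EMA) of the
# student, and unlabeled volumes contribute a consistency loss
# between the two predictions.
import copy
import torch
import torch.nn.functional as F

def ema_update(teacher, student, decay=0.99):
    with torch.no_grad():
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(decay).add_(s, alpha=1 - decay)

student = torch.nn.Conv3d(1, 2, 3, padding=1)   # placeholder segmenter
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)                     # teacher is not trained

unlabeled = torch.randn(1, 1, 8, 32, 32)        # unlabeled CT volume
noisy = unlabeled + 0.1 * torch.randn_like(unlabeled)
consistency = F.mse_loss(student(noisy).softmax(1),
                         teacher(unlabeled).softmax(1))
consistency.backward()                          # gradients flow to student
ema_update(teacher, student)                    # teacher tracks student
print(float(consistency))
```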
arXiv Detail & Related papers (2022-11-10T13:11:21Z)
- Slice-level Detection of Intracranial Hemorrhage on CT Using Deep Descriptors of Adjacent Slices [0.31317409221921133]
We propose a new strategy to train slice-level classifiers on CT scans based on the descriptors of the adjacent slices along the axis.
We obtain a single model in the top 4% best-performing solutions of the RSNA Intracranial Hemorrhage dataset challenge.
The proposed method is general and can be applied to other 3D medical diagnosis tasks such as MRI imaging.
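A minimal sketch of the adjacent-slice idea, classifying a target slice from descriptors of itself and its axial neighbors; the descriptor extractor and all names here are placeholder assumptions, not the paper's design:

```python
# Rough sketch: the label for slice i is predicted from descriptors
# of slices i-1, i, i+1 along the axis. The descriptor extractor is
# a placeholder, not the paper's.
import torch
import torch.nn as nn

class AdjacentSliceClassifier(nn.Module):
    def __init__(self, desc_dim=32, n_classes=2):
        super().__init__()
        self.descriptor = nn.Sequential(        # per-slice descriptor
            nn.Conv2d(1, desc_dim, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(3 * desc_dim, n_classes)

    def forward(self, scan, i):                 # scan: (D, H, W)
        # Clamp neighbor indices at the scan boundaries.
        idx = [max(i - 1, 0), i, min(i + 1, scan.shape[0] - 1)]
        descs = [self.descriptor(scan[j][None, None]) for j in idx]
        return self.head(torch.cat(descs, dim=1))

clf = AdjacentSliceClassifier()
scan = torch.randn(24, 64, 64)                  # 24-slice CT scan
print(clf(scan, i=10).shape)                    # torch.Size([1, 2])
```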
arXiv Detail & Related papers (2022-08-05T23:20:37Z)
- Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation [13.158995287578316]
We propose a dynamic architecture network named Med-DANet to achieve an effective accuracy-efficiency trade-off.
For each slice of the input 3D MRI volume, our proposed method learns a slice-specific decision by the Decision Network.
Our proposed method achieves comparable or better results than previous state-of-the-art methods for 3D MRI brain tumor segmentation.
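The per-slice decision could look roughly like the following toy sketch, where a small decision network routes each slice to a cheap or an expensive path; both paths and the routing are placeholder assumptions, not Med-DANet's actual modules:

```python
# Toy sketch of per-slice dynamic architecture selection: a small
# decision network routes each 2D slice of the MRI volume to either
# a cheap or an expensive segmentation path. All modules here are
# placeholders, not Med-DANet's.
import torch
import torch.nn as nn

class SliceRouter(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.decision = nn.Sequential(          # slice-specific decision
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(64, 2),
        )
        self.cheap = nn.Conv2d(1, n_classes, 1)           # light path
        self.expensive = nn.Sequential(                   # heavy path
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, n_classes, 3, padding=1),
        )

    def forward(self, slice2d):                 # slice2d: (1, 1, H, W)
        # Hard argmax for illustration; a trained router would use a
        # differentiable or RL-style policy instead.
        choice = self.decision(slice2d).argmax(dim=1).item()
        return (self.cheap if choice == 0 else self.expensive)(slice2d)

router = SliceRouter()
volume = torch.randn(8, 1, 1, 32, 32)           # 8 MRI slices
masks = [router(s) for s in volume]             # one decision per slice
print(masks[0].shape)                           # torch.Size([1, 4, 32, 32])
```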
arXiv Detail & Related papers (2022-06-14T03:25:58Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- M2Net: Multi-modal Multi-channel Network for Overall Survival Time Prediction of Brain Tumor Patients [151.4352001822956]
Early and accurate prediction of overall survival (OS) time can help to obtain better treatment planning for brain tumor patients.
Existing prediction methods rely on radiomic features at the local lesion area of a magnetic resonance (MR) volume.
We propose an end-to-end OS time prediction model, namely the Multi-modal Multi-channel Network (M2Net).
arXiv Detail & Related papers (2020-06-01T05:21:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.