Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
- URL: http://arxiv.org/abs/2402.06190v1
- Date: Fri, 9 Feb 2024 05:06:58 GMT
- Title: Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
- Authors: Amin Karimi Monsefi, Payam Karisani, Mengxi Zhou, Stacey Choi, Nathan
Doble, Heng Ji, Srinivasan Parthasarathy, Rajiv Ramnath
- Abstract summary: We introduce a new neural network architecture, termed LoGoNet, with a tailored self-supervised learning (SSL) method.
LoGoNet integrates a novel feature extractor within a U-shaped architecture, leveraging Large Kernel Attention (LKA) and a dual encoding strategy.
We propose a novel SSL method tailored for 3D images to compensate for the lack of large labeled datasets.
- Score: 48.440691680864745
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard modern machine-learning-based imaging methods have faced challenges
in medical applications due to the high cost of dataset construction and,
thereby, the limited labeled training data available. Additionally, upon
deployment, these methods are usually used to process a large volume of data on
a daily basis, imposing a high maintenance cost on medical facilities. In this
paper, we introduce a new neural network architecture, termed LoGoNet, with a
tailored self-supervised learning (SSL) method to mitigate such challenges.
LoGoNet integrates a novel feature extractor within a U-shaped architecture,
leveraging Large Kernel Attention (LKA) and a dual encoding strategy to capture
both long-range and short-range feature dependencies adeptly. This is in
contrast to existing methods that rely on increasing network capacity to
enhance feature extraction. This combination of novel techniques in our model
is especially beneficial in medical image segmentation, given the difficulty of
learning intricate and often irregular body organ shapes, such as the spleen.
Complementary, we propose a novel SSL method tailored for 3D images to
compensate for the lack of large labeled datasets. The method combines masking
and contrastive learning techniques within a multi-task learning framework and
is compatible with both Vision Transformer (ViT) and CNN-based models. We
demonstrate the efficacy of our methods in numerous tasks across two standard
datasets (i.e., BTCV and MSD). Benchmark comparisons with eight
state-of-the-art models highlight LoGoNet's superior performance in both
inference time and accuracy.
Related papers
- TBConvL-Net: A Hybrid Deep Learning Architecture for Robust Medical Image Segmentation [6.013821375459473]
We introduce a novel deep learning architecture for medical image segmentation.
Our proposed model shows consistent improvement over the state of the art on ten publicly available datasets.
arXiv Detail & Related papers (2024-09-05T09:14:03Z) - LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation [2.0901574458380403]
We propose a new lightweight but efficient model, namely LiteNeXt, for medical image segmentation.
LiteNeXt is trained from scratch with small amount of parameters (0.71M) and Giga Floating Point Operations Per Second (0.42).
arXiv Detail & Related papers (2024-04-04T01:59:19Z) - PMFSNet: Polarized Multi-scale Feature Self-attention Network For
Lightweight Medical Image Segmentation [6.134314911212846]
Current state-of-the-art medical image segmentation methods prioritize accuracy but often at the expense of increased computational demands and larger model sizes.
We propose PMFSNet, a novel medical imaging segmentation model that balances global local feature processing while avoiding computational redundancy.
It incorporates a plug-and-play PMFS block, a multi-scale feature enhancement module based on attention mechanisms, to capture long-term dependencies.
arXiv Detail & Related papers (2024-01-15T10:26:47Z) - Connecting the Dots: Graph Neural Network Powered Ensemble and
Classification of Medical Images [0.0]
Deep learning for medical imaging is limited due to the requirement for large amounts of training data.
We employ the Image Foresting Transform to optimally segment images into superpixels.
These superpixels are subsequently transformed into graph-structured data, enabling the proficient extraction of features and modeling of relationships.
arXiv Detail & Related papers (2023-11-13T13:20:54Z) - LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical
Imaging via Second-order Graph Matching [59.01894976615714]
We introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets.
We have collected approximately 1.3 million medical images from 55 publicly available datasets.
LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models.
arXiv Detail & Related papers (2023-06-20T22:21:34Z) - AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the celluar graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z) - Two-Stream Graph Convolutional Network for Intra-oral Scanner Image
Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z) - TSGCNet: Discriminative Geometric Feature Learning with Two-Stream
GraphConvolutional Network for 3D Dental Model Segmentation [141.2690520327948]
We propose a two-stream graph convolutional network (TSGCNet) to learn multi-view information from different geometric attributes.
We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners.
arXiv Detail & Related papers (2020-12-26T08:02:56Z) - Learning Deformable Image Registration from Optimization: Perspective,
Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z) - Attention-Guided Version of 2D UNet for Automatic Brain Tumor
Segmentation [2.371982686172067]
Gliomas are the most common and aggressive among brain tumors, which cause a short life expectancy in their highest grade.
Deep convolutional neural networks (DCNNs) have achieved a remarkable performance in brain tumor segmentation.
However, this task is still difficult owing to high varying intensity and appearance of gliomas.
arXiv Detail & Related papers (2020-04-04T20:09:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.