MedViT: A Robust Vision Transformer for Generalized Medical Image
  Classification
- URL: http://arxiv.org/abs/2302.09462v1
- Date: Sun, 19 Feb 2023 02:55:45 GMT
- Title: MedViT: A Robust Vision Transformer for Generalized Medical Image
  Classification
- Authors: Omid Nejati Manzari, Hamid Ahmadabadi, Hossein Kashiani, Shahriar B.
  Shokouhi, Ahmad Ayatollahi
- Abstract summary: We propose a robust yet efficient CNN-Transformer hybrid model which is equipped with the locality of CNNs and the global connectivity of vision Transformers.
Our proposed hybrid model demonstrates its high robustness and generalization ability compared to the state-of-the-art studies on a large-scale collection of standardized MedMNIST-2D datasets.
- Score: 4.471084427623774
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have advanced existing medical systems
for automatic disease diagnosis. However, there are still concerns about the
reliability of deep medical diagnosis systems against the potential threats of
adversarial attacks since inaccurate diagnosis could lead to disastrous
consequences in the safety realm. In this study, we propose a highly robust yet
efficient CNN-Transformer hybrid model which is equipped with the locality of
CNNs as well as the global connectivity of vision Transformers. To mitigate the
high quadratic complexity of the self-attention mechanism while jointly
attending to information in various representation subspaces, we construct our
attention mechanism by means of an efficient convolution operation. Moreover,
to alleviate the fragility of our Transformer model against adversarial
attacks, we attempt to learn smoother decision boundaries. To this end, we
augment the shape information of an image in the high-level feature space by
permuting the feature mean and variance within mini-batches. With less
computational complexity, our proposed hybrid model demonstrates its high
robustness and generalization ability compared to the state-of-the-art studies
on a large-scale collection of standardized MedMNIST-2D datasets.
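
The abstract's augmentation step, permuting the feature mean and variance within mini-batches, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the statistics are computed per sample and per channel over the spatial dimensions of an (N, C, H, W) feature map, and the function name `permute_feature_stats` and the `eps` parameter are hypothetical.

```python
import numpy as np

def permute_feature_stats(feats, rng, eps=1e-5):
    """Swap per-channel mean/std between samples of one mini-batch.

    feats: array of shape (N, C, H, W).
    Assumption: statistics are taken over the spatial axes (H, W);
    the abstract does not specify this detail.
    """
    n = feats.shape[0]
    mean = feats.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(feats.var(axis=(2, 3), keepdims=True) + eps)
    # Remove each sample's own statistics, keeping its spatial structure.
    normalized = (feats - mean) / std
    # Re-apply the statistics of a randomly chosen sample from the same batch.
    perm = rng.permutation(n)
    return normalized * std[perm] + mean[perm], perm

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8, 16, 16))  # stand-in for high-level features
augmented, perm = permute_feature_stats(batch, rng)
```

Each augmented sample keeps its own normalized spatial pattern but carries the channel statistics of another sample in the batch, which is one plausible reading of how the shape information is emphasized.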
 
      
Related papers
- Vision Transformer for Intracranial Hemorrhage Classification in CT Scans Using an Entropy-Aware Fuzzy Integral Strategy for Adaptive Scan-Level Decision Fusion [5.486205584465161]
 Intracranial hemorrhage (ICH) is a critical medical emergency caused by the rupture of cerebral blood vessels, leading to internal bleeding within the skull.
We propose an advanced pyramid vision transformer (PVT)-based model, leveraging its hierarchical attention mechanisms to capture both local and global spatial dependencies in brain CT scans.
 arXiv  Detail & Related papers  (2025-03-11T16:47:32Z)
- GS-TransUNet: Integrated 2D Gaussian Splatting and Transformer UNet for Accurate Skin Lesion Analysis [44.99833362998488]
 We present a novel approach that combines 2D Gaussian splatting with the Transformer UNet architecture for automated skin cancer diagnosis.
Our findings illustrate significant advancements in the precision of segmentation and classification.
This integration sets new benchmarks in the field and highlights the potential for further research into multi-task medical image analysis methodologies.
 arXiv  Detail & Related papers  (2025-02-23T23:28:47Z)
- Multi-Scale Transformer Architecture for Accurate Medical Image Classification [4.578375402082224]
 This study introduces an AI-driven skin lesion classification algorithm built on an enhanced Transformer architecture.
By integrating a multi-scale feature fusion mechanism and refining the self-attention process, the model effectively extracts both global and local features.
Performance evaluation on the ISIC 2017 dataset demonstrates that the improved Transformer surpasses established AI models.
 arXiv  Detail & Related papers  (2025-02-10T08:22:25Z)
- TransUNext: towards a more advanced U-shaped framework for automatic vessel segmentation in the fundus image [19.16680702780529]
 We propose a more advanced U-shaped architecture for a hybrid Transformer and CNN: TransUNext.
The Global Multi-Scale Fusion (GMSF) module is further introduced to upgrade skip-connections, fuse high-level semantic and low-level detailed information, and eliminate high- and low-level semantic differences.
 arXiv  Detail & Related papers  (2024-11-05T01:44:22Z)
- A Unified Model for Compressed Sensing MRI Across Undersampling Patterns [69.19631302047569]
 We propose a unified MRI reconstruction model robust to various measurement undersampling patterns and image resolutions.
Our model improves SSIM by 11% and PSNR by 4 dB over a state-of-the-art CNN (End-to-End VarNet) with 600$\times$ faster inference than diffusion methods.
 arXiv  Detail & Related papers  (2024-10-05T20:03:57Z)
- TBConvL-Net: A Hybrid Deep Learning Architecture for Robust Medical Image Segmentation [6.013821375459473]
 We introduce a novel deep learning architecture for medical image segmentation.
Our proposed model shows consistent improvement over the state of the art on ten publicly available datasets.
 arXiv  Detail & Related papers  (2024-09-05T09:14:03Z)
- Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI [58.809276442508256]
 We propose a hybrid network via the combination of convolution neural network (CNN) and transformer layers.
The experimental results on private and public DCE-MRI datasets demonstrate that the proposed hybrid network achieves superior performance compared with the state-of-the-art methods.
 arXiv  Detail & Related papers  (2024-08-11T15:46:00Z)
- CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery [0.0682074616451595]
 CAF-YOLO is a nimble yet robust method for medical object detection that leverages the strengths of convolutional neural networks (CNNs) and transformers.
ACFM module enhances the modeling of both global and local features, enabling the capture of long-term feature dependencies.
MSNN improves multi-scale information aggregation by extracting features across diverse scales.
 arXiv  Detail & Related papers  (2024-08-04T01:44:44Z)
- L-SFAN: Lightweight Spatially-focused Attention Network for Pain Behavior Detection [44.016805074560295]
 Chronic Low Back Pain (CLBP) afflicts millions globally, significantly impacting individuals' well-being and imposing economic burdens on healthcare systems.
While artificial intelligence (AI) and deep learning offer promising avenues for analyzing pain-related behaviors to improve rehabilitation strategies, current models, including convolutional neural networks (CNNs), have limitations.
We introduce EmoL-SFAN, a lightweight CNN architecture incorporating 2D filters designed to capture the spatial-temporal interplay of data from motion capture and surface electromyography sensors.
 arXiv  Detail & Related papers  (2024-06-07T12:01:37Z)
- Harnessing The Power of Attention For Patch-Based Biomedical Image Classification [0.0]
 We present a novel architecture based on self-attention mechanisms as an alternative to conventional CNNs.
We introduce the Lancoz5 technique, which adapts variable image sizes to higher resolutions.
Our methods address critical challenges faced by attention-based vision models, including inductive bias, weight sharing, receptive field limitations, and efficient data handling.
 arXiv  Detail & Related papers  (2024-04-01T06:22:28Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
 We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
 Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
 arXiv  Detail & Related papers  (2023-10-22T02:27:02Z)
- Breast Ultrasound Tumor Classification Using a Hybrid Multitask
  CNN-Transformer Network [63.845552349914186]
 Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
 Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
 arXiv  Detail & Related papers  (2023-08-04T01:19:32Z)
- Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
 Brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis.
The hierarchical transformers in the generator are designed to estimate the noise at multiple scales.
 Evaluations of the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
 arXiv  Detail & Related papers  (2023-05-18T06:54:56Z)
- AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context
  Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
 We present a novel concept of shared-context processing for whole slide histopathology images.
AMIGO uses the cellular graph within the tissue to provide a single representation for a patient.
We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
 arXiv  Detail & Related papers  (2023-03-01T23:37:45Z)
- Self-Supervised Masked Convolutional Transformer Block for Anomaly
  Detection [122.4894940892536]
 We present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level.
In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
 arXiv  Detail & Related papers  (2022-09-25T04:56:10Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
 We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
 GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
 arXiv  Detail & Related papers  (2022-07-20T07:32:02Z)
- MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of
  Pancreatic Cancer [5.604939010661757]
 Pancreatic cancer is one of the most malignant cancers in the world, which deteriorates rapidly with very high mortality.
We propose a hybrid high-performance deep learning model to enable the automated workflow.
A dataset of 4240 ROSE images is collected to evaluate the method in this unexplored field.
 arXiv  Detail & Related papers  (2021-12-27T05:04:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     