MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer
- URL: http://arxiv.org/abs/2402.16298v1
- Date: Mon, 26 Feb 2024 04:41:04 GMT
- Title: MV-Swin-T: Mammogram Classification with Multi-view Swin Transformer
- Authors: Sushmita Sarker, Prithul Sarker, George Bebis, and Alireza Tavakkoli
- Abstract summary: We propose an innovative multi-view network based on transformers to address challenges in mammographic image classification.
Our approach introduces a novel shifted window-based dynamic attention block, facilitating the effective integration of multi-view information.
- Score: 0.257133335028485
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional deep learning approaches for breast cancer classification have
predominantly concentrated on single-view analysis. In clinical practice,
however, radiologists concurrently examine all views within a mammography exam,
leveraging the inherent correlations in these views to effectively detect
tumors. Acknowledging the significance of multi-view analysis, some studies
have introduced methods that independently process mammogram views, either
through distinct convolutional branches or simple fusion strategies,
inadvertently leading to a loss of crucial inter-view correlations. In this
paper, we propose an innovative multi-view network exclusively based on
transformers to address challenges in mammographic image classification. Our
approach introduces a novel shifted window-based dynamic attention block,
facilitating the effective integration of multi-view information and promoting
the coherent transfer of this information between views at the spatial feature
map level. Furthermore, we conduct a comprehensive comparative analysis of the
performance and effectiveness of transformer-based models under diverse
settings, employing the CBIS-DDSM and VinDr-Mammo datasets. Our code is
publicly available at https://github.com/prithuls/MV-Swin-T
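The abstract describes the mechanism only in prose; as a rough illustration, a window-level attention block can draw its keys and values from both mammographic views, so each window of one view also attends to the other view's tokens. The PyTorch sketch below is a minimal, assumption-laden illustration of that idea (the class name, shapes, and head count are invented here), not the authors' MV-Swin-T implementation:

```python
# Minimal sketch of cross-view window attention between two mammographic views
# (e.g., CC and MLO) whose feature maps have been partitioned into windows of
# N tokens each. Illustrative only; not the MV-Swin-T code.
import torch
import torch.nn as nn

class MultiViewWindowAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        self.proj = nn.Linear(dim, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x_main: torch.Tensor, x_other: torch.Tensor) -> torch.Tensor:
        # x_main, x_other: (num_windows*B, N, dim) window tokens from the two views.
        q = self.q(x_main)
        # Keys/values come from BOTH views, so each window attends across views.
        k, v = self.kv(torch.cat([x_main, x_other], dim=1)).chunk(2, dim=-1)
        out, _ = self.attn(q, k, v, need_weights=False)
        return self.proj(out)

# Usage: 8 windows of 49 tokens (7x7 windows), embedding dim 96.
cc = torch.randn(8, 49, 96)   # craniocaudal-view window tokens
mlo = torch.randn(8, 49, 96)  # mediolateral-oblique-view window tokens
fused = MultiViewWindowAttention(96)(cc, mlo)
print(fused.shape)  # torch.Size([8, 49, 96])
```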
Related papers
- MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report [4.340464264725625]
We introduce a novel Multi-Modal Contrastive Pre-training Framework that synergistically combines X-rays, electrocardiograms (ECGs) and radiology/cardiology reports.
We utilize LoRA-PEFT to significantly reduce the trainable parameters in the LLM and incorporate a recent linear-attention dropping strategy in the Vision Transformer (ViT) for smoother attention.
To the best of our knowledge, we are the first to propose an integrated model that combines X-ray, ECG, and Radiology/Cardiology Report with this approach.
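The pairing of modalities in a shared embedding space is typically driven by a contrastive objective. The sketch below shows a generic symmetric InfoNCE loss over paired embeddings (e.g., an X-ray embedding matched to its report embedding); the shapes and temperature are assumptions, not MoRE's actual settings:

```python
# Symmetric InfoNCE over a batch of paired modality embeddings: matching pairs
# sit on the diagonal of the similarity matrix and are pulled together.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # (B, B) cross-modal similarities
    targets = torch.arange(len(img))       # positives lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(16, 256), torch.randn(16, 256))
```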
arXiv Detail & Related papers (2024-10-21T17:42:41Z)
- Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions.
We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training.
Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
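The core ingredient of learning spatial transformations to warp training images can be illustrated in a few lines: a small network predicts 2x3 affine parameters that are applied with `affine_grid`/`grid_sample`. This is a hypothetical sketch in the spirit of the AAT module, not the AC-Former code:

```python
# A tiny network predicts an affine warp per image and applies it.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedAffineWarp(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(3 * 8 * 8, 6),  # the 6 affine parameters
        )
        # Bias-initialize to the identity transform so training starts stably.
        self.net[-1].weight.data.zero_()
        self.net[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.net(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

warped = LearnedAffineWarp()(torch.randn(2, 3, 64, 64))
```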
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
- C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network [67.97926983664676]
We propose cross-modal consistent multi-view medical report generation with a domain transfer network (C^2M-DoT).
C2M-DoT substantially outperforms state-of-the-art baselines in all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
- SwinMM: Masked Multi-view with Swin Transformers for 3D Medical Image Segmentation [32.092182889440814]
We present Masked Multi-view with Swin Transformers (SwinMM), a novel multi-view pipeline for medical image analysis.
In the pre-training phase, we deploy a masked multi-view encoder devised to train concurrently on masked multi-view observations.
A new task capitalizes on the consistency between predictions from various perspectives, enabling the extraction of hidden multi-view information.
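A common way to exploit such cross-view consistency is to penalize disagreement between predictions made from different observations of the same volume. The following is a generic sketch of a symmetric KL consistency term; SwinMM's actual loss may differ:

```python
# Symmetric KL divergence between two views' class predictions.
import torch
import torch.nn.functional as F

def consistency_loss(logits_a, logits_b):
    pa, pb = F.log_softmax(logits_a, dim=1), F.log_softmax(logits_b, dim=1)
    kl_ab = F.kl_div(pa, pb.exp(), reduction="batchmean")
    kl_ba = F.kl_div(pb, pa.exp(), reduction="batchmean")
    return 0.5 * (kl_ab + kl_ba)

loss = consistency_loss(torch.randn(4, 5), torch.randn(4, 5))
```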
arXiv Detail & Related papers (2023-07-24T08:06:46Z)
- The Whole Pathological Slide Classification via Weakly Supervised Learning [7.313528558452559]
We introduce two pathological priors: the nuclear heterogeneity of diseased cells and the spatial correlation of pathological tiles.
We propose a data augmentation method that utilizes stain separation during extractor training.
We then describe the spatial relationships between the tiles using an adjacency matrix.
By integrating these two views, we designed a multi-instance framework for analyzing H&E-stained tissue images.
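The adjacency-matrix view of a slide is straightforward to construct once each tile carries its grid coordinates. Below is an illustrative sketch that treats tiles as 8-connected neighbors; the coordinate format and neighborhood rule are assumptions for the example:

```python
# Build a tile adjacency matrix from (row, col) grid positions.
import torch

def tile_adjacency(coords: torch.Tensor) -> torch.Tensor:
    # coords: (num_tiles, 2) integer grid positions of each tile.
    diff = (coords[:, None, :] - coords[None, :, :]).abs()
    adj = (diff.amax(dim=-1) <= 1).float()  # 8-neighborhood adjacency
    adj.fill_diagonal_(0.0)                 # no self-loops
    return adj

coords = torch.tensor([[0, 0], [0, 1], [2, 2]])
print(tile_adjacency(coords))  # tiles 0 and 1 are adjacent; tile 2 is isolated
```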
arXiv Detail & Related papers (2023-07-12T16:14:23Z)
- Beyond CNNs: Exploiting Further Inherent Symmetries in Medical Image Segmentation [21.6412682130116]
We propose a novel group equivariant segmentation framework by encoding those inherent symmetries for learning more precise representations.
Based on our novel framework, extensive experiments conducted on real-world clinical data demonstrate that a Group Equivariant Res-UNet (named GER-UNet) outperforms its regular CNN-based counterpart.
The newly built GER-UNet also shows potential in reducing the sample complexity and the redundancy of filters.
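The filter-redundancy argument is easiest to see in a lifting convolution: one kernel applied at four 90-degree rotations yields four orientation channels, so rotating the input permutes those channels rather than requiring four separately learned filters. A toy sketch of such a p4 lifting layer (not the GER-UNet code):

```python
# One shared kernel applied at four rotations -> (B, C_out, 4, H, W) features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class P4LiftingConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

    def forward(self, x):
        outs = [F.conv2d(x, torch.rot90(self.weight, r, dims=(2, 3)), padding=1)
                for r in range(4)]
        return torch.stack(outs, dim=2)  # orientation axis at dim 2

feat = P4LiftingConv(1, 8)(torch.randn(2, 1, 32, 32))
print(feat.shape)  # torch.Size([2, 8, 4, 32, 32])
```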
arXiv Detail & Related papers (2022-07-29T04:28:20Z)
- Superficial White Matter Analysis: An Efficient Point-cloud-based Deep Learning Framework with Supervised Contrastive Learning for Consistent Tractography Parcellation across Populations and dMRI Acquisitions [68.41088365582831]
White matter parcellation classifies tractography streamlines into clusters or anatomically meaningful tracts.
Most parcellation methods focus on the deep white matter (DWM), whereas fewer methods address the superficial white matter (SWM) due to its complexity.
We propose a novel two-stage deep-learning-based framework, Superficial White Matter Analysis (SupWMA), that performs an efficient parcellation of 198 SWM clusters from whole-brain tractography.
arXiv Detail & Related papers (2022-07-18T23:07:53Z)
- Multi-View Hypercomplex Learning for Breast Cancer Screening [7.147856898682969]
Traditionally, deep learning methods for breast cancer classification perform a single-view analysis.
In clinical practice, however, radiologists simultaneously analyze all four views that compose a mammography exam.
We propose a methodological approach for multi-view breast cancer classification based on parameterized hypercomplex neural networks.
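The parameter saving behind hypercomplex layers comes from building the weight matrix as a sum of Kronecker products, which divides the parameter count roughly by n. A minimal sketch of a parameterized hypercomplex multiplication (PHM) linear layer, with dimensions chosen purely for illustration:

```python
# PHM linear layer: weight = sum_i kron(A_i, B_i).
import torch
import torch.nn as nn

class PHMLinear(nn.Module):
    def __init__(self, n: int, in_features: int, out_features: int):
        super().__init__()
        assert in_features % n == 0 and out_features % n == 0
        self.A = nn.Parameter(torch.randn(n, n, n) * 0.1)  # algebra "rules"
        self.B = nn.Parameter(torch.randn(n, out_features // n,
                                          in_features // n) * 0.1)

    def forward(self, x):
        # Assemble the (out_features, in_features) weight from Kronecker factors.
        W = torch.stack([torch.kron(self.A[i], self.B[i])
                         for i in range(self.A.size(0))]).sum(0)
        return x @ W.t()

y = PHMLinear(4, 64, 32)(torch.randn(8, 64))
print(y.shape)  # torch.Size([8, 32])
```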
arXiv Detail & Related papers (2022-04-12T13:32:31Z)
- Incremental Cross-view Mutual Distillation for Self-supervised Medical CT Synthesis [88.39466012709205]
This paper presents a novel medical slice synthesis method to increase the between-slice resolution.
Considering that the ground-truth intermediate medical slices are always absent in clinical practice, we introduce the incremental cross-view mutual distillation strategy.
Our method outperforms state-of-the-art algorithms by clear margins.
arXiv Detail & Related papers (2021-12-20T03:38:37Z)
- Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation.
We construct our few-shot image segmentor using a deep convolutional network trained episodically.
We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
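A common backbone for this family of methods is prototype-based matching: a class prototype is pooled from the support features under the support mask and compared to every query feature. The sketch below is a generic illustration of that pattern, not this paper's specific network:

```python
# Prototype from masked average pooling, scored against the query by cosine similarity.
import torch
import torch.nn.functional as F

def prototype_segment(sup_feat, sup_mask, qry_feat):
    # sup_feat, qry_feat: (C, H, W); sup_mask: (H, W) binary foreground mask.
    proto = (sup_feat * sup_mask).sum(dim=(1, 2)) / sup_mask.sum().clamp(min=1)
    sim = F.cosine_similarity(qry_feat, proto[:, None, None], dim=0)
    return sim  # (H, W) foreground score map for the query image

score = prototype_segment(torch.randn(64, 16, 16),
                          (torch.rand(16, 16) > 0.5).float(),
                          torch.randn(64, 16, 16))
```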
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
- Weakly supervised multiple instance learning histopathological tumor segmentation [51.085268272912415]
We propose a weakly supervised framework for whole slide imaging segmentation.
We exploit a multiple instance learning scheme for training models.
The proposed framework has been evaluated on multi-locations and multi-centric public data from The Cancer Genome Atlas and the PatchCamelyon dataset.
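The multiple-instance assumption underlying such weak supervision is that a slide (bag) is positive if at least one patch (instance) is positive, so the bag score can be taken as the maximum patch probability. A minimal sketch of that objective, illustrative rather than this paper's exact formulation:

```python
# Max-pooling MIL: supervise the top patch logit with the slide-level label.
import torch
import torch.nn.functional as F

def mil_max_loss(patch_logits: torch.Tensor, slide_label: torch.Tensor):
    # patch_logits: (num_patches,) per-patch tumor logits for one slide.
    bag_logit = patch_logits.max()
    return F.binary_cross_entropy_with_logits(bag_logit, slide_label)

loss = mil_max_loss(torch.randn(100), torch.tensor(1.0))
```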
arXiv Detail & Related papers (2020-04-10T13:12:47Z)