ZACH-ViT: A Zero-Token Vision Transformer with ShuffleStrides Data Augmentation for Robust Lung Ultrasound Classification
- URL: http://arxiv.org/abs/2510.17650v1
- Date: Mon, 20 Oct 2025 15:26:38 GMT
- Title: ZACH-ViT: A Zero-Token Vision Transformer with ShuffleStrides Data Augmentation for Robust Lung Ultrasound Classification
- Authors: Athanasios Angelakis, Amne Mousa, Micah L. A. Heldeweg, Laurens A. Biesheuvel, Mark A. Haaksma, Jasper M. Smit, Pieter R. Tuinman, Paul W. G. Elbers,
- Abstract summary: We introduce ZACH-ViT (Zero-token Adaptive Compact Hierarchical Vision Transformer), a 0.25 M-parameter Vision Transformer variant that removes both positional embeddings and the [CLS] token. ZACH-ViT was evaluated on 380 LUS videos from 95 critically ill patients against nine state-of-the-art baselines. It achieved the highest validation and test ROC-AUC (0.80 and 0.79) with balanced sensitivity (0.60) and specificity (0.91), while all competing models collapsed to trivial classification.
- Score: 0.7495002546468839
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Differentiating cardiogenic pulmonary oedema (CPE) from non-cardiogenic and structurally normal lungs in lung ultrasound (LUS) videos remains challenging due to the high visual variability of non-cardiogenic inflammatory patterns (NCIP/ARDS-like), interstitial lung disease, and healthy lungs. This heterogeneity complicates automated classification as overlapping B-lines and pleural artefacts are common. We introduce ZACH-ViT (Zero-token Adaptive Compact Hierarchical Vision Transformer), a 0.25 M-parameter Vision Transformer variant that removes both positional embeddings and the [CLS] token, making it fully permutation-invariant and suitable for unordered medical image data. To enhance generalization, we propose ShuffleStrides Data Augmentation (SSDA), which permutes probe-view sequences and frame orders while preserving anatomical validity. ZACH-ViT was evaluated on 380 LUS videos from 95 critically ill patients against nine state-of-the-art baselines. Despite the heterogeneity of the non-cardiogenic group, ZACH-ViT achieved the highest validation and test ROC-AUC (0.80 and 0.79) with balanced sensitivity (0.60) and specificity (0.91), while all competing models collapsed to trivial classification. It trains 1.35x faster than Minimal ViT (0.62M parameters) with 2.5x fewer parameters, supporting real-time clinical deployment. These results show that aligning architectural design with data structure can outperform scale in small-data medical imaging.
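The permutation-invariance claim (removing positional embeddings and the [CLS] token makes the model insensitive to input order) can be illustrated with a minimal sketch. This is a toy illustration under assumed names, not the authors' implementation: if the classification head pools per-token outputs with an order-free reduction such as the mean, any reordering of the tokens yields the same feature vector.

```python
def mean_pool(tokens):
    """Order-free pooling over a list of token vectors (lists of floats).

    Without positional embeddings or a [CLS] token, the per-token outputs
    form an unordered set; mean-pooling them makes the pooled feature, and
    hence the classification, invariant to token order. Toy sketch only,
    not the ZACH-ViT code.
    """
    n, dim = len(tokens), len(tokens[0])
    return [sum(tok[d] for tok in tokens) / n for d in range(dim)]
```

Shuffling the token list before pooling leaves the pooled feature unchanged, which is the property that makes unordered medical image data a natural fit.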
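The ShuffleStrides idea, as described in the abstract, can likewise be sketched as permuting the probe-view sequence and the frame order within each view. The function below is an illustrative guess at that behaviour (the name, signature, and data layout are assumptions); since only order is changed and no pixels are modified, anatomical validity is preserved.

```python
import random

def shufflestrides_augment(views, seed=None):
    """Sketch of ShuffleStrides-style augmentation, inferred from the
    abstract: permute the order of probe views, then permute the frame
    order within each view. Frames themselves are never altered.

    views: list of probe views, each a list of frames (arbitrary objects).
    Returns a new list; the input is left unmodified.
    """
    rng = random.Random(seed)
    shuffled_views = rng.sample(views, len(views))           # reorder probe views
    return [rng.sample(v, len(v)) for v in shuffled_views]   # reorder frames per view
```

Because the augmentation is a pure reordering, the multiset of frames is exactly preserved, which is what makes it safe for a permutation-invariant model.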
Related papers
- Transparent Early ICU Mortality Prediction with Clinical Transformer and Per-Case Modality Attribution [42.85462513661566]
We present a lightweight, transparent multimodal ensemble that fuses physiological time-series measurements with unstructured clinical notes from the first 48 hours of an ICU stay. A logistic regression model combines predictions from two modality-specific models: a bidirectional LSTM for vitals and a fine-tuned ClinicalModernBERT transformer for notes. On the MIMIC-III benchmark, our late-fusion ensemble improves discrimination over the best single model while maintaining well-calibrated predictions.
arXiv Detail & Related papers (2025-11-19T20:11:49Z)
- Cancer-Net PCa-MultiSeg: Multimodal Enhancement of Prostate Cancer Lesion Segmentation Using Synthetic Correlated Diffusion Imaging [55.62977326180104]
Current deep learning approaches for prostate cancer lesion segmentation achieve limited performance. We investigate synthetic correlated diffusion imaging (CDI$^s$) as an enhancement to standard diffusion-based protocols. Our results establish validated integration pathways for CDI$^s$ as a practical drop-in enhancement for PCa lesion segmentation tasks.
arXiv Detail & Related papers (2025-11-11T04:16:12Z) - A Novel Attention-Augmented Wavelet YOLO System for Real-time Brain Vessel Segmentation on Transcranial Color-coded Doppler [49.03919553747297]
We propose an AI-powered, real-time CoW auto-segmentation system capable of efficiently capturing cerebral arteries. No prior studies have explored AI-driven cerebrovascular segmentation using Transcranial Color-coded Doppler (TCCD). The proposed AAW-YOLO demonstrated strong performance in segmenting both ipsilateral and contralateral CoW vessels.
arXiv Detail & Related papers (2025-08-19T14:41:22Z) - Pretrained hybrid transformer for generalizable cardiac substructures segmentation from contrast and non-contrast CTs in lung and breast cancers [3.704003490598663]
AI-automated segmentations for radiation treatment planning (RTP) can deteriorate when applied to clinical cases with different characteristics than the training dataset. We refined a pretrained transformer into a hybrid transformer convolutional network (HTN) to segment cardiac substructures in lung and breast cancer patients. The HTN demonstrated robustly accurate cardiac substructures segmentation (geometric and dose metrics) from CTs with varying imaging and patient characteristics.
arXiv Detail & Related papers (2025-05-16T04:48:33Z) - PCMC-T1: Free-breathing myocardial T1 mapping with
Physically-Constrained Motion Correction [15.251935193140982]
We introduce PCMC-T1, a physically-constrained deep-learning model for motion correction in free-breathing T1 mapping.
We incorporate the signal decay model into the network architecture to encourage physically-plausible deformations along the longitudinal relaxation axis.
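For context, the longitudinal relaxation signal model typically used in such T1-mapping pipelines is the three-parameter inversion-recovery model (an assumption here; the paper's exact parameterisation may differ), with the apparent relaxation time $T_1^*$ corrected to $T_1$:

$$ S(t_i) = A - B\,e^{-t_i/T_1^*}, \qquad T_1 = T_1^*\left(\frac{B}{A} - 1\right) $$

Constraining deformations along this decay curve is what makes the motion correction "physically plausible".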
arXiv Detail & Related papers (2023-08-22T08:50:38Z)
- Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, tailored to echocardiogram videos.
When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS).
EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
arXiv Detail & Related papers (2022-07-23T19:17:26Z)
- CAE-Transformer: Transformer-based Model to Predict Invasiveness of Lung Adenocarcinoma Subsolid Nodules from Non-thin Section 3D CT Scans [36.093580055848186]
Lung Adenocarcinoma (LAUC) has recently been the most prevalent subtype of lung cancer.
Timely and accurate knowledge of the invasiveness of lung nodules leads to a proper treatment plan and reduces the risk of unnecessary or late surgeries.
The primary imaging modality to assess and predict the invasiveness of LAUCs is the chest CT.
In this paper, a predictive transformer-based framework, referred to as the "CAE-Transformer", is developed to classify LAUCs.
arXiv Detail & Related papers (2021-10-17T04:37:24Z)
- Multi-Slice Net: A novel lightweight framework for COVID-19 Diagnosis [38.32234937094937]
This paper presents a novel lightweight COVID-19 diagnosis framework using CT scans.
We use a powerful backbone network as a feature extractor to capture discriminative slice-level features.
These features are aggregated by a lightweight network to obtain a patient level diagnosis.
arXiv Detail & Related papers (2021-08-09T02:46:11Z)
- Lung Ultrasound Segmentation and Adaptation between COVID-19 and Community-Acquired Pneumonia [0.17159130619349347]
We focus on the hyperechoic B-line segmentation task using deep neural networks.
We utilize both COVID-19 and CAP lung ultrasound data to train the networks.
Segmenting either type of lung condition at inference may support a range of clinical applications.
arXiv Detail & Related papers (2021-08-06T14:17:51Z)
- Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: training and assessment on multiple datasets using different annotation criteria [83.83783947027392]
This study aims at exploiting Artificial intelligence (AI) for the identification, segmentation and quantification of COVID-19 pulmonary lesions.
We developed an automated analysis pipeline, the LungQuant system, based on a cascade of two U-nets.
The accuracy of the LungQuant system in predicting the CT Severity Score (CT-SS) has also been evaluated.
arXiv Detail & Related papers (2021-05-06T10:21:28Z)
- Rapid quantification of COVID-19 pneumonia burden from computed tomography with convolutional LSTM networks [1.0072268949897432]
We propose a new fully automated deep learning framework for rapid quantification and differentiation between lung lesions in COVID-19 pneumonia.
The performance of the method was evaluated on CT data sets from 197 patients with positive reverse transcription polymerase chain reaction test result for SARS-CoV-2.
arXiv Detail & Related papers (2021-03-31T22:09:14Z)
- Segmentation of Pulmonary Opacification in Chest CT Scans of COVID-19 Patients [3.140265238474236]
We provide open source models for the segmentation of patterns of pulmonary opacification on chest Computed Tomography (CT) scans.
We have collected 663 chest CT scans of COVID-19 patients from healthcare centers around the world.
Our best model achieves an opacity Intersection-Over-Union score of 0.76 on our test set, demonstrates successful domain adaptation, and predicts the volume of opacification within 1.7% of expert radiologists.
arXiv Detail & Related papers (2020-07-07T17:32:24Z)
- Co-Heterogeneous and Adaptive Segmentation from Multi-Source and Multi-Phase CT Imaging Data: A Study on Pathological Liver and Lesion Segmentation [48.504790189796836]
We present a novel segmentation strategy, co-heterogeneous and adaptive segmentation (CHASe).
We propose a versatile framework that fuses appearance based semi-supervision, mask based adversarial domain adaptation, and pseudo-labeling.
CHASe can further improve pathological liver mask Dice-Sørensen coefficients by $4.2\% \sim 9.4\%$.
arXiv Detail & Related papers (2020-05-27T06:58:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.