Self-learned representation-guided latent diffusion model for breast cancer classification in deep ultraviolet whole surface images
- URL: http://arxiv.org/abs/2601.10917v1
- Date: Fri, 16 Jan 2026 00:22:22 GMT
- Title: Self-learned representation-guided latent diffusion model for breast cancer classification in deep ultraviolet whole surface images
- Authors: Pouya Afshin, David Helminiak, Tianling Niu, Julie M. Jorns, Tina Yen, Bing Yu, Dong Hye Ye
- Abstract summary: We propose a Self-Supervised Learning (SSL)-guided Latent Diffusion Model (LDM) to generate high-quality synthetic training patches. By guiding the LDM with embeddings from a fine-tuned DINO teacher, we inject rich semantic details of cellular structures into the synthetic data. Experiments using 5-fold cross-validation demonstrate that our method achieves 96.47% accuracy and reduces the FID score to 45.72, significantly outperforming class-conditioned baselines.
- Score: 4.203807616568477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Breast-Conserving Surgery (BCS) requires precise intraoperative margin assessment to preserve healthy tissue. Deep Ultraviolet Fluorescence Scanning Microscopy (DUV-FSM) offers rapid, high-resolution surface imaging for this purpose; however, the scarcity of annotated DUV data hinders the training of robust deep learning models. To address this, we propose a Self-Supervised Learning (SSL)-guided Latent Diffusion Model (LDM) to generate high-quality synthetic training patches. By guiding the LDM with embeddings from a fine-tuned DINO teacher, we inject rich semantic details of cellular structures into the synthetic data. We combine real and synthetic patches to fine-tune a Vision Transformer (ViT), utilizing patch prediction aggregation for WSI-level classification. Experiments using 5-fold cross-validation demonstrate that our method achieves 96.47% accuracy and reduces the FID score to 45.72, significantly outperforming class-conditioned baselines.
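The abstract mentions aggregating patch predictions into a WSI-level classification but does not spell out the rule. The following is a minimal sketch of one plausible scheme, assuming each patch yields a malignancy probability and the WSI is labeled malignant when the fraction of malignant patches reaches a threshold. The function name, both thresholds, and the fraction-based rule are illustrative assumptions, not the authors' method.

```python
from typing import Sequence, Tuple


def aggregate_patch_predictions(
    patch_probs: Sequence[float],
    patch_threshold: float = 0.5,  # assumed cutoff for calling a patch malignant
    wsi_threshold: float = 0.5,    # assumed fraction of malignant patches for a malignant WSI
) -> Tuple[int, float]:
    """Return (wsi_label, malignant_fraction) for one whole surface image.

    wsi_label is 1 (malignant) or 0 (benign). This is a hypothetical
    fraction-thresholding rule, not the aggregation used in the paper.
    """
    if not patch_probs:
        raise ValueError("at least one patch prediction is required")

    # Count patches whose predicted malignancy probability meets the cutoff.
    malignant = sum(1 for p in patch_probs if p >= patch_threshold)
    fraction = malignant / len(patch_probs)

    # Label the whole image by comparing the malignant fraction to the WSI cutoff.
    label = 1 if fraction >= wsi_threshold else 0
    return label, fraction
```

For example, aggregate_patch_predictions([0.9, 0.8, 0.2, 0.7]) counts three of four patches as malignant and returns (1, 0.75). In practice the two thresholds would need tuning on validation folds, and a weighted vote is an equally reasonable alternative.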
Related papers
- DerMAE: Improving skin lesion classification through conditioned latent diffusion and MAE distillation [1.485045763113618]
We use class-conditioned diffusion models to generate synthetic dermatological images, followed by self-supervised MAE pretraining to enable huge ViT models to learn robust, domain-relevant features. We apply knowledge distillation to transfer these representations to a smaller ViT student suitable for mobile devices. Our results show that MAE pretraining on synthetic data, combined with distillation, improves classification performance while enabling efficient on-device inference for practical clinical use.
arXiv Detail & Related papers (2026-02-23T13:52:28Z)
- Subtyping Breast Lesions via Generative Augmentation based Long-tailed Recognition in Ultrasound [8.410718166932798]
We propose a framework for long-tailed classification that mitigates distributional bias through high-fidelity data synthesis. Our method achieves promising performance compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-07-30T10:50:41Z)
- Attention-Enhanced Deep Learning Ensemble for Breast Density Classification in Mammography [0.0]
This study proposes an automated deep learning system for robust binary classification of breast density. We implemented and compared four advanced convolutional neural networks. We developed a novel Combined Focal Label Smoothing Loss function that integrates focal loss, label smoothing, and class-balanced weighting.
arXiv Detail & Related papers (2025-07-08T21:26:33Z)
- Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography [39.58317527488534]
This study compares multimodal and CNN-based methods for automated classification using the BI-RADS system. Zero-shot classification achieved modest performance, while the fine-tuned ConvNeXt model outperformed the BioMedCLIP linear probe. These findings suggest that despite the promise of multimodal learning, CNN-based models with end-to-end fine-tuning provide stronger performance for specialized medical imaging.
arXiv Detail & Related papers (2025-06-16T20:14:37Z)
- Prompt-Guided Latent Diffusion with Predictive Class Conditioning for 3D Prostate MRI Generation [1.6508709227918446]
Latent diffusion models (LDM) could alleviate data scarcity challenges affecting machine learning development for medical imaging. We propose a novel LDM conditioning approach to address these limitations. Our method achieves a 3D FID score of 0.025 on a size-limited 3D prostate MRI dataset.
arXiv Detail & Related papers (2025-06-11T23:12:48Z)
- Breast Cancer Classification in Deep Ultraviolet Fluorescence Images Using a Patch-Level Vision Transformer Framework [6.0791593833288085]
A deep ultraviolet fluorescence scanning microscope (DUV-FSM) enables rapid acquisition of whole surface images (WSIs) for excised tissue. This study introduces a DUV WSI classification framework using a patch-level vision transformer (ViT) model, capturing local and global features. A comprehensive 5-fold cross-validation demonstrates the proposed approach significantly outperforms conventional deep learning methods, achieving a classification accuracy of 98.33%.
arXiv Detail & Related papers (2025-05-12T15:22:54Z)
- Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections [35.585690280385826]
We adapt the Masked Autoencoder for federated learning, enhancing Sharpness-Aware Minimization (FedSAM) and Weight Averaging. Our findings demonstrate that integrating FedSAM into the federated MAE approach improves pretraining, leading to a reduction in reconstruction loss per patch. These findings highlight the potential of federated learning for privacy-preserving training of surgical foundation models.
arXiv Detail & Related papers (2025-04-23T10:54:32Z)
- Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging [41.446379453352534]
Latent Diffusion Autoencoder (LDAE) is a novel encoder-decoder diffusion-based framework for efficient and meaningful unsupervised learning in medical imaging. This study focuses on Alzheimer disease (AD) using brain MR from the ADNI database as a case study.
arXiv Detail & Related papers (2025-04-11T15:37:46Z)
- Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge [36.22704733553466]
Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data. Recent works have explored generative models, particularly Stable Diffusion (SD), to address these challenges. We introduce Diffusion-FSCIL, which extracts four synergistic feature types from SD by capturing real image characteristics.
arXiv Detail & Related papers (2025-03-30T11:20:08Z)
- Deep learning for automated detection of breast cancer in deep ultraviolet fluorescence images with diffusion probabilistic model [6.658963545934998]
The diffusion probabilistic model (DPM) has shown potential to generate high-quality images.
In this paper, we apply DPM to augment the deep ultraviolet fluorescence (DUV) image dataset with the aim of improving breast cancer classification.
arXiv Detail & Related papers (2024-07-01T05:00:26Z)
- Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading [72.45699658852304]
This paper proposes a novel approach to train a generative Diffusion Autoencoder model as an unsupervised feature extractor.
We model fracture grading as a continuous regression, which is more reflective of the smooth progression of fractures.
Importantly, the generative nature of our method allows us to visualize different grades of a given vertebra, providing interpretability and insight into the features that contribute to automated grading.
arXiv Detail & Related papers (2023-03-21T17:16:01Z)
- Successive Subspace Learning for Cardiac Disease Classification with Two-phase Deformation Fields from Cine MRI [36.044984400761535]
This work proposes a lightweight successive subspace learning framework for CVD classification.
It is based on an interpretable feedforward design, in conjunction with a cardiac atlas.
Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140$\times$ fewer parameters.
arXiv Detail & Related papers (2023-01-21T15:00:59Z)
- Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System [69.40329819373954]
The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world.
At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19.
We propose a boundary guided semantic learning network (BSNet) in this paper.
arXiv Detail & Related papers (2022-09-07T05:01:38Z)
- Deep Implicit Statistical Shape Models for 3D Medical Image Delineation [47.78425002879612]
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis.
Prior to deep learning, statistical shape models that imposed anatomical constraints and produced high quality surfaces were a core technology.
We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of CNNs with the robustness of SSMs.
arXiv Detail & Related papers (2021-04-07T01:15:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.