Related papers: Self-learned representation-guided latent diffusion model for breast cancer classification in deep ultraviolet whole surface images

URL: http://arxiv.org/abs/2601.10917v1
Date: Fri, 16 Jan 2026 00:22:22 GMT
Title: Self-learned representation-guided latent diffusion model for breast cancer classification in deep ultraviolet whole surface images
Authors: Pouya Afshin, David Helminiak, Tianling Niu, Julie M. Jorns, Tina Yen, Bing Yu, Dong Hye Ye
Abstract summary: We propose a Self-Supervised Learning (SSL)-guided Latent Diffusion Model (LDM) to generate high-quality synthetic training patches. By guiding the LDM with embeddings from a fine-tuned DINO teacher, we inject rich semantic details of cellular structures into the synthetic data. Experiments using 5-fold cross-validation demonstrate that our method achieves 96.47% accuracy and reduces the FID score to 45.72, significantly outperforming class-conditioned baselines.
Score: 4.203807616568477
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Breast-Conserving Surgery (BCS) requires precise intraoperative margin assessment to preserve healthy tissue. Deep Ultraviolet Fluorescence Scanning Microscopy (DUV-FSM) offers rapid, high-resolution surface imaging for this purpose; however, the scarcity of annotated DUV data hinders the training of robust deep learning models. To address this, we propose a Self-Supervised Learning (SSL)-guided Latent Diffusion Model (LDM) to generate high-quality synthetic training patches. By guiding the LDM with embeddings from a fine-tuned DINO teacher, we inject rich semantic details of cellular structures into the synthetic data. We combine real and synthetic patches to fine-tune a Vision Transformer (ViT), utilizing patch prediction aggregation for WSI-level classification. Experiments using 5-fold cross-validation demonstrate that our method achieves 96.47% accuracy and reduces the FID score to 45.72, significantly outperforming class-conditioned baselines.
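
Illustrative sketch (not from the paper): the abstract mentions "patch prediction aggregation for WSI-level classification" without stating the aggregation rule. The snippet below assumes a simple vote over thresholded patch scores; the function name, both thresholds, and the voting rule itself are hypothetical choices made for illustration only.

```python
from typing import Sequence


def aggregate_patch_predictions(
    patch_probs: Sequence[float],
    patch_threshold: float = 0.5,      # assumed cutoff for calling a patch malignant
    malignant_fraction: float = 0.5,   # assumed share of malignant patches needed
) -> int:
    """Return a WSI-level label (1 = malignant, 0 = benign) from patch scores."""
    if not patch_probs:
        raise ValueError("at least one patch prediction is required")
    # Binarize each patch score, then measure the share of malignant votes.
    votes = [1 if p >= patch_threshold else 0 for p in patch_probs]
    fraction = sum(votes) / len(votes)
    # The WSI is labeled malignant only if the share strictly exceeds the cutoff.
    return 1 if fraction > malignant_fraction else 0


if __name__ == "__main__":
    # Two of three patches score above the patch threshold.
    print(aggregate_patch_predictions([0.9, 0.8, 0.2]))
```

In a margin-assessment setting a lower `malignant_fraction` may be preferable, since missing a malignant region is usually costlier than a false alarm; the value to use would need to be validated on real data.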

Related papers

DerMAE: Improving skin lesion classification through conditioned latent diffusion and MAE distillation [1.485045763113618]
We use class-conditioned diffusion models to generate synthetic dermatological images, followed by self-supervised MAE pretraining to enable huge ViT models to learn robust, domain-relevant features. We apply knowledge distillation to transfer these representations to a smaller ViT student suitable for mobile devices. Our results show that MAE pretraining on synthetic data, combined with distillation, improves classification performance while enabling efficient on-device inference for practical clinical use.
arXiv Detail & Related papers (2026-02-23T13:52:28Z)
Subtyping Breast Lesions via Generative Augmentation based Long-tailed Recognition in Ultrasound [8.410718166932798]
We propose a framework for long-tailed classification that mitigates distributional bias through high-fidelity data synthesis. Our method achieves promising performance compared to state-of-the-art approaches.
arXiv Detail & Related papers (2025-07-30T10:50:41Z)
Attention-Enhanced Deep Learning Ensemble for Breast Density Classification in Mammography [0.0]
This study proposes an automated deep learning system for robust binary classification of breast density. We implemented and compared four advanced convolutional neural networks. We developed a novel Combined Focal Label Smoothing Loss function that integrates focal loss, label smoothing, and class-balanced weighting.
arXiv Detail & Related papers (2025-07-08T21:26:33Z)
Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography [39.58317527488534]
This study compares multimodal and CNN-based methods for automated classification using the BI-RADS system. Zero-shot classification achieved modest performance, while the fine-tuned ConvNeXt model outperformed the BioMedCLIP linear probe. These findings suggest that despite the promise of multimodal learning, CNN-based models with end-to-end fine-tuning provide stronger performance for specialized medical imaging.
arXiv Detail & Related papers (2025-06-16T20:14:37Z)
Prompt-Guided Latent Diffusion with Predictive Class Conditioning for 3D Prostate MRI Generation [1.6508709227918446]
Latent diffusion models (LDM) could alleviate data scarcity challenges affecting machine learning development for medical imaging. We propose a novel LDM conditioning approach to address these limitations. Our method achieves a 3D FID score of 0.025 on a size-limited 3D prostate MRI dataset.
arXiv Detail & Related papers (2025-06-11T23:12:48Z)
Breast Cancer Classification in Deep Ultraviolet Fluorescence Images Using a Patch-Level Vision Transformer Framework [6.0791593833288085]
A deep ultraviolet fluorescence scanning microscope (DUV-FSM) enables rapid acquisition of whole surface images (WSIs) for excised tissue. This study introduces a DUV WSI classification framework using a patch-level vision transformer (ViT) model, capturing local and global features. A comprehensive 5-fold cross-validation demonstrates the proposed approach significantly outperforms conventional deep learning methods, achieving a classification accuracy of 98.33%.
arXiv Detail & Related papers (2025-05-12T15:22:54Z)
Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections [35.585690280385826]
We adapt the Masked Autoencoder for federated learning, enhancing Sharpness-Aware Minimization (FedSAM) and Weight Averaging. Our findings demonstrate that integrating FedSAM into the federated MAE approach improves pretraining, leading to a reduction in reconstruction loss per patch. These findings highlight the potential of federated learning for privacy-preserving training of surgical foundation models.
arXiv Detail & Related papers (2025-04-23T10:54:32Z)
Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging [41.446379453352534]
Latent Diffusion Autoencoder (LDAE) is a novel encoder-decoder diffusion-based framework for efficient and meaningful unsupervised learning in medical imaging. This study focuses on Alzheimer disease (AD) using brain MR from the ADNI database as a case study.
arXiv Detail & Related papers (2025-04-11T15:37:46Z)
Beyond Synthetic Replays: Turning Diffusion Features into Few-Shot Class-Incremental Learning Knowledge [36.22704733553466]
Few-shot class-incremental learning (FSCIL) is challenging due to extremely limited training data. Recent works have explored generative models, particularly Stable Diffusion (SD), to address these challenges. We introduce Diffusion-FSCIL, which extracts four synergistic feature types from SD by capturing real image characteristics.
arXiv Detail & Related papers (2025-03-30T11:20:08Z)
Deep learning for automated detection of breast cancer in deep ultraviolet fluorescence images with diffusion probabilistic model [6.658963545934998]
The diffusion probabilistic model (DPM) has shown potential to generate high-quality images. In this paper, we apply DPM to augment the deep ultraviolet fluorescence (DUV) image dataset with the aim of improving breast cancer classification.
arXiv Detail & Related papers (2024-07-01T05:00:26Z)
Semantic Latent Space Regression of Diffusion Autoencoders for Vertebral Fracture Grading [72.45699658852304]
This paper proposes a novel approach to train a generative Diffusion Autoencoder model as an unsupervised feature extractor. We model fracture grading as a continuous regression, which is more reflective of the smooth progression of fractures. Importantly, the generative nature of our method allows us to visualize different grades of a given vertebra, providing interpretability and insight into the features that contribute to automated grading.
arXiv Detail & Related papers (2023-03-21T17:16:01Z)
Successive Subspace Learning for Cardiac Disease Classification with Two-phase Deformation Fields from Cine MRI [36.044984400761535]
This work proposes a lightweight successive subspace learning framework for CVD classification. It is based on an interpretable feedforward design, in conjunction with a cardiac atlas. Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140$\times$ fewer parameters.
arXiv Detail & Related papers (2023-01-21T15:00:59Z)
Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System [69.40329819373954]
The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world. At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19. We propose a boundary guided semantic learning network (BSNet) in this paper.
arXiv Detail & Related papers (2022-09-07T05:01:38Z)
Deep Implicit Statistical Shape Models for 3D Medical Image Delineation [47.78425002879612]
3D delineation of anatomical structures is a cardinal goal in medical imaging analysis. Prior to deep learning, statistical shape models that imposed anatomical constraints and produced high quality surfaces were a core technology. We present deep implicit statistical shape models (DISSMs), a new approach to delineation that marries the representation power of CNNs with the robustness of SSMs.
arXiv Detail & Related papers (2021-04-07T01:15:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.