Leveraging Pre-trained Models for FF-to-FFPE Histopathological Image Translation
- URL: http://arxiv.org/abs/2406.18054v1
- Date: Wed, 26 Jun 2024 04:12:34 GMT
- Title: Leveraging Pre-trained Models for FF-to-FFPE Histopathological Image Translation
- Authors: Qilai Zhang, Jiawen Li, Peiran Liao, Jiali Hu, Tian Guan, Anjia Han, Yonghong He
- Abstract summary: Diffusion-FFPE is a method to translate Fresh Frozen (FF) histopathological images into Formalin-Fixed Paraffin-Embedded (FFPE) style.
We employ a one-step diffusion model as the generator and fine-tune it with LoRA adapters using adversarial learning objectives.
We conducted FF-to-FFPE translation experiments on the TCGA-NSCLC datasets, and our method achieved better performance than existing methods.
- Score: 6.108290302640328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The two primary types of Hematoxylin and Eosin (H&E) slides in histopathology are Formalin-Fixed Paraffin-Embedded (FFPE) and Fresh Frozen (FF). FFPE slides offer high-quality histopathological images but require a labor-intensive acquisition process. In contrast, FF slides can be prepared quickly, but the image quality is relatively poor. Our task is to translate FF images into FFPE style, thereby improving image quality for diagnostic purposes. In this paper, we propose Diffusion-FFPE, a method for FF-to-FFPE histopathological image translation using a pre-trained diffusion model. Specifically, we employ a one-step diffusion model as the generator and fine-tune it with LoRA adapters using adversarial learning objectives. To ensure that the model effectively captures both global structural information and local details, we propose a multi-scale feature fusion (MFF) module. This module utilizes two VAE encoders to extract features at different image sizes and fuses them before feeding them into the UNet. Furthermore, we utilize a pre-trained vision-language model for histopathology as the backbone of the discriminator to further improve performance. We conducted FF-to-FFPE translation experiments on the TCGA-NSCLC datasets, and our method achieved better performance than existing methods. The code and models are released at https://github.com/QilaiZhang/Diffusion-FFPE.
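The multi-scale feature fusion idea from the abstract can be pictured with a short sketch. The following is a minimal PyTorch illustration, not the authors' released code: the module name, the shared encoder, the 1x1 fusion convolution, and the 0.5x downsampling factor are all assumptions made for the example.

```python
# Minimal sketch of the multi-scale feature fusion (MFF) idea: encode the
# image at two scales with a VAE encoder, then fuse the latents before
# passing them to the UNet. Names and hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MFFModule(nn.Module):
    """Fuse VAE latents extracted at two image scales before the UNet."""

    def __init__(self, vae_encoder: nn.Module, latent_channels: int = 4):
        super().__init__()
        # The abstract describes two VAE encoders; sharing one set of
        # weights here is an assumption to keep the sketch small.
        self.encoder_full = vae_encoder  # sees the full-resolution image
        self.encoder_low = vae_encoder   # sees a downsampled view
        # 1x1 conv merges the concatenated latents back to UNet width.
        self.fuse = nn.Conv2d(2 * latent_channels, latent_channels, kernel_size=1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Global context: encode a downsampled copy of the input.
        low = F.interpolate(image, scale_factor=0.5, mode="bilinear",
                            align_corners=False)
        z_global = self.encoder_low(low)
        # Local detail: encode the image at its native resolution.
        z_local = self.encoder_full(image)
        # Match spatial sizes, concatenate along channels, and fuse.
        z_global = F.interpolate(z_global, size=z_local.shape[-2:],
                                 mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([z_local, z_global], dim=1))
```

The LoRA fine-tuning mentioned in the abstract amounts to adding small trainable low-rank updates to frozen pre-trained weights. A self-contained sketch follows; the rank and scaling values are illustrative, not the paper's settings.

```python
# Minimal, self-contained LoRA adapter for a linear layer, of the kind
# fine-tuned inside a pre-trained diffusion generator.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # keep pre-trained weights frozen
            p.requires_grad = False
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)    # start as an identity update
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.up(self.down(x))
```

In the adversarial setup the abstract describes, only adapter parameters like these would receive gradients from the generator's adversarial loss, leaving the pre-trained diffusion weights intact.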
Related papers
- DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple but effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate model hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z)
- F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation [2.435021773579434]
The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery.
The FS process often introduces artifacts and distortions, such as folds and ice-crystal effects.
These artifacts are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare.
We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance restoration of FS images.
arXiv Detail & Related papers (2024-04-19T06:32:21Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- Exploring the Transferability of a Foundation Model for Fundus Images: Application to Hypertensive Retinopathy [15.643435527710817]
Using deep learning models pre-trained on Imagenet is the traditional solution for medical image classification to deal with data scarcity.
The CGI-HRDC challenge for Hypertensive Retinopathy diagnosis on fundus images introduces an appealing opportunity to evaluate the transferability of a recently released vision-language foundation model of the retina, FLAIR.
arXiv Detail & Related papers (2024-01-27T23:40:24Z)
- DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability [75.9781362556431]
We propose DiffDis to unify the cross-modal generative and discriminative pretraining into one single framework under the diffusion process.
We show that DiffDis outperforms single-task models on both the image generation and the image-text discriminative tasks.
arXiv Detail & Related papers (2023-08-18T05:03:48Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative models, rather than discriminative ones, for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
- DiffMIC: Dual-Guidance Diffusion Network for Medical Image Classification [32.67098520984195]
We propose the first diffusion-based model (named DiffMIC) to address general medical image classification.
Our experimental results demonstrate that DiffMIC outperforms state-of-the-art methods by a significant margin.
arXiv Detail & Related papers (2023-03-19T09:15:45Z)
- Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation [41.608617301275935]
We propose a novel end-to-end framework, called Diff-UNet, for medical volumetric segmentation.
Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information from the input volume effectively.
We evaluate our method on three datasets, including multimodal brain tumors in MRI, liver tumors, and multi-organ CT volumes.
arXiv Detail & Related papers (2023-03-18T04:06:18Z)
- MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z)
- Mutual Contrastive Learning to Disentangle Whole Slide Image Representations for Glioma Grading [10.65788461379405]
Whole slide images (WSI) provide valuable phenotypic information for histological malignancy assessment and grading of tumors.
The most commonly used WSI are derived from formalin-fixed paraffin-embedded (FFPE) and frozen sections.
Here we propose a mutual contrastive learning scheme to integrate FFPE and frozen sections and disentangle cross-modality representations for glioma grading.
arXiv Detail & Related papers (2022-03-08T11:08:44Z)