Related papers: SurgicaL-CD: Generating Surgical Images via Unpaired Image Translation with Latent Consistency Diffusion Models

SurgicaL-CD: Generating Surgical Images via Unpaired Image Translation with Latent Consistency Diffusion Models

URL: http://arxiv.org/abs/2408.09822v3
Date: Fri, 11 Oct 2024 07:46:11 GMT
Title: SurgicaL-CD: Generating Surgical Images via Unpaired Image Translation with Latent Consistency Diffusion Models
Authors: Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Stefanie Speidel,
Abstract summary: We introduce emphSurgicaL-CD, a consistency-distilled diffusion method to generate realistic surgical images. Our results demonstrate that our method outperforms GANs and diffusion-based approaches.
Score: 1.6189876649941652
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Computer-assisted surgery (CAS) systems are designed to assist surgeons during procedures, thereby reducing complications and enhancing patient care. Training machine learning models for these systems requires a large corpus of annotated datasets, which is challenging to obtain in the surgical domain due to patient privacy concerns and the significant labeling effort required from doctors. Previous methods have explored unpaired image translation using generative models to create realistic surgical images from simulations. However, these approaches have struggled to produce high-quality, diverse surgical images. In this work, we introduce \emph{SurgicaL-CD}, a consistency-distilled diffusion method to generate realistic surgical images with only a few sampling steps without paired data. We evaluate our approach on three datasets, assessing the generated images in terms of quality and utility as downstream training datasets. Our results demonstrate that our method outperforms GANs and diffusion-based approaches. Our code is available at https://gitlab.com/nct_tso_public/gan2diffusion.

Related papers

SSDD-GAN: Single-Step Denoising Diffusion GAN for Cochlear Implant Surgical Scene Completion [4.250558597144547]
We propose an efficient method to complete the surgical scene of the synthetic postmastoidectomy dataset. Our approach leverages self-supervised learning on real surgical datasets to train a Single-Step Denoising Diffusion-GAN (SSDD-GAN) The trained model is then directly applied to the synthetic postmastoidectomy dataset using a zero-shot approach.
arXiv Detail & Related papers (2025-02-08T22:04:22Z)
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
DAug: Diffusion-based Channel Augmentation for Radiology Image Retrieval and Classification [24.68697717585541]
We propose a portable method that improves a perception model's performance with a generative model's output. Specifically, we extend a radiology image to multiple channels, with the additional channels being the heatmaps of regions where diseases tend to develop. Our method is motivated by the fact that generative models learn the distribution of normal and abnormal images, and such knowledge is complementary to image understanding tasks.
arXiv Detail & Related papers (2024-12-06T07:43:28Z)
Surgical Text-to-Image Generation [1.958913666074613]
We adapt text-to-image generative models for the surgical domain using the CholecT50 dataset. We develop Surgical Imagen to generate photorealistic and activity-aligned surgical images from triplet-based textual prompts.
arXiv Detail & Related papers (2024-07-12T12:49:11Z)
CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images. The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism. We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
Navigating the Synthetic Realm: Harnessing Diffusion-based Models for Laparoscopic Text-to-Image Generation [3.2039076408339353]
We present an intuitive approach for generating synthetic laparoscopic images from short text prompts using diffusion-based generative models. Results on fidelity and diversity demonstrate that diffusion-based models can acquire knowledge about the style and semantics in the field of image-guided surgery.
arXiv Detail & Related papers (2023-12-05T16:20:22Z)
Rethinking Surgical Instrument Segmentation: A Background Image Can Be All You Need [18.830738606514736]
Data scarcity and imbalance have heavily affected the model accuracy and limited the design and deployment of deep learning-based surgical applications. We propose a one-to-many data generation solution that gets rid of the complicated and expensive process of data collection and annotation from robotic surgery. Our empirical analysis suggests that without the high cost of data collection and annotation, we can achieve decent surgical instrument segmentation performance.
arXiv Detail & Related papers (2022-06-23T16:22:56Z)
Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis [68.5287824124996]
We present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images. We apply the generated images into medical image enhancement and utilize the enhanced results to cope with the low contrast problem. Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T08:41:17Z)
Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views. We then introduce the Multimodal Semantic Graph Scene (MSSG) which aims at providing unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z)
Pathology-Aware Generative Adversarial Networks for Medical Image Augmentation [0.22843885788439805]
Generative Adversarial Networks (GANs) can generate realistic but novel samples, and thus effectively cover the real image distribution. This thesis contains four GAN projects aiming to present such novel applications' clinical relevance in collaboration with physicians.
arXiv Detail & Related papers (2021-06-03T15:08:14Z)
A Multi-Stage Attentive Transfer Learning Framework for Improving COVID-19 Diagnosis [49.3704402041314]
We propose a multi-stage attentive transfer learning framework for improving COVID-19 diagnosis. Our proposed framework consists of three stages to train accurate diagnosis models through learning knowledge from multiple source tasks and data of different domains. Importantly, we propose a novel self-supervised learning method to learn multi-scale representations for lung CT images.
arXiv Detail & Related papers (2021-01-14T01:39:19Z)
Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing. In this paper, we develop a novel generative method named generative adversarial U-Net. Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z)
Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation. Our approach allows to train image segmentation models without the need to acquire expensive annotations. We test our proposed method on Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.