Data Augmentation for Surgical Scene Segmentation with Anatomy-Aware Diffusion Models
- URL: http://arxiv.org/abs/2410.07753v2
- Date: Thu, 21 Nov 2024 12:47:53 GMT
- Title: Data Augmentation for Surgical Scene Segmentation with Anatomy-Aware Diffusion Models
- Authors: Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Fiona Kolbinger, Stefanie Speidel
- Abstract summary: We introduce a multi-stage approach to generate multi-class surgical datasets with annotations.
Our framework improves anatomy awareness by training organ-specific models with an inpainting objective guided by binary segmentation masks.
This versatile approach allows the generation of multi-class datasets from real binary datasets and simulated surgical masks.
- Score: 1.9085155846692308
- Abstract: In computer-assisted surgery, automatically recognizing anatomical organs is crucial for understanding the surgical scene and providing intraoperative assistance. While machine learning models can identify such structures, their deployment is hindered by the need for labeled, diverse surgical datasets with anatomical annotations. Labeling multiple classes (i.e., organs) in a surgical scene is time-intensive, requiring medical experts. Although synthetically generated images can enhance segmentation performance, maintaining both organ structure and texture during generation is challenging. We introduce a multi-stage approach using diffusion models to generate multi-class surgical datasets with annotations. Our framework improves anatomy awareness by training organ-specific models with an inpainting objective guided by binary segmentation masks. The organs are generated with an inference pipeline using a pre-trained ControlNet to maintain the organ structure. The synthetic multi-class datasets are constructed through an image composition step, ensuring structural and textural consistency. This versatile approach allows the generation of multi-class datasets from real binary datasets and simulated surgical masks. We thoroughly evaluate the generated datasets on image quality and downstream segmentation, achieving a $15\%$ improvement in segmentation scores when combined with real images. The code is available at https://gitlab.com/nct_tso_public/muli-class-image-synthesis
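The image composition step mentioned in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: it assumes each organ has already been generated as a full-size image with its own binary mask, and it simply pastes the organs onto a background while recording a class index per pixel. The function name `compose_multiclass`, the class numbering, and the overwrite-on-overlap rule are all hypothetical choices made for illustration.

```python
import numpy as np

def compose_multiclass(organ_images, organ_masks, background):
    """Paste per-organ images into one frame and build a multi-class label map.

    organ_images: list of (H, W, 3) uint8 arrays, one per organ class
    organ_masks:  list of (H, W) boolean arrays, one per organ class
    background:   (H, W, 3) uint8 array used where no organ is present
    """
    image = background.copy()
    labels = np.zeros(background.shape[:2], dtype=np.uint8)  # 0 = background
    for class_id, (img, mask) in enumerate(zip(organ_images, organ_masks), start=1):
        image[mask] = img[mask]   # later organs overwrite earlier ones on overlap
        labels[mask] = class_id   # hypothetical numbering: 1, 2, ... in list order
    return image, labels

# Toy 4x4 example with two constant-colour "organs" (illustrative values only).
h, w = 4, 4
background = np.zeros((h, w, 3), dtype=np.uint8)
organ_a = np.full((h, w, 3), 100, dtype=np.uint8)
organ_b = np.full((h, w, 3), 200, dtype=np.uint8)
mask_a = np.zeros((h, w), dtype=bool)
mask_a[:2, :] = True
mask_b = np.zeros((h, w), dtype=bool)
mask_b[1:3, 1:3] = True

image, labels = compose_multiclass([organ_a, organ_b], [mask_a, mask_b], background)
print(labels)
```

A plain paste like this keeps the annotation exactly aligned with the pixels, but it does nothing for texture consistency at organ boundaries, which the abstract states the actual method addresses.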
Related papers
- Diffusion-based Data Augmentation for Nuclei Image Segmentation [68.28350341833526]
We introduce the first diffusion-based augmentation method for nuclei segmentation.
The idea is to synthesize a large number of labeled images to facilitate training the segmentation model.
The experimental results show that by augmenting a 10% labeled real dataset with synthetic samples, one can achieve comparable segmentation results.
arXiv Detail & Related papers (2023-10-22T06:16:16Z)
- Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis [18.830738606514736]
Deep learning methods are frequently hindered from deploying to real-world surgical applications.
Data collection, annotation, and domain shift between sites and patients are the most common obstacles.
We mitigate data-related issues by efficiently leveraging minimal source images to generate synthetic surgical instrument segmentation datasets.
arXiv Detail & Related papers (2023-06-28T15:06:44Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- Learning Incrementally to Segment Multiple Organs in a CT Image [11.082692639365982]
We propose to incrementally learn a multi-organ segmentation model.
In each incremental learning stage, we lose access to previous data and annotations.
We experimentally discover that such a weakness mostly disappears for CT multi-organ segmentation.
arXiv Detail & Related papers (2022-03-04T02:32:04Z)
- Generalized Organ Segmentation by Imitating One-shot Reasoning using Anatomical Correlation [55.1248480381153]
We propose OrganNet, which learns a generalized organ concept from a set of annotated organ classes and then transfers this concept to unseen classes.
We show that OrganNet can effectively resist the wide variations in organ morphology and produce state-of-the-art results in the one-shot segmentation task.
arXiv Detail & Related papers (2021-03-30T13:41:12Z)
- TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which combines the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
- m2caiSeg: Semantic Segmentation of Laparoscopic Images using Convolutional Neural Networks [4.926395463398194]
We propose a deep learning based semantic segmentation algorithm to identify and label the tissues and organs in the endoscopic video feed of the human torso region.
We present an annotated dataset, m2caiSeg, created from endoscopic video feeds of real-world surgical procedures.
arXiv Detail & Related papers (2020-08-23T23:30:15Z)
- Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery [10.562627972607892]
We show that it may be possible to use robot kinematic data coupled with laparoscopic images to alleviate the labelling problem.
We propose a new deep learning based model for parallel processing of both laparoscopic and simulation images.
arXiv Detail & Related papers (2020-07-17T16:33:33Z)
- Retinal Image Segmentation with a Structure-Texture Demixing Network [62.69128827622726]
The complex structure and texture information are mixed in a retinal image, and distinguishing the information is difficult.
Existing methods handle texture and structure jointly, which may bias models toward recognizing textures and thus result in inferior segmentation performance.
We propose a segmentation strategy that seeks to separate structure and texture components and significantly improve the performance.
arXiv Detail & Related papers (2020-07-15T12:19:03Z)
- Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows training image segmentation models without the need to acquire expensive annotations.
We test our proposed method on the Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
- LC-GAN: Image-to-image Translation Based on Generative Adversarial Network for Endoscopic Images [22.253074722129053]
We propose an image-to-image translation model, live-cadaver GAN (LC-GAN), based on generative adversarial networks (GANs).
For live image segmentation, we first translate the live images to fake-cadaveric images with LC-GAN and then perform segmentation on the fake-cadaveric images with models trained on the real cadaveric dataset.
Our model achieves better image-to-image translation and leads to improved segmentation performance in the proposed cross-domain segmentation task.
arXiv Detail & Related papers (2020-03-10T19:59:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.