Exploring Semantic Consistency in Unpaired Image Translation to Generate
Data for Surgical Applications
- URL: http://arxiv.org/abs/2309.03048v3
- Date: Wed, 21 Feb 2024 13:31:34 GMT
- Title: Exploring Semantic Consistency in Unpaired Image Translation to Generate
Data for Surgical Applications
- Authors: Danush Kumar Venkatesh, Dominik Rivoir, Micha Pfeiffer, Fiona
Kolbinger, Marius Distler, Jürgen Weitz, Stefanie Speidel
- Abstract summary: This study empirically investigates unpaired image translation methods for generating suitable data in surgical applications.
We find that a simple combination of structural-similarity loss and contrastive learning yields the most promising results.
- Score: 1.8011391924021904
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In surgical computer vision applications, obtaining labeled training data is
challenging due to data-privacy concerns and the need for expert annotation.
Unpaired image-to-image translation techniques have been explored to
automatically generate large annotated datasets by translating synthetic images
to the realistic domain. However, preserving the structure and semantic
consistency between the input and translated images presents significant
challenges, mainly when there is a distributional mismatch in the semantic
characteristics of the domains. This study empirically investigates unpaired
image translation methods for generating suitable data in surgical
applications, explicitly focusing on semantic consistency. We extensively
evaluate various state-of-the-art image translation models on two challenging
surgical datasets and downstream semantic segmentation tasks. We find that a
simple combination of structural-similarity loss and contrastive learning
yields the most promising results. Quantitatively, we show that the data
generated with this approach yields higher semantic consistency and can be used
more effectively as training data. The code is available at
https://gitlab.com/nct_tso_public/constructs.
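The abstract's key finding, that a structural-similarity loss combined with contrastive learning best preserves semantic consistency, can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that): it uses a simplified single-window SSIM rather than the usual local-window variant, a generic InfoNCE contrastive term on feature vectors, and a hypothetical weighting `lam` between the two.

```python
import numpy as np

def ssim_loss(x, y, c1=0.01**2, c2=0.03**2):
    """Simplified global SSIM loss for images in [0, 1].

    Real SSIM averages over local windows; this single-window variant
    is only meant to illustrate the structural-similarity term.
    Returns 0 when the two images are structurally identical.
    """
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return 1.0 - ssim

def info_nce_loss(query, positive, negatives, tau=0.07):
    """Generic InfoNCE contrastive loss on L2-normalised features.

    Pulls the query feature toward its positive (same patch before and
    after translation) and pushes it away from negatives (other patches).
    """
    norm = lambda v: v / np.linalg.norm(v)
    q = norm(query)
    logits = np.array([q @ norm(positive)] + [q @ norm(n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

def combined_loss(src_img, out_img, q_feat, pos_feat, neg_feats, lam=1.0):
    """Hypothetical combination: structure term + contrastive term."""
    return lam * ssim_loss(src_img, out_img) + info_nce_loss(q_feat, pos_feat, neg_feats)
```

In a full translation model these terms would be added to the generator's adversarial objective; here they simply show why the combination enforces both pixel-level structure (SSIM) and patch-level semantic correspondence (contrastive) between input and translated images.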
Related papers
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z) - Semi-Supervised Image Captioning by Adversarially Propagating Labeled
Data [95.0476489266988]
We present a novel data-efficient semi-supervised framework to improve the generalization of image captioning models.
Our proposed method trains a captioner to learn from paired data and to progressively associate unpaired data.
We present extensive empirical results on both (1) image-based and (2) dense region-based captioning datasets, followed by a comprehensive analysis on the scarcely-paired dataset.
arXiv Detail & Related papers (2023-01-26T15:25:43Z) - Positional Contrastive Learning for Volumetric Medical Image
Segmentation [13.086140606803408]
We propose a novel positional contrastive learning framework to generate contrastive data pairs.
The proposed PCL method can substantially improve the segmentation performance compared to existing methods in both semi-supervised setting and transfer learning setting.
arXiv Detail & Related papers (2021-06-16T22:15:28Z) - Content-Preserving Unpaired Translation from Simulated to Realistic
Ultrasound Images [12.136874314973689]
We introduce a novel image translation framework to bridge the appearance gap between simulated images and real scans.
We achieve this goal by leveraging both simulated images with semantic segmentations and unpaired in-vivo ultrasound scans.
arXiv Detail & Related papers (2021-03-09T22:35:43Z) - Image Translation for Medical Image Generation -- Ischemic Stroke
Lesions [0.0]
Synthetic databases with annotated pathologies could provide the required amounts of training data.
We train different image-to-image translation models to synthesize magnetic resonance images of brain volumes with and without stroke lesions.
We show that for a small database of only 10 or 50 clinical cases, synthetic data augmentation yields significant improvement.
arXiv Detail & Related papers (2020-10-05T09:12:28Z) - Semantically Adaptive Image-to-image Translation for Domain Adaptation
of Semantic Segmentation [1.8275108630751844]
We address the problem of domain adaptation for semantic segmentation of street scenes.
Many state-of-the-art approaches focus on translating the source image while imposing that the result should be semantically consistent with the input.
We advocate that the image semantics can also be exploited to guide the translation algorithm.
arXiv Detail & Related papers (2020-09-02T16:16:50Z) - Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z) - Towards Unsupervised Learning for Instrument Segmentation in Robotic
Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation approach whose goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach makes it possible to train image segmentation models without acquiring expensive annotations.
We test our proposed method on the Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z) - Pathological Retinal Region Segmentation From OCT Images Using Geometric
Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, which contains images captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z) - LC-GAN: Image-to-image Translation Based on Generative Adversarial
Network for Endoscopic Images [22.253074722129053]
We propose an image-to-image translation model, live-cadaver GAN (LC-GAN), based on generative adversarial networks (GANs).
For live image segmentation, we first translate the live images to fake-cadaveric images with LC-GAN and then perform segmentation on the fake-cadaveric images with models trained on the real cadaveric dataset.
Our model achieves better image-to-image translation and leads to improved segmentation performance in the proposed cross-domain segmentation task.
arXiv Detail & Related papers (2020-03-10T19:59:25Z) - Grounded and Controllable Image Completion by Incorporating Lexical
Semantics [111.47374576372813]
Lexical Semantic Image Completion (LSIC) may have potential applications in art, design, and heritage conservation.
We advocate generating results faithful to both visual and lexical semantic context.
One major challenge for LSIC comes from modeling and aligning the structure of visual-semantic context.
arXiv Detail & Related papers (2020-02-29T16:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.