Unpaired Translation from Semantic Label Maps to Images by Leveraging Domain-Specific Simulations
- URL: http://arxiv.org/abs/2302.10698v1
- Date: Tue, 21 Feb 2023 14:36:18 GMT
- Title: Unpaired Translation from Semantic Label Maps to Images by Leveraging Domain-Specific Simulations
- Authors: Lin Zhang, Tiziano Portenier, Orcun Goksel
- Abstract summary: We introduce a contrastive learning framework for generating photorealistic images from simulated label maps.
Our proposed method is shown to generate realistic and scene-accurate translations.
- Score: 11.638139969660266
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Photorealistic image generation from simulated label maps is necessitated in
several contexts, such as for medical training in virtual reality. With
conventional deep learning methods, this task requires images that are paired
with semantic annotations, which typically are unavailable. We introduce a
contrastive learning framework for generating photorealistic images from
simulated label maps, by learning from unpaired sets of both. Due to
potentially large scene differences between real images and label maps,
existing unpaired image translation methods lead to artifacts of scene
modification in synthesized images. We utilize simulated images as surrogate
targets for a contrastive loss, while ensuring consistency by using
features from a reverse translation network. Our method enables bidirectional
label-image translations, which is demonstrated in a variety of scenarios and
datasets, including laparoscopy, ultrasound, and driving scenes. By comparing
with state-of-the-art unpaired translation methods, our proposed method is
shown to generate realistic and scene-accurate translations.
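
The mechanism described in the abstract, a contrastive loss whose positives come from the simulated image that shares a label map with the generator output, can be illustrated with a minimal PatchNCE-style sketch. Everything below (toy encoder, feature resolution, temperature) is an assumption for illustration, and the paper's consistency term based on reverse-translation features is omitted.

```python
# Minimal PatchNCE-style sketch of the surrogate-target contrastive loss.
# Encoder, feature resolution, and temperature are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEncoder(nn.Module):
    """Toy convolutional encoder producing one feature vector per patch."""
    def __init__(self, in_ch=3, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 4, stride=2, padding=1),
        )

    def forward(self, x):
        f = self.net(x)                   # (B, dim, H/4, W/4)
        f = f.flatten(2).transpose(1, 2)  # (B, N, dim): N spatial patches
        return F.normalize(f, dim=-1)

def patch_nce_loss(feat_out, feat_sim, tau=0.07):
    """InfoNCE over patches: the simulated image's patch at the same spatial
    location is the positive; all other locations act as negatives."""
    B, N, _ = feat_out.shape
    logits = torch.bmm(feat_out, feat_sim.transpose(1, 2)) / tau  # (B, N, N)
    labels = torch.arange(N, device=logits.device).repeat(B)
    return F.cross_entropy(logits.reshape(B * N, N), labels)

# The generator output and the simulated image rendered from the SAME label
# map are compared, so the simulation acts as a scene-accurate surrogate target.
enc = PatchEncoder()
sim_img = torch.randn(2, 3, 64, 64)   # simulated image (surrogate target)
fake_img = torch.randn(2, 3, 64, 64)  # translated/generated image
print(patch_nce_loss(enc(fake_img), enc(sim_img)).item())
```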
Related papers
- VIXEN: Visual Text Comparison Network for Image Difference Captioning [58.16313862434814]
We present VIXEN, a technique that succinctly summarizes in text the visual differences between a pair of images.
Our proposed network linearly maps image features in a pairwise manner, constructing a soft prompt for a pretrained large language model.
arXiv Detail & Related papers (2024-02-29T12:56:18Z)
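
A minimal sketch of the soft-prompt construction described for VIXEN: features of the two images are concatenated pairwise, linearly projected, and prepended to a language model's token embeddings. The module name, dimensions, and prompt length are illustrative assumptions, not the paper's values.

```python
# Hypothetical sketch of a pairwise soft prompt for a frozen language model.
import torch
import torch.nn as nn

class SoftPromptBuilder(nn.Module):
    def __init__(self, img_feat_dim=512, lm_dim=768, prompt_len=8):
        super().__init__()
        self.proj = nn.Linear(2 * img_feat_dim, prompt_len * lm_dim)
        self.prompt_len, self.lm_dim = prompt_len, lm_dim

    def forward(self, feat_a, feat_b):
        pair = torch.cat([feat_a, feat_b], dim=-1)     # (B, 2*img_feat_dim)
        return self.proj(pair).view(-1, self.prompt_len, self.lm_dim)

builder = SoftPromptBuilder()
feat_a, feat_b = torch.randn(1, 512), torch.randn(1, 512)
soft_prompt = builder(feat_a, feat_b)                  # (1, 8, 768)
token_emb = torch.randn(1, 20, 768)                    # embedded caption tokens
lm_input = torch.cat([soft_prompt, token_emb], dim=1)  # fed to a frozen LM
print(lm_input.shape)                                  # torch.Size([1, 28, 768])
```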
- Exploring Semantic Consistency in Unpaired Image Translation to Generate Data for Surgical Applications [1.8011391924021904]
This study empirically investigates unpaired image translation methods for generating suitable data in surgical applications.
We find that a simple combination of structural-similarity loss and contrastive learning yields the most promising results.
arXiv Detail & Related papers (2023-09-06T14:43:22Z)
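
The combination that study reports as most promising, a structural-similarity loss alongside contrastive learning, could be assembled roughly as below; the uniform SSIM window and the 0.5 weighting are assumptions.

```python
# Hedged sketch: an SSIM term for structure plus a contrastive term for appearance.
import torch
import torch.nn.functional as F

def ssim_loss(x, y, window=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """1 - mean SSIM, using a simplified uniform averaging window."""
    pad = window // 2
    mu_x = F.avg_pool2d(x, window, 1, pad)
    mu_y = F.avg_pool2d(y, window, 1, pad)
    var_x = F.avg_pool2d(x * x, window, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, window, 1, pad) - mu_y ** 2
    cov = F.avg_pool2d(x * y, window, 1, pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1 - ssim.mean()

x, y = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
# Full objective would combine: contrastive_loss + 0.5 * ssim_loss(x, y)  (weight assumed)
print(ssim_loss(x, y).item())
```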
- Translating Simulation Images to X-ray Images via Multi-Scale Semantic Matching [16.175115921436582]
We propose a new method to translate simulation images from an endovascular simulator to X-ray images.
We apply self-domain semantic matching to ensure that the input image and the generated image have the same positional semantic relationships.
Our method generates realistic X-ray images and outperforms other state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2023-04-16T04:49:46Z)
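
One possible reading of self-domain semantic matching, sketched here purely as an assumption, is to align the pairwise spatial-similarity structure of features from the input and the generated image so that positional semantic relationships are preserved. This is an interpretation, not the paper's published implementation.

```python
# Assumed interpretation: match pairwise spatial similarity structure between
# input and output features from a shared (frozen) encoder.
import torch
import torch.nn.functional as F

def relational_matching_loss(feat_in, feat_out):
    """feat_*: (B, C, H, W) feature maps from the same encoder."""
    fi = F.normalize(feat_in.flatten(2), dim=1)   # (B, C, H*W)
    fo = F.normalize(feat_out.flatten(2), dim=1)
    rel_in = torch.bmm(fi.transpose(1, 2), fi)    # (B, HW, HW) similarities
    rel_out = torch.bmm(fo.transpose(1, 2), fo)
    return F.mse_loss(rel_out, rel_in)

feat_in, feat_out = torch.randn(2, 64, 16, 16), torch.randn(2, 64, 16, 16)
print(relational_matching_loss(feat_in, feat_out).item())
```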
- AptSim2Real: Approximately-Paired Sim-to-Real Image Translation [8.208569626646035]
Sim-to-real transfer modifies simulated images to better match real-world data.
AptSim2Real exploits the fact that simulators can generate scenes loosely resembling real-world scenes in terms of lighting, environment, and composition.
Our novel training strategy results in significant qualitative and quantitative improvements, with up to a 24% improvement in FID score.
arXiv Detail & Related papers (2023-03-09T06:18:44Z)
- Image-to-Image Translation for Autonomous Driving from Coarsely-Aligned Image Pairs [57.33431586417377]
A self-driving car must be able to handle adverse weather conditions to operate safely.
In this paper, we investigate the idea of turning sensor inputs captured in an adverse condition into a benign one.
We show that our coarsely-aligned training scheme leads to a better image translation quality and improved downstream tasks.
arXiv Detail & Related papers (2022-09-23T16:03:18Z)
- More Control for Free! Image Synthesis with Semantic Diffusion Guidance [79.88929906247695]
Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from an example image.
We introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both.
We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis.
arXiv Detail & Related papers (2021-12-10T18:55:50Z)
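
The general mechanism behind such guidance can be sketched as a classifier-guidance-style update: at each denoising step the predicted mean is shifted by the gradient of a similarity score between the current sample and a guidance embedding, which may come from text or an example image. The stand-in embedding network and guidance scale below are assumptions, not the paper's exact formulation.

```python
# Classifier-guidance-style sketch: shift the denoising mean along the gradient
# of a similarity score against a guidance embedding (text or image).
import torch

def guided_step(x_t, model_mean, sigma_t, embed_img, guide_emb, scale=5.0):
    """One guided denoising step.
    embed_img: differentiable image-embedding network (CLIP-like stand-in).
    guide_emb: embedding of the text prompt or the example image."""
    x = x_t.detach().requires_grad_(True)
    sim = torch.cosine_similarity(embed_img(x), guide_emb, dim=-1).sum()
    grad = torch.autograd.grad(sim, x)[0]
    return model_mean + scale * (sigma_t ** 2) * grad  # nudge toward the guidance

# Toy usage with stand-in components:
embed = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))
x_t = torch.randn(1, 3, 32, 32)
out = guided_step(x_t, model_mean=torch.zeros_like(x_t), sigma_t=0.5,
                  embed_img=embed, guide_emb=torch.randn(1, 128))
print(out.shape)  # torch.Size([1, 3, 32, 32])
```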
- USIS: Unsupervised Semantic Image Synthesis [9.613134538472801]
We propose a new unsupervised paradigm for Semantic Image Synthesis (USIS).
USIS learns to output images with visually separable semantic classes using a self-supervised segmentation loss.
In order to match the color and texture distribution of real images without losing high-frequency information, we propose to use whole image wavelet-based discrimination.
arXiv Detail & Related papers (2021-09-29T20:48:41Z)
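
A hedged sketch of whole-image wavelet-based discrimination: a one-level Haar transform splits the image into low- and high-frequency sub-bands that are stacked channel-wise and judged together by the discriminator, so high-frequency information is not lost. The discriminator here is a stand-in.

```python
# Wavelet-based discrimination sketch: Haar sub-bands fed to the discriminator.
import torch
import torch.nn as nn

def haar_dwt(x):
    """One-level Haar DWT. x: (B, C, H, W) with even H, W.
    Returns (B, 4*C, H/2, W/2): LL, LH, HL, HH sub-bands."""
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a + b - c - d) / 2
    hl = (a - b + c - d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, lh, hl, hh], dim=1)

disc = nn.Sequential(nn.Conv2d(12, 64, 4, 2, 1), nn.LeakyReLU(0.2),
                     nn.Conv2d(64, 1, 4, 2, 1))  # stand-in discriminator
img = torch.randn(1, 3, 64, 64)
score = disc(haar_dwt(img))                       # judge all frequency bands
print(score.shape)
```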
- Content-Preserving Unpaired Translation from Simulated to Realistic Ultrasound Images [12.136874314973689]
We introduce a novel image translation framework to bridge the appearance gap between simulated images and real scans.
We achieve this goal by leveraging both simulated images with semantic segmentations and unpaired in-vivo ultrasound scans.
arXiv Detail & Related papers (2021-03-09T22:35:43Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label-set consisting of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
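
The iterative synthesis idea can be sketched as sampling one category mask at a time from the decoder half of a conditional VAE, each conditioned on the canvas assembled so far. The tiny architecture, sampling from the prior, and the max-accumulation rule are illustrative assumptions.

```python
# Illustrative sketch of iterative semantic-map synthesis with a CVAE decoder.
import torch
import torch.nn as nn

class TinyCVAEDecoder(nn.Module):
    def __init__(self, n_classes, z_dim=16, size=32):
        super().__init__()
        self.n_classes, self.z_dim, self.size = n_classes, z_dim, size
        self.dec = nn.Linear(z_dim + n_classes + size * size, size * size)

    def generate(self, class_id, canvas):
        """Sample one class mask given the current canvas of shape (B, 1, H, W)."""
        B = canvas.shape[0]
        z = torch.randn(B, self.z_dim)          # sample latent from the prior
        onehot = torch.zeros(B, self.n_classes)
        onehot[:, class_id] = 1.0
        inp = torch.cat([z, onehot, canvas.flatten(1)], dim=1)
        return torch.sigmoid(self.dec(inp)).view(B, 1, self.size, self.size)

model = TinyCVAEDecoder(n_classes=5)
canvas = torch.zeros(1, 1, 32, 32)
for cls in [0, 2, 4]:                     # the desired label-set
    mask = model.generate(cls, canvas)
    canvas = torch.maximum(canvas, mask)  # accumulate the synthesized map
print(canvas.shape)
```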
- Domain Adaptation for Image Dehazing [72.15994735131835]
Most existing methods train a dehazing model on synthetic hazy images; such models generalize poorly to real hazy images due to domain shift.
We propose a domain adaptation paradigm, which consists of an image translation module and two image dehazing modules.
Experimental results on both synthetic and real-world images demonstrate that our model performs favorably against the state-of-the-art dehazing algorithms.
arXiv Detail & Related papers (2020-05-10T13:54:56Z)
- Grounded and Controllable Image Completion by Incorporating Lexical Semantics [111.47374576372813]
Lexical Semantic Image Completion (LSIC) may have potential applications in art, design, and heritage conservation.
We advocate generating results faithful to both visual and lexical semantic context.
One major challenge for LSIC comes from modeling and aligning the structure of visual-semantic context.
arXiv Detail & Related papers (2020-02-29T16:54:21Z)