Translating Simulation Images to X-ray Images via Multi-Scale Semantic Matching
- URL: http://arxiv.org/abs/2304.07693v1
- Date: Sun, 16 Apr 2023 04:49:46 GMT
- Title: Translating Simulation Images to X-ray Images via Multi-Scale Semantic Matching
- Authors: Jingxuan Kang, Tudor Jianu, Baoru Huang, Binod Bhattarai, Ngan Le, Frans Coenen, and Anh Nguyen
- Abstract summary: We propose a new method to translate simulation images from an endovascular simulator to X-ray images.
We apply self-domain semantic matching to ensure that the input image and the generated image have the same positional semantic relationships.
Our method generates realistic X-ray images and outperforms other state-of-the-art approaches by a large margin.
- Score: 16.175115921436582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Endovascular intervention training is increasingly being conducted in virtual simulators. However, transferring the experience gained in endovascular simulators to the real world remains an open problem. The key challenge is that the virtual environments are usually not simulated realistically, especially the simulation images. In this paper, we propose a new method to translate simulation images from an endovascular simulator to X-ray images. Previous image-to-image translation methods often focus on visual effects and neglect structural information, which is critical for medical images. To address this gap, we propose a new method that utilizes multi-scale semantic matching. We apply self-domain semantic matching to ensure that the input image and the generated image have the same positional semantic relationships. We further apply cross-domain matching to eliminate the effects of different styles. Extensive experiments show that our method generates realistic X-ray images and outperforms other state-of-the-art approaches by a large margin. We also collect a new large-scale dataset to serve as a benchmark for this task. Our source code and dataset will be made publicly available.
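The two matching objectives lend themselves to a compact sketch. Below is a minimal PyTorch sketch of what the losses could look like; the function names, the cosine-similarity relation matrices, and the InfoNCE form of the cross-domain term are illustrative assumptions, since the abstract does not specify the exact formulation.
```python
import torch
import torch.nn.functional as F

def relation_matrix(feat):
    """Pairwise cosine similarity between spatial positions of a feature map:
    (B, C, H, W) -> (B, H*W, H*W) 'positional semantic relationship' matrix."""
    flat = F.normalize(feat.flatten(2).transpose(1, 2), dim=-1)  # (B, HW, C)
    return flat @ flat.transpose(1, 2)

def self_domain_matching_loss(feats_sim, feats_gen):
    """Self-domain matching: the simulation input and the generated X-ray
    should exhibit the same positional relations at every scale."""
    losses = [F.l1_loss(relation_matrix(a), relation_matrix(b))
              for a, b in zip(feats_sim, feats_gen)]
    return sum(losses) / len(losses)

def cross_domain_matching_loss(feats_sim, feats_gen, tau=0.07):
    """Cross-domain matching, sketched as an InfoNCE objective: a generated
    patch should match the simulation patch at the same location, regardless
    of the style gap between the two domains."""
    total = 0.0
    for f_s, f_g in zip(feats_sim, feats_gen):
        q = F.normalize(f_g.flatten(2).transpose(1, 2), dim=-1)  # (B, N, C)
        k = F.normalize(f_s.flatten(2).transpose(1, 2), dim=-1)  # (B, N, C)
        logits = q @ k.transpose(1, 2) / tau                     # (B, N, N)
        target = torch.arange(logits.size(1), device=logits.device)
        total = total + F.cross_entropy(
            logits.flatten(0, 1), target.repeat(logits.size(0)))
    return total / len(feats_sim)
```
Here `feats_sim` and `feats_gen` would be lists of multi-scale feature maps for the simulation input and the generated X-ray, e.g. taken from several layers of a shared encoder.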
Related papers
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of the unbalanced distribution of interactions via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study of the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- AptSim2Real: Approximately-Paired Sim-to-Real Image Translation [8.208569626646035]
Sim-to-real transfer modifies simulated images to better match real-world data.
AptSim2Real exploits the fact that simulators can generate scenes loosely resembling real-world scenes in terms of lighting, environment, and composition.
Our novel training strategy results in significant qualitative and quantitative improvements, with up to a 24% improvement in FID score.
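For reference, the FID score cited here compares the Gaussian statistics of Inception features extracted from real and generated images; a minimal NumPy/SciPy sketch of the metric, assuming the feature activations have already been extracted, is:
```python
import numpy as np
from scipy import linalg

def fid(real_feats, gen_feats):
    """Frechet Inception Distance between two sets of feature vectors.

    FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)); lower is better.
    """
    mu_r, mu_g = real_feats.mean(0), gen_feats.mean(0)
    c_r = np.cov(real_feats, rowvar=False)
    c_g = np.cov(gen_feats, rowvar=False)
    covmean = linalg.sqrtm(c_r @ c_g)
    if np.iscomplexobj(covmean):   # numerical noise can produce tiny
        covmean = covmean.real     # imaginary parts; drop them
    diff = mu_r - mu_g
    return diff @ diff + np.trace(c_r + c_g - 2.0 * covmean)
```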
arXiv Detail & Related papers (2023-03-09T06:18:44Z)
- Unpaired Translation from Semantic Label Maps to Images by Leveraging Domain-Specific Simulations [11.638139969660266]
We introduce a contrastive learning framework for generating photorealistic images from simulated label maps.
Our proposed method is shown to generate realistic and scene-accurate translations.
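A patchwise contrastive (InfoNCE) term is the usual core of such frameworks; a minimal sketch follows, with the temperature and the same-location positive pairing as assumptions rather than the paper's exact settings:
```python
import torch
import torch.nn.functional as F

def patch_nce_loss(feat_src, feat_gen, tau=0.07):
    """Match each generated patch feature to the source patch at the same
    location (positive) against all other sampled locations (negatives).

    feat_src, feat_gen: (N, C) patch features sampled at the same positions.
    """
    q = F.normalize(feat_gen, dim=-1)
    k = F.normalize(feat_src, dim=-1)
    logits = q @ k.t() / tau                          # (N, N) similarity logits
    targets = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, targets)
```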
arXiv Detail & Related papers (2023-02-21T14:36:18Z)
- Shape Reconstruction from Thoracoscopic Images using Self-supervised Virtual Learning [2.4493299476776778]
Intraoperative shape reconstruction of organs from endoscopic camera images is a complex yet indispensable technique for image-guided surgery.
We propose a framework for generative virtual learning of shape reconstruction using image translation with common latent variables between simulated and real images.
In this study, we targeted the shape reconstruction of collapsed lungs from thoracoscopic images and confirmed that virtual learning could improve the similarity between real and simulated images.
arXiv Detail & Related papers (2023-01-25T23:08:41Z)
- A Shared Representation for Photorealistic Driving Simulators [83.5985178314263]
We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation and content reconstruction, along with coarse-to-fine adversarial reasoning.
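A rethought discriminator of this kind can be sketched as one shared trunk with three heads; the layer sizes and class count below are illustrative assumptions, not the paper's architecture:
```python
import torch.nn as nn

class SharedRepDiscriminator(nn.Module):
    """Shared trunk feeding segmentation, reconstruction, and adversarial heads."""
    def __init__(self, in_ch=3, num_classes=19):
        super().__init__()
        # Shared encoder trunk producing the joint latent representation.
        self.trunk = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        # Head 1: semantic segmentation logits.
        self.seg_head = nn.Conv2d(128, num_classes, 1)
        # Head 2: content reconstruction back to image space.
        self.rec_head = nn.Sequential(
            nn.Upsample(scale_factor=4), nn.Conv2d(128, in_ch, 3, padding=1))
        # Head 3: patchwise real/fake adversarial logits.
        self.adv_head = nn.Conv2d(128, 1, 1)

    def forward(self, x):
        z = self.trunk(x)
        return self.seg_head(z), self.rec_head(z), self.adv_head(z)
```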
arXiv Detail & Related papers (2021-12-09T18:59:21Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data [0.059110875077162096]
We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes.
By introducing global learnable textures and a lighting-invariant view-consistency loss, our method produces consistent translations of arbitrary views.
By extending existing image-based methods to view-consistent videos, we aim to broaden the applicability of simulated training and evaluation environments for surgical applications.
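A view-consistency penalty in this spirit can be sketched as follows, assuming the simulator provides dense correspondences (a normalized warp grid) between two views; approximating lighting invariance by comparing image gradients is this sketch's assumption, not necessarily the paper's formulation:
```python
import torch
import torch.nn.functional as F

def warp(img, grid):
    """Backward-warp img with a normalized sampling grid of shape (B, H, W, 2)."""
    return F.grid_sample(img, grid, align_corners=True)

def image_gradients(img):
    # Horizontal and vertical finite differences of a (B, C, H, W) tensor.
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return dx, dy

def view_consistency_loss(trans_a, trans_b, grid_b_to_a):
    """Translated view B, warped into view A, should agree with translated
    view A up to lighting; comparing gradients softens lighting differences."""
    warped_b = warp(trans_b, grid_b_to_a)
    loss = 0.0
    for g_a, g_b in zip(image_gradients(trans_a), image_gradients(warped_b)):
        loss = loss + F.l1_loss(g_a, g_b)
    return loss
```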
arXiv Detail & Related papers (2021-03-31T16:31:26Z)
- Learning Ultrasound Rendering from Cross-Sectional Model Slices for Simulated Training [13.640630434743837]
Computational simulations can facilitate the training of ultrasound skills in virtual reality.
We propose herein to bypass any rendering and simulation process at interactive time.
We use a generative adversarial framework with a dedicated generator architecture and input feeding scheme.
arXiv Detail & Related papers (2021-01-20T21:58:19Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
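The joint treatment can be sketched as one encoder with a decoder per intrinsic component, recomposing the image as albedo times shading; this classic intrinsic-image split is an assumption of the sketch, not necessarily the paper's exact design:
```python
import torch
import torch.nn as nn

class IntrinsicAutoencoder(nn.Module):
    """Shared encoder with per-component decoders and multiplicative recomposition."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        # One decoder per intrinsic component.
        self.albedo_dec = nn.Conv2d(ch, 3, 3, padding=1)   # appearance
        self.shading_dec = nn.Conv2d(ch, 1, 3, padding=1)  # shape/lighting

    def forward(self, x):
        z = self.encoder(x)
        albedo = torch.sigmoid(self.albedo_dec(z))
        shading = torch.sigmoid(self.shading_dec(z))
        recon = albedo * shading   # classic intrinsic image composition
        return recon, albedo, shading
```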
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
- Self-Supervised Linear Motion Deblurring [112.75317069916579]
Deep convolutional neural networks are state-of-the-art for image deblurring.
We present a differentiable reblur model for self-supervised motion deblurring.
Our experiments demonstrate that self-supervised single-image deblurring is feasible.
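The reblur idea can be sketched in a few lines: deblur the input, re-apply a differentiable linear motion blur, and require the result to match the original; the fixed horizontal kernel below is an illustrative assumption (the method itself estimates the motion):
```python
import torch
import torch.nn.functional as F

def linear_motion_kernel(length=9):
    """Horizontal linear motion blur kernel as a (1, 1, 1, L) tensor."""
    return torch.ones(1, 1, 1, length) / length

def reblur_loss(deblurred, blurry, kernel):
    """Self-supervision signal: blur(deblur(y)) should reproduce y."""
    pad = kernel.shape[-1] // 2
    c = deblurred.shape[1]
    k = kernel.expand(c, 1, 1, -1)   # depthwise kernel, one copy per channel
    reblurred = F.conv2d(
        F.pad(deblurred, (pad, pad, 0, 0), mode="replicate"), k, groups=c)
    return F.l1_loss(reblurred, blurry)
```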
arXiv Detail & Related papers (2020-02-10T20:15:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.