Related papers: Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

URL: http://arxiv.org/abs/2404.19265v2
Date: Wed, 1 May 2024 00:51:48 GMT
Title: Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation
Authors: Zhenglin Li, Bo Guan, Yuanzhou Wei, Yiming Zhou, Jingyu Zhang, Jinxin Xu,
Abstract summary: This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images.
Score: 4.767259403145913
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.

Related papers

GMAIL: Generative Modality Alignment for generated Image Learning [51.071351994330605]
We propose a novel framework for discriminative use of generated images, coined GMAIL, that explicitly treats generated images as a separate modality from real images. Our framework can be easily incorporated with various vision-language models, and we demonstrate its efficacy throughout extensive experiments.
arXiv Detail & Related papers (2026-02-17T05:40:25Z)
Computer vision training dataset generation for robotic environments using Gaussian splatting [0.0]
This paper introduces a novel pipeline for generating large-scale, highly realistic, and automatically labeled datasets for computer vision tasks in robotic environments. We leverage 3D Gaussian Splatting (3DGS) to create photorealistic representations of the operational environment and objects. A novel, two-pass rendering technique combines the realism of splats with a shadow map generated from proxy meshes. Pixel-perfect segmentation masks are generated automatically and formatted for direct use with object detection models like YOLO.
arXiv Detail & Related papers (2025-12-15T15:00:17Z)
FashionR2R: Texture-preserving Rendered-to-Real Image Translation with Diffusion Models [14.596090302381647]
This paper studies photorealism enhancement of rendered images, leveraging generative power from diffusion models on the controlled basis of rendering. We introduce a novel framework to translate rendered images into their realistic counterparts, which consists of two stages: Domain Knowledge Injection (DKI) and Realistic Image Generation (RIG).
arXiv Detail & Related papers (2024-10-18T12:48:22Z)
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions [66.92809850624118]
PixWizard is an image-to-image visual assistant designed for image generation, manipulation, and translation based on free-form language instructions. We unify a variety of vision tasks into a single image-text-to-image generation framework and curate an Omni Pixel-to-Pixel Instruction-Tuning dataset. Our experiments demonstrate that PixWizard not only shows impressive generative and understanding abilities for images with diverse resolutions but also exhibits promising generalization capabilities with unseen tasks and human instructions.
arXiv Detail & Related papers (2024-09-23T17:59:46Z)
MaRINeR: Enhancing Novel Views by Matching Rendered Images with Nearby References [49.71130133080821]
MaRINeR is a refinement method that leverages information from a nearby mapping image to improve the rendering of a target viewpoint. We show improved renderings in quantitative metrics and qualitative examples from both explicit and implicit scene representations.
arXiv Detail & Related papers (2024-07-18T17:50:03Z)
Synthesizing Traffic Datasets using Graph Neural Networks [2.444217495283211]
This paper introduces a novel methodology for bridging the 'sim-real' gap by creating photorealistic images from 2D traffic simulations and recorded junction footage. We propose a novel image generation approach, integrating a Conditional Generative Adversarial Network with a Graph Neural Network (GNN) to facilitate the creation of realistic urban traffic images.
arXiv Detail & Related papers (2023-12-08T13:24:19Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Image Inpainting Using Wasserstein Generative Adversarial Imputation Network [0.0]
This paper introduces an image inpainting model based on Wasserstein Generative Adversarial Imputation Network. A universal imputation model is able to handle various scenarios of missingness with sufficient quality.
arXiv Detail & Related papers (2021-06-23T05:55:07Z)
Using GANs to Augment Data for Cloud Image Segmentation Task [2.294014185517203]
We show the effectiveness of using Generative Adversarial Networks (GANs) to generate data to augment the training set. We also present a way to estimate ground-truth binary maps for the GAN-generated images to facilitate their effective use as augmented images.
arXiv Detail & Related papers (2021-06-06T09:01:43Z)
cGANs for Cartoon to Real-life Images [0.4724825031148411]
The project aims to evaluate the robustness of the Pix2Pix model by applying it to datasets consisting of cartoonized images. It should be possible to train the network to generate real-life images from the cartoonized images.
arXiv Detail & Related papers (2021-01-24T20:26:31Z)
Region-adaptive Texture Enhancement for Detailed Person Image Synthesis [86.69934638569815]
RATE-Net is a novel framework for synthesizing person images with sharp texture details. The proposed framework leverages an additional texture-enhancing module to extract appearance information from the source image. Experiments conducted on the DeepFashion benchmark dataset have demonstrated the superiority of our framework compared with existing networks.
arXiv Detail & Related papers (2020-05-26T02:33:21Z)
Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF. We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials. Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
Unlimited Resolution Image Generation with R2D2-GANs [69.90258455164513]
We present a novel simulation technique for generating high-quality images of any predefined resolution. This method can be used to synthesize sonar scans of size equivalent to those collected during a full-length mission. The data produced is continuous and realistic-looking, and can be generated at least two times faster than the real speed of acquisition.
arXiv Detail & Related papers (2020-03-02T17:49:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.