Panoramic Image-to-Image Translation
- URL: http://arxiv.org/abs/2304.04960v1
- Date: Tue, 11 Apr 2023 04:08:58 GMT
- Title: Panoramic Image-to-Image Translation
- Authors: Soohyun Kim, Junho Kim, Taekyung Kim, Hwan Heo, Seungryong Kim,
Jiyoung Lee, Jin-Hwa Kim
- Abstract summary: We tackle the challenging task of Panoramic Image-to-Image translation (Pano-I2I) for the first time.
This task is difficult due to the geometric distortion of panoramic images and the lack of a panoramic image dataset with diverse conditions, like weather or time.
We propose a panoramic distortion-aware I2I model that preserves the structure of the panoramic images while consistently translating their global style referenced from a pinhole image.
- Score: 37.9486466936501
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we tackle the challenging task of Panoramic Image-to-Image
translation (Pano-I2I) for the first time. This task is difficult due to the
geometric distortion of panoramic images and the lack of a panoramic image
dataset with diverse conditions, like weather or time. To address these
challenges, we propose a panoramic distortion-aware I2I model that preserves
the structure of the panoramic images while consistently translating their
global style referenced from a pinhole image. To mitigate the distortion issue
in naive 360 panorama translation, we adopt spherical positional embedding to
our transformer encoders, introduce a distortion-free discriminator, and apply
sphere-based rotation for augmentation and its ensemble. We also design a
content encoder and a style encoder to be deformation-aware to deal with a
large domain gap between panoramas and pinhole images, enabling us to work on
diverse conditions of pinhole images. In addition, considering the large
discrepancy between panoramas and pinhole images, our framework decouples the
learning procedure of the panoramic reconstruction stage from the translation
stage. We show distinct improvements over existing I2I models in translating
the StreetLearn dataset in the daytime into diverse conditions. The code will
be publicly available online for our community.
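Two of the distortion-handling ideas mentioned in the abstract are easy to illustrate for equirectangular (ERP) panoramas: a spherical positional embedding built from per-pixel latitude/longitude angles, and sphere-based yaw rotation, which for ERP images reduces to a lossless circular shift along the width axis. The sketch below is an illustrative assumption about how such components could look, not the authors' released implementation; the function names, channel layout, and shapes are hypothetical.
```python
# Minimal sketch (not the paper's code) of two ERP-panorama utilities:
# a spherical positional embedding and yaw-rotation augmentation.
import math
import torch

def spherical_positional_embedding(height: int, width: int) -> torch.Tensor:
    """Return a (4, H, W) grid of [sin(lat), cos(lat), sin(lon), cos(lon)].

    Latitude spans [-pi/2, pi/2] over rows and longitude spans [-pi, pi) over
    columns, matching the standard equirectangular parameterization.
    """
    lat = torch.linspace(-math.pi / 2, math.pi / 2, height)       # (H,)
    lon = torch.linspace(-math.pi, math.pi, width + 1)[:-1]       # (W,), drop the duplicate seam
    lat_grid, lon_grid = torch.meshgrid(lat, lon, indexing="ij")  # (H, W) each
    return torch.stack(
        [lat_grid.sin(), lat_grid.cos(), lon_grid.sin(), lon_grid.cos()], dim=0
    )

def yaw_rotate_erp(panorama: torch.Tensor, degrees: float) -> torch.Tensor:
    """Rotate an ERP panorama (C, H, W) about the vertical axis.

    Longitude maps linearly to image columns, so a yaw rotation is an exact
    circular shift of the columns and the augmentation loses no pixels.
    """
    width = panorama.shape[-1]
    shift = int(round(degrees / 360.0 * width))
    return torch.roll(panorama, shifts=shift, dims=-1)

if __name__ == "__main__":
    pano = torch.rand(3, 256, 512)                 # toy ERP image
    pos = spherical_positional_embedding(64, 128)  # embedding for a feature map
    rotated = yaw_rotate_erp(pano, degrees=90.0)
    print(pos.shape, rotated.shape)                # (4, 64, 128) and (3, 256, 512)
```
Such an embedding can be concatenated to (or added onto) transformer encoder features so that attention is aware of where each token sits on the sphere, and the rotation can be applied both as training-time augmentation and for test-time ensembling.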
Related papers
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- Multi-source Domain Adaptation for Panoramic Semantic Segmentation [22.367890439050786]
We propose a new task of multi-source domain adaptation for panoramic semantic segmentation.
We aim to utilize both real pinhole and synthetic panoramic images in the source domains, enabling the segmentation model to perform well on unlabeled real panoramic images.
DTA4PASS converts all pinhole images in the source domains into panoramic-like images, and then aligns the converted source domains with the target domain.
arXiv Detail & Related papers (2024-08-29T12:00:11Z)
- Taming Stable Diffusion for Text to 360° Panorama Image Generation [74.69314801406763]
We introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt.
We propose a unique cross-attention mechanism with projection awareness to minimize distortion during the collaborative denoising process.
arXiv Detail & Related papers (2024-04-11T17:46:14Z)
- PanoSwin: a Pano-style Swin Transformer for Panorama Understanding [15.115868803355081]
Equirectangular projection (ERP) entails boundary discontinuity and spatial distortion.
We propose PanoSwin to learn panorama representations directly from ERP inputs.
We conduct experiments against the state-of-the-art on various panoramic tasks.
arXiv Detail & Related papers (2023-08-28T17:30:14Z)
- PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas [54.4948540627471]
We propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas.
Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss from panorama-to-perspective conversion.
Results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods.
arXiv Detail & Related papers (2023-06-02T13:35:07Z)
- Local-to-Global Panorama Inpainting for Locale-Aware Indoor Lighting Prediction [28.180205012351802]
Predicting panoramic indoor lighting from a single perspective image is a fundamental but highly ill-posed problem in computer vision and graphics.
Recent methods mostly rely on convolutional neural networks (CNNs) to fill the missing contents in the warped panorama.
We propose a local-to-global strategy for large-scale panorama inpainting.
arXiv Detail & Related papers (2023-03-18T06:18:49Z)
- PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image [11.053777620735175]
PanoViT is a panorama vision transformer that estimates the room layout from a single panoramic image.
Compared to CNN models, our PanoViT is more proficient in learning global information from the panoramic image.
Our method outperforms state-of-the-art solutions in room layout prediction accuracy.
arXiv Detail & Related papers (2022-12-23T05:37:11Z)
- Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation [73.48323921632506]
We address panoramic semantic segmentation which is under-explored due to two critical challenges.
First, we propose an upgraded Transformer for Panoramic Semantic Segmentation, i.e., Trans4PASS+, equipped with Deformable Patch Embedding (DPE) and Deformable MLP (DMLPv2) modules.
Second, we enhance the Mutual Prototypical Adaptation (MPA) strategy via pseudo-label rectification for unsupervised domain adaptive panoramic segmentation.
Third, aside from Pinhole-to-Panoramic (Pin2Pan) adaptation, we create a new dataset (SynPASS) with 9,080 panoramic images.
arXiv Detail & Related papers (2022-07-25T00:42:38Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
PanoGAN is a novel adversarial feedback GAN framework for cross-view panorama image synthesis.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)