JoPano: Unified Panorama Generation via Joint Modeling
- URL: http://arxiv.org/abs/2512.06885v1
- Date: Sun, 07 Dec 2025 15:19:26 GMT
- Title: JoPano: Unified Panorama Generation via Joint Modeling
- Authors: Wancheng Feng, Chen An, Zhenliang He, Meina Kan, Shiguang Shan, Lukun Wang
- Abstract summary: We propose a joint-face panorama (JoPano) generation approach that unifies the two core tasks within a DiT-based model. We show that JoPano can generate high-quality panoramas for both text-to-panorama and view-to-panorama generation tasks.
- Score: 51.392082596383034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Panorama generation has recently attracted growing interest in the research community, with two core tasks, text-to-panorama and view-to-panorama generation. However, existing methods still face two major challenges: their U-Net-based architectures constrain the visual quality of the generated panoramas, and they usually treat the two core tasks independently, which leads to modeling redundancy and inefficiency. To overcome these challenges, we propose a joint-face panorama (JoPano) generation approach that unifies the two core tasks within a DiT-based model. To transfer the rich generative capabilities of existing DiT backbones learned from natural images to the panorama domain, we propose a Joint-Face Adapter built on the cubemap representation of panoramas, which enables a pretrained DiT to jointly model and generate different views of a panorama. We further apply Poisson Blending to reduce seam inconsistencies that often appear at the boundaries between cube faces. Correspondingly, we introduce Seam-SSIM and Seam-Sobel metrics to quantitatively evaluate the seam consistency. Moreover, we propose a condition switching mechanism that unifies text-to-panorama and view-to-panorama tasks within a single model. Comprehensive experiments show that JoPano can generate high-quality panoramas for both text-to-panorama and view-to-panorama generation tasks, achieving state-of-the-art performance on FID, CLIP-FID, IS, and CLIP-Score metrics.
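The abstract names Seam-SSIM and Seam-Sobel metrics for seam consistency between cube faces but does not define them here. As a rough illustration of the general idea only, the sketch below measures a Sobel-style gradient across the boundary of two adjacent cubemap faces. The function names `sobel_x` and `seam_gradient`, the `width` parameter, and the scoring rule are all hypothetical and are not the paper's definitions. It also assumes the two faces are grayscale arrays already oriented so that the right edge of one meets the left edge of the other.

```python
import numpy as np


def sobel_x(img: np.ndarray) -> np.ndarray:
    """Horizontal Sobel response of a 2-D grayscale array (valid region only)."""
    # Smooth vertically with [1, 2, 1], then difference horizontally with [-1, 0, 1].
    smooth = img[:-2, :] + 2.0 * img[1:-1, :] + img[2:, :]
    return smooth[:, 2:] - smooth[:, :-2]


def seam_gradient(left_face: np.ndarray, right_face: np.ndarray, width: int = 4) -> float:
    """Mean absolute horizontal Sobel response in the two columns straddling the seam.

    Hypothetical illustration, not the paper's Seam-Sobel metric.
    Requires width >= 2 and faces of equal height.
    """
    # Build a strip of 2 * width columns centred on the shared edge.
    strip = np.concatenate(
        [left_face[:, -width:], right_face[:, :width]], axis=1
    ).astype(np.float64)
    response = sobel_x(strip)
    # The seam lies between strip columns width-1 and width. Response column j
    # is centred on strip column j+1, so the relevant columns are width-2 and width-1.
    return float(np.abs(response[:, width - 2 : width]).mean())


# Identical flat faces have no seam; a black face next to a white face has a hard one.
flat = np.full((8, 8), 0.5)
print(seam_gradient(flat, flat))                          # 0.0
print(seam_gradient(np.zeros((8, 8)), np.ones((8, 8))))   # 4.0
```

A lower value indicates a smoother transition across the boundary. A blending step such as the Poisson Blending mentioned in the abstract would be expected to reduce a score of this kind, though the paper's own metrics may be computed differently.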
Related papers
- One Flight Over the Gap: A Survey from Perspective to Panoramic Vision [117.80970697177025]
This survey reviews recent panoramic vision techniques with a particular emphasis on the perspective-to-panorama adaptation. We first revisit the panoramic imaging pipeline and projection methods to build the prior knowledge required for analyzing the structural disparities. Building on this, we cover 20+ representative tasks drawn from more than 300 research papers in two dimensions.
arXiv Detail & Related papers (2025-09-04T17:59:10Z)
- What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? [16.01049610453117]
Previous work has demonstrated the feasibility of using conventional low-rank adaptation techniques to generate panoramic images. We introduce a simple framework called UniPano, with the objective of establishing an elegant baseline for future research.
arXiv Detail & Related papers (2025-05-28T08:54:04Z)
- Conditional Panoramic Image Generation via Masked Autoregressive Modeling [35.624070746282186]
We propose a unified framework, Panoramic AutoRegressive model (PAR), which leverages masked autoregressive modeling to address these challenges. To address the inherent discontinuity in existing generative models, we introduce circular padding to enhance spatial coherence. Experiments demonstrate competitive performance in text-to-image generation and panorama outpainting tasks.
arXiv Detail & Related papers (2025-05-22T16:20:12Z)
- Towards Enhanced Image Generation Via Multi-modal Chain of Thought in Unified Generative Models [52.84391764467939]
Unified generative models have shown remarkable performance in text and image generation. We introduce Chain of Thought (CoT) into unified generative models to address the challenges of complex image generation. Experiments show that FoX consistently outperforms existing unified models on various T2I benchmarks.
arXiv Detail & Related papers (2025-03-03T08:36:16Z)
- PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs [10.970010947605289]
Panoramic Image Generation (PIG) aims to create coherent images of arbitrary lengths. We propose PanoLlama, a novel framework that achieves endless and coherent panorama generation with the autoregressive paradigm.
arXiv Detail & Related papers (2024-11-24T15:06:57Z)
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- PanoSwin: a Pano-style Swin Transformer for Panorama Understanding [15.115868803355081]
Equirectangular projection (ERP) entails boundary discontinuity and spatial distortion.
We propose PanoSwin to learn panorama representations with ERP.
We conduct experiments against the state-of-the-art on various panoramic tasks.
arXiv Detail & Related papers (2023-08-28T17:30:14Z)
- HORIZON: High-Resolution Semantically Controlled Panorama Synthesis [105.55531244750019]
Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
Recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, but a direct application of these methods to panorama synthesis yields distorted content.
We unveil an innovative framework for generating high-resolution panoramas, adeptly addressing the issues of spherical distortion and edge discontinuity through sophisticated spherical modeling.
arXiv Detail & Related papers (2022-10-10T09:43:26Z)
- Cross-View Panorama Image Synthesis [68.35351563852335]
PanoGAN is a novel adversarial feedback GAN framework.
PanoGAN enables high-quality panorama image generation with more convincing details than state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-22T15:59:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.