A Survey on Text-Driven 360-Degree Panorama Generation
- URL: http://arxiv.org/abs/2502.14799v1
- Date: Thu, 20 Feb 2025 18:19:57 GMT
- Title: A Survey on Text-Driven 360-Degree Panorama Generation
- Authors: Hai Wang, Xiaoyu Xiang, Weihao Xia, Jing-Hao Xue
- Abstract summary: Text-driven 360-degree panorama generation is a transformative advancement in immersive visual content creation.
Recent progress in text-to-image diffusion models has accelerated the rapid development in this emerging field.
This survey offers an in-depth analysis of state-of-the-art algorithms and their expanding applications in 360-degree 3D scene generation.
- Abstract: The advent of text-driven 360-degree panorama generation, enabling the synthesis of 360-degree panoramic images directly from textual descriptions, marks a transformative advancement in immersive visual content creation. This innovation significantly simplifies the traditionally complex process of producing such content. Recent progress in text-to-image diffusion models has accelerated the rapid development in this emerging field. This survey presents a comprehensive review of text-driven 360-degree panorama generation, offering an in-depth analysis of state-of-the-art algorithms and their expanding applications in 360-degree 3D scene generation. Furthermore, we critically examine current limitations and propose promising directions for future research. A curated project page with relevant resources and research papers is available at https://littlewhitesea.github.io/Text-Driven-Pano-Gen/.
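The 360-degree panoramas discussed in this survey are typically stored as equirectangular images, where pixel columns correspond to longitude and rows to latitude. As an illustrative sketch (the coordinate conventions below are assumptions for demonstration, not taken from the survey), the mapping from an equirectangular pixel to a unit viewing direction can be written as:

```python
import math

def equirect_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction vector.

    Assumed convention: u runs left-to-right over longitude [-pi, pi),
    v runs top-to-bottom over latitude [pi/2, -pi/2].
    """
    lon = (u / width - 0.5) * 2.0 * math.pi
    lat = (0.5 - v / height) * math.pi
    # Spherical-to-Cartesian: +z is the forward (lon=0, lat=0) direction.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

For example, the center pixel of a 1024x512 panorama maps to the forward direction (0, 0, 1) under this convention. This spherical geometry is what distinguishes panorama generation from ordinary text-to-image synthesis: the left and right image borders must match seamlessly, and content near the poles is heavily stretched.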
Related papers
- TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation [67.45160043297193]
We introduce TextAtlas5M, a novel dataset designed to evaluate long-text rendering in text-conditioned image generation.
Our dataset consists of 5 million long-text generated and collected images across diverse data types.
We further curate 3000 human-improved test set TextAtlasEval across 3 data domains, establishing one of the most extensive benchmarks for text-conditioned generation.
arXiv Detail & Related papers (2025-02-11T18:59:19Z)
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting [53.32467009064287]
We propose a text-driven 3D-consistent scene generation model: SceneDreamer360.
Our proposed method leverages a text-driven panoramic image generation model as a prior for 3D scene generation.
Our experiments demonstrate that SceneDreamer360 with its panoramic image generation and 3DGS can produce higher quality, spatially consistent, and visually appealing 3D scenes from any text prompt.
arXiv Detail & Related papers (2024-08-25T02:56:26Z)
- Visual Text Generation in the Wild [67.37458807253064]
We propose a visual text generator (termed SceneVTG) which can produce high-quality text images in the wild.
The proposed SceneVTG significantly outperforms traditional rendering-based methods and recent diffusion-based methods in terms of fidelity and reasonability.
The generated images provide superior utility for tasks involving text detection and text recognition.
arXiv Detail & Related papers (2024-07-19T09:08:20Z)
- OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting [9.870063736691556]
We tackle the recently popular topic of generating 360-degree images given the conventional narrow field of view (NFoV) images.
This task aims to predict the reasonable and consistent surroundings from the NFoV images.
We propose a novel text-guided out-painting framework equipped with a State-Space Model called Mamba.
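The NFoV-to-panorama out-painting task above hinges on knowing which equirectangular pixels the narrow-view input already covers. A minimal sketch of that geometry (conventions and the forward-facing camera are assumptions for illustration, not details from the OPa-Ma paper):

```python
import math

def nfov_pixel_to_equirect(i, j, fov_deg, nfov_size, pano_w, pano_h):
    """Map a pixel (i=column, j=row) of a square, forward-facing pinhole
    NFoV image to equirectangular coordinates (u, v).

    Illustrative only: camera looks along +z, with the assumed
    equirectangular convention of longitude [-pi, pi) left-to-right.
    """
    # Focal length in pixels from the horizontal field of view.
    f = (nfov_size / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    # Ray through the pixel, then normalize to a unit direction.
    x = i - nfov_size / 2.0
    y = nfov_size / 2.0 - j
    z = f
    n = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / n, y / n, z / n
    lon = math.atan2(x, z)   # in [-pi, pi]
    lat = math.asin(y)       # in [-pi/2, pi/2]
    u = (lon / (2.0 * math.pi) + 0.5) * pano_w
    v = (0.5 - lat / math.pi) * pano_h
    return u, v
```

Rasterizing this mapping over all NFoV pixels yields the "known" mask on the panorama; the generator's job is then to synthesize the remaining, unconstrained region consistently with the visible content.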
arXiv Detail & Related papers (2024-07-15T17:23:00Z)
- Taming Stable Diffusion for Text to 360° Panorama Image Generation [74.69314801406763]
We introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt.
We propose a unique cross-attention mechanism with projection awareness to minimize distortion during the collaborative denoising process.
arXiv Detail & Related papers (2024-04-11T17:46:14Z)
- Autoregressive Omni-Aware Outpainting for Open-Vocabulary 360-Degree Image Generation [36.45222068699805]
AOG-Net is proposed for 360-degree image generation by progressively out-painting an incomplete image with NFoV and text guidance, jointly or individually.
A global-local conditioning mechanism is devised to formulate the outpainting guidance in each autoregressive step.
Comprehensive experiments on two commonly used 360-degree image datasets for both indoor and outdoor settings demonstrate the state-of-the-art performance of our proposed method.
arXiv Detail & Related papers (2023-09-07T03:22:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.