Related papers: InsTex: Indoor Scenes Stylized Texture Synthesis

InsTex: Indoor Scenes Stylized Texture Synthesis

URL: http://arxiv.org/abs/2501.13969v1
Date: Wed, 22 Jan 2025 08:37:59 GMT
Title: InsTex: Indoor Scenes Stylized Texture Synthesis
Authors: Yunfan Zhang, Zhiwei Xiong, Zhiqi Shen, Guosheng Lin, Hao Wang, Nicolas Vun,
Abstract summary: High-quality textures are crucial for 3D scenes for augmented/virtual reality (ARVR) applications.<n>Current methods suffer from lengthy processing times and visual artifacts.<n>We introduce two-stage architecture designed to generate high-quality textures for 3D scenes.
Score: 81.12010726769768
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Generating high-quality textures for 3D scenes is crucial for applications in interior design, gaming, and augmented/virtual reality (AR/VR). Although recent advancements in 3D generative models have enhanced content creation, significant challenges remain in achieving broad generalization and maintaining style consistency across multiple viewpoints. Current methods, such as 2D diffusion models adapted for 3D texturing, suffer from lengthy processing times and visual artifacts, while approaches driven by 3D data often fail to generalize effectively. To overcome these challenges, we introduce InsTex, a two-stage architecture designed to generate high-quality, style-consistent textures for 3D indoor scenes. InsTex utilizes depth-to-image diffusion priors in a coarse-to-fine pipeline, first generating multi-view images with a pre-trained 2D diffusion model and subsequently refining the textures for consistency. Our method supports both textual and visual prompts, achieving state-of-the-art results in visual quality and quantitative metrics, and demonstrates its effectiveness across various 3D texturing applications.

Related papers

RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis [10.350576861948952]
RomanTex is a multiview-based texture generation framework that integrates a multi-attention network with an underlying 3D representation. Our method achieves state-of-the-art results in texture quality and consistency.
arXiv Detail & Related papers (2025-03-24T17:56:11Z)
HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions [31.342899807980654]
3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry. We introduce HoloDreamer, a framework that first generates high-definition panorama as a holistic initialization of the full 3D scene. We then leverage 3D Gaussian Splatting (3D-GS) to quickly reconstruct the 3D scene, thereby facilitating the creation of view-consistent and fully enclosed 3D scenes.
arXiv Detail & Related papers (2024-07-21T14:52:51Z)
EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion [5.158983929861116]
We present EucliDreamer, a simple and effective method to generate textures for 3D models given text and prompts. The texture is parametized as an implicit function on the 3D surface, which is optimized with the Score Distillation Sampling (SDS) process and differentiable rendering.
arXiv Detail & Related papers (2024-04-16T04:44:16Z)
3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new contents with higher quality by exploiting the natural image prior to 2D diffusion model and the global 3D information of the current scene. Our approach supports wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data. We design an autoregressive generation that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z)
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models [77.85129451435704]
We present a new method to synthesize textures for 3D, using large-scale-guided image diffusion models. Specifically, we leverage latent diffusion models, apply the set denoising model and aggregate denoising text map.
arXiv Detail & Related papers (2023-10-20T19:15:29Z)
Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models. Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [36.40582157854088]
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image. We leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation. Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects and enables various applications such as text-to-3D creation and texture editing.
arXiv Detail & Related papers (2023-03-24T17:54:22Z)
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures. GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.