Text-guided High-definition Consistency Texture Model
- URL: http://arxiv.org/abs/2305.05901v1
- Date: Wed, 10 May 2023 05:09:05 GMT
- Title: Text-guided High-definition Consistency Texture Model
- Authors: Zhibin Tang, Tiantong He
- Abstract summary: We present the High-definition Consistency Texture Model (HCTM), a novel method that can generate high-definition textures for 3D meshes according to the text prompts.
We achieve this by leveraging a pre-trained depth-to-image diffusion model to generate single viewpoint results based on the text prompt and a depth map.
Our proposed approach has demonstrated promising results in generating high-definition and consistent textures for 3D meshes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the advent of depth-to-image diffusion models, text-guided generation,
editing, and transfer of realistic textures are no longer difficult. However,
due to the limitations of pre-trained diffusion models, they can only create
low-resolution, inconsistent textures. To address this issue, we present the
High-definition Consistency Texture Model (HCTM), a novel method that can
generate high-definition and consistent textures for 3D meshes according to the
text prompts. We achieve this by leveraging a pre-trained depth-to-image
diffusion model to generate single viewpoint results based on the text prompt
and a depth map. We fine-tune the diffusion model with Parameter-Efficient
Fine-Tuning to quickly learn the style of the generated result, and apply the
multi-diffusion strategy to produce high-resolution and consistent results from
different viewpoints. Furthermore, we propose a strategy that prevents the
appearance of noise on the textures caused by backpropagation. Our proposed
approach has demonstrated promising results in generating high-definition and
consistent textures for 3D meshes, as demonstrated through a series of
experiments.
Related papers
- TEXGen: a Generative Diffusion Model for Mesh Textures [63.43159148394021]
We focus on the fundamental problem of learning in the UV texture space itself.
We propose a scalable network architecture that interleaves convolutions on UV maps with attention layers on point clouds.
We train a 700 million parameter diffusion model that can generate UV texture maps guided by text prompts and single-view images.
arXiv Detail & Related papers (2024-11-22T05:22:11Z) - RoCoTex: A Robust Method for Consistent Texture Synthesis with Diffusion Models [3.714901836138171]
We propose a robust text-to-texture method for generating consistent and seamless textures that are well aligned with the mesh.
Our method leverages state-of-the-art 2D diffusion models, including SDXL and multiple ControlNets, to capture structural features and intricate details in the generated textures.
arXiv Detail & Related papers (2024-09-30T06:29:50Z) - GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation [35.04723374116026]
Large-scale text-to-image (T2I) models have shown astonishing results in text-to-image (T2I) generation.
Applying these models to synthesize textures for 3D geometries remains challenging due to the domain gap between 2D images and textures on a 3D surface.
We propose a novel text-to-texture synthesis framework that leverages pretrained diffusion models.
arXiv Detail & Related papers (2024-09-27T02:32:42Z) - TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling [37.67373829836975]
We present TexGen, a novel multi-view sampling and resampling framework for texture generation.
Our proposed method produces significantly better texture quality for diverse 3D objects with a high degree of view consistency.
Our proposed texture generation technique can also be applied to texture editing while preserving the original identity.
arXiv Detail & Related papers (2024-08-02T14:24:40Z) - Infinite Texture: Text-guided High Resolution Diffusion Texture Synthesis [61.189479577198846]
We present Infinite Texture, a method for generating arbitrarily large texture images from a text prompt.
Our approach fine-tunes a diffusion model on a single texture, and learns to embed that statistical distribution in the output domain of the model.
At generation time, our fine-tuned diffusion model is used through a score aggregation strategy to generate output texture images of arbitrary resolution on a single GPU.
arXiv Detail & Related papers (2024-05-13T21:53:09Z) - Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model [65.58911408026748]
We propose Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts.
We first advocate leveraging text-guided 4-view images as the bottleneck in the text-to-3D pipeline.
We then introduce an attention refocusing mechanism to encourage text-aligned 4-view image generation.
arXiv Detail & Related papers (2024-04-28T04:05:10Z) - EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion [5.158983929861116]
We present EucliDreamer, a simple and effective method to generate textures for 3D models given text and prompts.
The texture is parametized as an implicit function on the 3D surface, which is optimized with the Score Distillation Sampling (SDS) process and differentiable rendering.
arXiv Detail & Related papers (2024-04-16T04:44:16Z) - TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion
Models [77.85129451435704]
We present a new method to synthesize textures for 3D, using large-scale-guided image diffusion models.
Specifically, we leverage latent diffusion models, apply the set denoising model and aggregate denoising text map.
arXiv Detail & Related papers (2023-10-20T19:15:29Z) - PaintHuman: Towards High-fidelity Text-to-3D Human Texturing via
Denoised Score Distillation [89.09455618184239]
Recent advances in text-to-3D human generation have been groundbreaking.
We propose a model called PaintHuman to address the challenges from two aspects.
We use the depth map as a guidance to ensure realistic semantically aligned textures.
arXiv Detail & Related papers (2023-10-14T00:37:16Z) - IT3D: Improved Text-to-3D Generation with Explicit View Synthesis [71.68595192524843]
This study presents a novel strategy that leverages explicitly synthesized multi-view images to address these issues.
Our approach involves the utilization of image-to-image pipelines, empowered by LDMs, to generate posed high-quality images.
For the incorporated discriminator, the synthesized multi-view images are considered real data, while the renderings of the optimized 3D models function as fake data.
arXiv Detail & Related papers (2023-08-22T14:39:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.