Dream-in-Style: Text-to-3D Generation using Stylized Score Distillation
- URL: http://arxiv.org/abs/2406.18581v1
- Date: Wed, 5 Jun 2024 16:27:34 GMT
- Title: Dream-in-Style: Text-to-3D Generation using Stylized Score Distillation
- Authors: Hubert Kompanowski, Binh-Son Hua,
- Abstract summary: We present a method to generate 3D objects in styles.
Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model.
- Score: 14.079043195485601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method to generate 3D objects in styles. Our method takes a text prompt and a style reference image as input and reconstructs a neural radiance field to synthesize a 3D model with the content aligning with the text prompt and the style following the reference image. To simultaneously generate the 3D object and perform style transfer in one go, we propose a stylized score distillation loss to guide a text-to-3D optimization process to output visually plausible geometry and appearance. Our stylized score distillation is based on a combination of an original pretrained text-to-image model and its modified sibling with the key and value features of self-attention layers manipulated to inject styles from the reference image. Comparisons with state-of-the-art methods demonstrated the strong visual performance of our method, further supported by the quantitative results from our user study.
Related papers
- StyleTex: Style Image-Guided Texture Generation for 3D Models [8.764938886974482]
Style-guided texture generation aims to generate a texture that is harmonious with both the style of the reference image and the geometry of the input mesh.
We introduce StyleTex, an innovative diffusion-model-based framework for creating stylized textures for 3D models.
arXiv Detail & Related papers (2024-11-01T06:57:04Z) - ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z) - 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with
2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization has played a fundamental challenge to multimedia and graphics community.
We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z) - Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z) - ATT3D: Amortized Text-to-3D Object Synthesis [78.96673650638365]
We amortize optimization over text prompts by training on many prompts simultaneously with a unified model, instead of separately.
Our framework - Amortized text-to-3D (ATT3D) - enables knowledge-sharing between prompts to generalize to unseen setups and smooths between text for novel assets and simple animations.
arXiv Detail & Related papers (2023-06-06T17:59:10Z) - Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [21.622420436349245]
We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.
We leverage pre-trained 2D text-to-image models to synthesize a sequence of images from different poses.
In order to lift these outputs into a consistent 3D scene representation, we combine monocular depth estimation with a text-conditioned inpainting model.
arXiv Detail & Related papers (2023-03-21T16:21:02Z) - Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and
Text-to-Image Diffusion Models [44.34479731617561]
We introduce explicit 3D shape priors into the CLIP-guided 3D optimization process.
We present a simple yet effective approach that directly bridges the text and image modalities with a powerful text-to-image diffusion model.
Our method, Dream3D, is capable of generating imaginative 3D content with superior visual quality and shape accuracy.
arXiv Detail & Related papers (2022-12-28T18:23:47Z) - Learning to Stylize Novel Views [82.24095446809946]
We tackle a 3D scene stylization problem - generating stylized images of a scene from arbitrary novel views.
We propose a point cloud-based method for consistent 3D scene stylization.
arXiv Detail & Related papers (2021-05-27T23:58:18Z) - 3DSNet: Unsupervised Shape-to-Shape 3D Style Transfer [66.48720190245616]
We propose a learning-based approach for style transfer between 3D objects.
The proposed method can synthesize new 3D shapes both in the form of point clouds and meshes.
We extend our technique to implicitly learn the multimodal style distribution of the chosen domains.
arXiv Detail & Related papers (2020-11-26T16:59:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.