Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
- URL: http://arxiv.org/abs/2212.03293v2
- Date: Sun, 7 May 2023 18:46:50 GMT
- Title: Diffusion-SDF: Text-to-Shape via Voxelized Diffusion
- Authors: Muheng Li, Yueqi Duan, Jie Zhou, Jiwen Lu
- Abstract summary: We propose a new generative 3D modeling framework called Diffusion-SDF for the challenging task of text-to-shape synthesis.
We show that Diffusion-SDF generates both higher quality and more diversified 3D shapes that conform well to given text descriptions.
- Score: 90.85011923436593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rising industrial attention to 3D virtual modeling technology,
generating novel 3D content based on specified conditions (e.g. text) has
become a hot issue. In this paper, we propose a new generative 3D modeling
framework called Diffusion-SDF for the challenging task of text-to-shape
synthesis. Previous approaches lack flexibility in both 3D data representation
and shape generation, thereby failing to generate highly diversified 3D shapes
conforming to the given text descriptions. To address this, we propose an SDF
autoencoder together with the Voxelized Diffusion model to learn and generate
representations for voxelized signed distance fields (SDFs) of 3D shapes.
Specifically, we design a novel UinU-Net architecture that implants a
local-focused inner network inside the standard U-Net architecture, which
enables better reconstruction of patch-independent SDF representations. We
extend our approach to further text-to-shape tasks including text-conditioned
shape completion and manipulation. Experimental results show that Diffusion-SDF
generates both higher quality and more diversified 3D shapes that conform well
to given text descriptions when compared to previous approaches. Code is
available at: https://github.com/ttlmh/Diffusion-SDF
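
To make the data representation in the abstract concrete, the following is a minimal sketch of a voxelized signed distance field and of splitting it into independent local patches. It is not the authors' code: the analytic sphere stands in for a real shape, the function names `sphere_sdf_grid` and `split_into_patches` are hypothetical, and the resolution of 32 and patch size of 8 are illustrative assumptions rather than settings taken from the paper.

```python
import numpy as np


def sphere_sdf_grid(resolution=32, radius=0.5):
    """Sample the signed distance to a sphere on a regular grid over [-1, 1]^3.

    Values are negative inside the surface and positive outside.
    The sphere is an assumed stand-in for a real 3D shape.
    """
    axis = np.linspace(-1.0, 1.0, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.sqrt(x**2 + y**2 + z**2) - radius


def split_into_patches(grid, patch=8):
    """Split a cubic (R, R, R) grid into non-overlapping (patch, patch, patch) blocks.

    Returns an array of shape (n**3, patch, patch, patch) with n = R // patch.
    """
    r = grid.shape[0]
    if grid.shape != (r, r, r) or r % patch != 0:
        raise ValueError("grid must be cubic and divisible by the patch size")
    n = r // patch
    # Separate each axis into (block index, offset within block) ...
    blocks = grid.reshape(n, patch, n, patch, n, patch)
    # ... then bring the three block indices to the front and flatten them.
    return blocks.transpose(0, 2, 4, 1, 3, 5).reshape(n**3, patch, patch, patch)


sdf = sphere_sdf_grid()
patches = split_into_patches(sdf)
print(sdf.shape, patches.shape)
```

Each patch here can be processed on its own, which is the property the abstract refers to as patch-independent; how the paper's autoencoder actually encodes such patches is not specified in this summary.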
Related papers
- GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation [75.39457097832113]
This paper introduces a novel 3D generation framework, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space.
Our framework employs a Variational Autoencoder with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information.
The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single/multi-view image inputs.
arXiv Detail & Related papers (2024-11-12T18:59:32Z)
- SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image [19.704369289729897]
We focus on recovering 3D object pose and shape from single images.
Recent work relies mostly on learning from finite datasets, so it struggles to generalize.
We tackle these limitations with a novel framework, called SDFit.
arXiv Detail & Related papers (2024-09-24T15:22:04Z)
- UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion [51.31220416754788]
We present UDiFF, a 3D diffusion model for unsigned distance fields (UDFs) that is capable of generating textured 3D shapes with open surfaces, either from text conditions or unconditionally.
Our key idea is to generate UDFs in spatial-frequency domain with an optimal wavelet transformation, which produces a compact representation space for UDF generation.
arXiv Detail & Related papers (2024-04-10T09:24:54Z)
- Mosaic-SDF for 3D Generative Models [41.4585856558786]
When training a diffusion or flow model on 3D shapes, a crucial design choice is the shape representation.
We introduce Mosaic-SDF, a simple 3D shape representation that approximates the Signed Distance Function (SDF) of a given shape.
We demonstrate the efficacy of the M-SDF representation by using it to train a 3D generative flow model.
arXiv Detail & Related papers (2023-12-14T18:52:52Z)
- 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models [102.75875255071246]
3D content creation via text-driven stylization has posed a fundamental challenge to the multimedia and graphics community.
We propose a new 3DStyle-Diffusion model that triggers fine-grained stylization of 3D meshes with additional controllable appearance and geometric guidance from 2D Diffusion models.
arXiv Detail & Related papers (2023-11-09T15:51:27Z)
- EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape Generation [124.27302003578903]
This paper presents a new text-guided technique for generating 3D shapes.
We leverage a hybrid 3D representation, namely EXIM, combining the strengths of explicit and implicit representations.
We demonstrate the applicability of our approach to generate indoor scenes with consistent styles using text-induced 3D shapes.
arXiv Detail & Related papers (2023-11-03T05:01:51Z)
- Locally Attentional SDF Diffusion for Controllable 3D Shape Generation [24.83724829092307]
We propose a diffusion-based 3D generation framework that models plausible 3D shapes from 2D sketch image input.
Our method is built on a two-stage diffusion model. The first stage, named occupancy-diffusion, aims to generate a low-resolution occupancy field to approximate the shape shell.
The second stage, named SDF-diffusion, synthesizes a high-resolution signed distance field within the occupied voxels determined by the first stage to extract fine geometry.
arXiv Detail & Related papers (2023-05-08T05:07:23Z)
- Towards Implicit Text-Guided 3D Shape Generation [81.22491096132507]
This work explores the challenging task of generating 3D shapes from text.
We propose a new approach for text-guided 3D shape generation, capable of producing high-fidelity shapes with colors that match the given text description.
arXiv Detail & Related papers (2022-03-28T10:20:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.