Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning
- URL: http://arxiv.org/abs/2405.08054v1
- Date: Mon, 13 May 2024 17:56:13 GMT
- Authors: Wenqi Dong, Bangbang Yang, Lin Ma, Xiao Liu, Liyuan Cui, Hujun Bao, Yuewen Ma, Zhaopeng Cui
- Abstract summary: Coin3D allows users to control the 3D generation using a coarse geometry proxy assembled from basic shapes.
Our method achieves superior controllability and flexibility in the 3D asset generation task.
- Score: 52.81032340916171
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As humans, we aspire to create media content that is both freely willed and readily controlled. Thanks to the prominent development of generative techniques, we can now easily use 2D diffusion methods to synthesize images controlled by a raw sketch or designated human poses, and even progressively edit or regenerate local regions with masked inpainting. However, similar workflows in 3D modeling tasks remain unavailable due to the lack of controllability and efficiency in 3D generation. In this paper, we present Coin3D, a novel controllable and interactive 3D asset modeling framework. Coin3D allows users to control 3D generation using a coarse geometry proxy assembled from basic shapes, and introduces an interactive generation workflow that supports seamless local part editing while delivering responsive 3D object previews within a few seconds. To this end, we develop several techniques, including a 3D adapter that applies volumetric coarse-shape control to the diffusion model, a proxy-bounded editing strategy for precise part editing, a progressive volume cache to support responsive previews, and volume-SDS to ensure consistent mesh reconstruction. Extensive experiments on interactive generation and editing with diverse shape proxies demonstrate that our method achieves superior controllability and flexibility in the 3D asset generation task.
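The abstract's core idea, a coarse proxy assembled from basic shapes that volumetrically conditions generation, can be illustrated with a small sketch. This is a hypothetical toy (not the paper's actual API): it only rasterizes a list of primitives into a binary occupancy volume of the kind a 3D adapter could consume.

```python
# Hypothetical sketch of the proxy idea from the abstract: voxelize a
# coarse proxy built from basic shapes into an occupancy volume. The
# function name, primitive dict format, and resolution are illustrative
# assumptions, not Coin3D's real interface.
import numpy as np

def voxelize_proxy(primitives, resolution=32):
    """Rasterize basic shapes into a binary occupancy volume in [-1, 1]^3."""
    axis = np.linspace(-1.0, 1.0, resolution)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    occ = np.zeros((resolution,) * 3, dtype=bool)
    for prim in primitives:
        c = np.asarray(prim["center"])
        if prim["kind"] == "sphere":
            occ |= (x - c[0]) ** 2 + (y - c[1]) ** 2 + (z - c[2]) ** 2 <= prim["radius"] ** 2
        elif prim["kind"] == "box":
            h = np.asarray(prim["half_size"])
            occ |= (
                (np.abs(x - c[0]) <= h[0])
                & (np.abs(y - c[1]) <= h[1])
                & (np.abs(z - c[2]) <= h[2])
            )
    return occ.astype(np.float32)

# Assemble a toy "snowman" proxy from two stacked spheres.
proxy = voxelize_proxy([
    {"kind": "sphere", "center": (0.0, 0.0, -0.4), "radius": 0.5},
    {"kind": "sphere", "center": (0.0, 0.0, 0.4), "radius": 0.3},
])
print(proxy.shape)
```

In the paper's described workflow, such a volume would be fed through the 3D adapter to steer the diffusion model, and edited regions would stay bounded by the proxy parts; this sketch covers only the voxelization step.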
Related papers
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- Interactive3D: Create What You Want by Interactive 3D Generation [13.003964182554572]
We introduce Interactive3D, an innovative framework for interactive 3D generation that grants users precise control over the generative process.
Our experiments demonstrate that Interactive3D markedly improves the controllability and quality of 3D generation.
arXiv Detail & Related papers (2024-04-25T11:06:57Z)
- Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting [9.383423119196408]
We introduce Multi-view ControlNet (MVControl), a novel neural network architecture designed to enhance existing multi-view diffusion models.
MVControl is able to offer 3D diffusion guidance for optimization-based 3D generation.
In pursuit of efficiency, we adopt 3D Gaussians as our representation instead of the commonly used implicit representations.
arXiv Detail & Related papers (2024-03-15T02:57:20Z)
- SERF: Fine-Grained Interactive 3D Segmentation and Editing with Radiance Fields [97.63648347686456]
We introduce a novel fine-grained interactive 3D segmentation and editing algorithm with radiance fields, which we refer to as SERF.
Our method entails creating a neural mesh representation by integrating multi-view algorithms with pre-trained 2D models.
Building upon this representation, we introduce a novel surface rendering technique that preserves local information and is robust to deformation.
arXiv Detail & Related papers (2023-12-26T02:50:42Z)
- VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder [56.59814904526965]
This paper introduces a pioneering 3D encoder designed for text-to-3D generation.
A lightweight network is developed to efficiently acquire feature volumes from multi-view images.
A diffusion model with a 3D U-Net is then trained on these feature volumes for text-to-3D generation.
arXiv Detail & Related papers (2023-12-18T18:59:05Z)
- LucidDreaming: Controllable Object-Centric 3D Generation [11.965998779054079]
We present LucidDreaming as an effective pipeline capable of fine-grained control over 3D generation.
It requires only minimal input of 3D bounding boxes, which can be deduced from a simple text prompt.
We show that our method exhibits remarkable adaptability across a spectrum of mainstream Score Distillation Sampling-based 3D generation frameworks.
arXiv Detail & Related papers (2023-11-30T18:55:23Z)
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
- Cross-Modal 3D Shape Generation and Manipulation [62.50628361920725]
We propose a generic multi-modal generative model that couples the 2D modalities and implicit 3D representations through shared latent spaces.
We evaluate our framework on two representative 2D modalities of grayscale line sketches and rendered color images.
arXiv Detail & Related papers (2022-07-24T19:22:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.