Proc3D: Procedural 3D Generation and Parametric Editing of 3D Shapes with Large Language Models
- URL: http://arxiv.org/abs/2601.12234v1
- Date: Sun, 18 Jan 2026 03:08:08 GMT
- Title: Proc3D: Procedural 3D Generation and Parametric Editing of 3D Shapes with Large Language Models
- Authors: Fadlullah Raji, Stefano Petrangeli, Matheus Gadelha, Yu Shen, Uttaran Bhattacharya, Gang Wu
- Abstract summary: Proc3D is a system designed to generate editable 3D models while enabling real-time modifications. At its core, Proc3D introduces the procedural compact graph (PCG), a graph representation of 3D models. We demonstrate Proc3D's capabilities using two generative approaches: GPT-4o with in-context learning (ICL) and a fine-tuned LLAMA-3 model.
- Score: 21.349049326631867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating 3D models has traditionally been a complex task requiring specialized expertise. While recent advances in generative AI have sought to automate this process, existing methods produce non-editable representations, such as meshes or point clouds, limiting their adaptability for iterative design. In this paper, we introduce Proc3D, a system designed to generate editable 3D models while enabling real-time modifications. At its core, Proc3D introduces the procedural compact graph (PCG), a graph representation of 3D models that encodes the algorithmic rules and structures necessary for generating the model. This representation exposes key parameters, allowing intuitive manual adjustments via sliders and checkboxes, as well as real-time, automated modifications through natural language prompts using Large Language Models (LLMs). We demonstrate Proc3D's capabilities using two generative approaches: GPT-4o with in-context learning (ICL) and a fine-tuned LLAMA-3 model. Experimental results show that Proc3D outperforms existing methods in editing efficiency, achieving more than 400x speedup over conventional approaches that require full regeneration for each modification. Additionally, Proc3D improves ULIP scores by 28%, a metric that evaluates the alignment between generated 3D models and text prompts. By enabling text-aligned 3D model generation along with precise, real-time parametric edits, Proc3D facilitates highly accurate text-based image editing applications.
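The abstract's core idea, a procedural graph that exposes editable parameters so that a change re-runs only the affected generation rules rather than regenerating the whole model, can be illustrated with a minimal sketch. The class and function names below (`ProcNode`, `make_box`, `set_param`) are hypothetical illustrations, not the paper's actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a parametric procedural graph: each node stores
# named parameters and a rule that builds geometry from them. A parameter
# edit re-runs only that node's subtree instead of regenerating the whole
# model, which is the intuition behind the reported editing speedup.

@dataclass
class ProcNode:
    name: str
    params: dict
    rule: callable                       # params -> geometry (placeholder dict here)
    children: list = field(default_factory=list)

    def build(self):
        # Evaluate this node's rule and recurse into its children.
        return {"name": self.name,
                "geometry": self.rule(self.params),
                "children": [c.build() for c in self.children]}

    def set_param(self, key, value):
        # Parametric edit: update one exposed parameter (as a slider or
        # an LLM-issued edit might) and rebuild only this subtree.
        self.params[key] = value
        return self.build()

# Toy example: a table top with a leg, each generated from a box rule.
def make_box(p):
    return {"type": "box", "size": (p["w"], p["h"], p["d"])}

leg = ProcNode("leg", {"w": 0.05, "h": 0.7, "d": 0.05}, make_box)
top = ProcNode("top", {"w": 1.2, "h": 0.04, "d": 0.8}, make_box, [leg])

# Editing one exposed parameter rebuilds the subtree in place:
edited = top.set_param("w", 1.5)
print(edited["geometry"]["size"])
```

In this toy version the cost of an edit is proportional to the edited subtree, not the full model, which is the property that avoids full regeneration per modification.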
Related papers
- 3D-LATTE: Latent Space 3D Editing from Textual Instructions [64.77718887666312]
We propose a training-free editing method that operates within the latent space of a native 3D diffusion model. We guide the edit synthesis by blending 3D attention maps from the generation with the source object.
arXiv Detail & Related papers (2025-08-29T22:51:59Z) - CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation [58.46364872103992]
We introduce a new method called CMD that generates a 3D model from an input image while enabling flexible local editing of each component of the 3D model. In CMD, we formulate the 3D generation as a conditional multiview diffusion model, which takes the existing or known parts as conditions and generates the edited or added components.
arXiv Detail & Related papers (2025-05-11T14:54:26Z) - Instructive3D: Editing Large Reconstruction Models with Text Instructions [2.9575146209034853]
Instructive3D is a novel LRM-based model that integrates generation and fine-grained editing, through user text prompts, of 3D objects into a single model. We find that Instructive3D produces superior 3D objects with the properties specified by the edit prompts.
arXiv Detail & Related papers (2025-01-08T09:28:25Z) - MeshXL: Neural Coordinate Field for Generative 3D Foundation Models [51.1972329762843]
We present a family of generative pre-trained auto-regressive models, which addresses the process of 3D mesh generation with modern large language model approaches.
MeshXL is able to generate high-quality 3D meshes, and can also serve as foundation models for various down-stream applications.
arXiv Detail & Related papers (2024-05-31T14:35:35Z) - OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation [0.0]
Generating an editable dynamic 3D model and video from one image is a novel direction in the research area of single-image 3D representation and reconstruction.
We propose OneTo3D, a method and theory that uses one single image to generate an editable 3D model and a targeted, semantically continuous, time-unlimited 3D video.
arXiv Detail & Related papers (2024-05-10T15:44:11Z) - LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis [76.43669909525488]
LATTE3D generates 3D objects in 400ms, and can be further enhanced with fast test-time optimization.
We introduce LATTE3D, addressing these limitations to achieve fast, high-quality generation on a significantly larger prompt set.
arXiv Detail & Related papers (2024-03-22T17:59:37Z) - 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models(LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z) - Directional Texture Editing for 3D Models [51.31499400557996]
ITEM3D is designed for automatic 3D object editing according to text instructions.
Leveraging the diffusion models and the differentiable rendering, ITEM3D takes the rendered images as the bridge of text and 3D representation.
arXiv Detail & Related papers (2023-09-26T12:01:13Z) - GET3D--: Learning GET3D from Unconstrained Image Collections [27.470617383305726]
We propose GET3D--, the first method that directly generates textured 3D shapes from 2D images with unknown pose and scale.
GET3D-- comprises a 3D shape generator and a learnable camera sampler that captures the 6D external changes on the camera.
arXiv Detail & Related papers (2023-07-27T15:00:54Z) - Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation [66.21121745446345]
We propose a conditional GNeRF model that integrates specific attribute labels as input, thus amplifying the controllability and disentanglement capabilities of 3D-aware generative models.
Our approach builds upon a pre-trained 3D-aware face model, and we introduce a Training as Init and fidelity for Tuning (TRIOT) method to train a conditional normalized flow module.
Our experiments substantiate the efficacy of our model, showcasing its ability to generate high-quality edits with enhanced view consistency.
arXiv Detail & Related papers (2022-08-26T10:05:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.