3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?
- URL: http://arxiv.org/abs/2401.06437v1
- Date: Fri, 12 Jan 2024 08:07:52 GMT
- Title: 3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?
- Authors: Zeqing Yuan, Haoxuan Lan, Qiang Zou, Junbo Zhao
- Abstract summary: We introduce a framework that employs Large Language Models to generate text-driven 3D shapes.
We present 3D-PreMise, a dataset specifically tailored for 3D parametric modeling of industrial shapes.
- Score: 8.893200442359518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in implicit 3D representations and generative models have
markedly propelled the field of 3D object generation forward. However, it
remains a significant challenge to accurately model geometries with defined
sharp features under parametric controls, which is crucial in fields like
industrial design and manufacturing. To bridge this gap, we introduce a
framework that employs Large Language Models (LLMs) to generate text-driven 3D
shapes, manipulating 3D software via program synthesis. We present 3D-PreMise,
a dataset specifically tailored for 3D parametric modeling of industrial
shapes, designed to explore state-of-the-art LLMs within our proposed pipeline.
Our work reveals effective generation strategies and delves into the
self-correction capabilities of LLMs using a visual interface. Our work
highlights both the potential and limitations of LLMs in 3D parametric modeling
for industrial applications.
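To make the pipeline concrete, below is a minimal sketch of the generate-execute-correct loop the abstract describes. It assumes CadQuery as the scripted CAD kernel and uses a placeholder call_llm function standing in for any LLM API; these names and the prompt format are illustrative, not the paper's actual implementation.

import cadquery as cq


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call; returns CadQuery source code."""
    raise NotImplementedError("wire this to your LLM provider")


def execute_cad_program(source: str):
    """Run LLM-generated code and return the resulting solid, or an error."""
    scope: dict = {"cq": cq}
    try:
        exec(source, scope)           # the program must bind a `result` solid
        return scope["result"], None
    except Exception as exc:          # feed failures back for self-correction
        return None, str(exc)


def generate_shape(description: str, max_rounds: int = 3):
    prompt = (
        "Write CadQuery code that builds this part and assigns the "
        f"final solid to a variable named `result`:\n{description}"
    )
    for _ in range(max_rounds):
        source = call_llm(prompt)
        solid, error = execute_cad_program(source)
        if solid is not None:
            # Export a B-rep: sharp edges and exact dimensions are preserved.
            cq.exporters.export(solid, "part.step")
            return solid
        # Self-correction round: return the error (or, in the paper's
        # setting, a rendered view of the shape) and ask the model to revise.
        prompt = f"The code failed with:\n{error}\nRevise it:\n{source}"
    return None

Because the model emits a parametric CAD program rather than an implicit field, sharp features and exact dimensions survive by construction, which is the property the paper targets for industrial design; the paper's visual-interface variant would replace the error string with a rendered image of the intermediate shape.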
Related papers
- 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow [69.94527569577295]
3D vision and spatial reasoning have long been recognized as essential for accurately perceiving our three-dimensional world.
Due to the difficulties in collecting high-quality 3D data, research in this area has only recently gained momentum.
We propose converting existing densely activated LLMs into mixture-of-experts (MoE) models, which have proven effective for multi-modal data processing.
arXiv Detail & Related papers (2025-01-28T04:31:19Z)
- Structured 3D Latents for Scalable and Versatile 3D Generation [28.672494137267837]
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation.
The cornerstone is a unified Structured LATent representation which allows decoding to different output formats.
This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model.
arXiv Detail & Related papers (2024-12-02T13:58:38Z)
- LLMI3D: MLLM-based 3D Perception from a Single 2D Image [77.13869413871028]
Multimodal large language models (MLLMs) excel in general capability but underperform on 3D tasks.
In this paper, we propose solutions for weak 3D local spatial object perception, poor text-based geometric numerical output, and inability to handle camera focal variations.
We employ parameter-efficient fine-tuning for a pre-trained MLLM and develop LLMI3D, a powerful 3D perception MLLM.
arXiv Detail & Related papers (2024-08-14T10:00:16Z)
- 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models (LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, decomposing procedural 3D modeling tasks into manageable segments and assigning the appropriate agent to each.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results, but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
- Pushing the Limits of 3D Shape Generation at Scale [65.24420181727615]
We present a significant breakthrough in 3D shape generation by scaling it to unprecedented dimensions.
We develop Argus-3D, a model with 3.6 billion trainable parameters, the largest 3D shape generation model to date.
arXiv Detail & Related papers (2023-06-20T13:01:19Z)
- GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures.
GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)
- Generative VoxelNet: Learning Energy-Based Models for 3D Shape Synthesis and Analysis [143.22192229456306]
This paper proposes a deep 3D energy-based model to represent volumetric shapes.
The benefits of the proposed model are six-fold.
Experiments demonstrate that the proposed model can generate high-quality 3D shape patterns.
arXiv Detail & Related papers (2020-12-25T06:09:36Z)