3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?
- URL: http://arxiv.org/abs/2401.06437v1
- Date: Fri, 12 Jan 2024 08:07:52 GMT
- Title: 3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control?
- Authors: Zeqing Yuan, Haoxuan Lan, Qiang Zou, Junbo Zhao
- Abstract summary: We introduce a framework that employs Large Language Models to generate text-driven 3D shapes.
We present 3D-PreMise, a dataset specifically tailored for 3D parametric modeling of industrial shapes.
- Score: 8.893200442359518
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in implicit 3D representations and generative models have
markedly propelled the field of 3D object generation forward. However, it
remains a significant challenge to accurately model geometries with defined
sharp features under parametric controls, which is crucial in fields like
industrial design and manufacturing. To bridge this gap, we introduce a
framework that employs Large Language Models (LLMs) to generate text-driven 3D
shapes, manipulating 3D software via program synthesis. We present 3D-PreMise,
a dataset specifically tailored for 3D parametric modeling of industrial
shapes, designed to explore state-of-the-art LLMs within our proposed pipeline.
Our work reveals effective generation strategies and delves into the
self-correction capabilities of LLMs using a visual interface. Our work
highlights both the potential and limitations of LLMs in 3D parametric modeling
for industrial applications.
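To make the pipeline concrete, below is a minimal sketch of the generate-execute-correct loop the abstract describes. It assumes CadQuery as the scripted CAD kernel and uses a placeholder call_llm function standing in for any LLM API; these names and the prompt format are illustrative, not the paper's actual implementation.

import cadquery as cq


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call; returns CadQuery source code."""
    raise NotImplementedError("wire this to your LLM provider")


def execute_cad_program(source: str):
    """Run LLM-generated code and return the resulting solid, or an error."""
    scope: dict = {"cq": cq}
    try:
        exec(source, scope)           # the program must bind a `result` solid
        return scope["result"], None
    except Exception as exc:          # feed failures back for self-correction
        return None, str(exc)


def generate_shape(description: str, max_rounds: int = 3):
    prompt = (
        "Write CadQuery code that builds this part and assigns the "
        f"final solid to a variable named `result`:\n{description}"
    )
    for _ in range(max_rounds):
        source = call_llm(prompt)
        solid, error = execute_cad_program(source)
        if solid is not None:
            # Export a B-rep: sharp edges and exact dimensions are preserved.
            cq.exporters.export(solid, "part.step")
            return solid
        # Self-correction round: return the error (or, in the paper's
        # setting, a rendered view of the shape) and ask the model to revise.
        prompt = f"The code failed with:\n{error}\nRevise it:\n{source}"
    return None

Because the model emits a parametric CAD program rather than an implicit field, sharp features and exact dimensions survive by construction, which is the property the paper targets for industrial design; the paper's visual-interface variant would replace the error string with a rendered image of the intermediate shape.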
Related papers
- 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow [69.94527569577295]
3D vision and spatial reasoning have long been recognized as essential for accurately perceiving our three-dimensional world.
Due to the difficulties in collecting high-quality 3D data, research in this area has only recently gained momentum.
We propose converting existing densely activated LLMs into mixture-of-experts (MoE) models, which have proven effective for multi-modal data processing.
arXiv Detail & Related papers (2025-01-28T04:31:19Z)
- Structured 3D Latents for Scalable and Versatile 3D Generation [28.672494137267837]
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation.
The cornerstone is a unified Structured LATent representation which allows decoding to different output formats.
This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model.
arXiv Detail & Related papers (2024-12-02T13:58:38Z)
- LLMI3D: MLLM-based 3D Perception from a Single 2D Image [77.13869413871028]
Multimodal large language models (MLLMs) excel in general capability but underperform on 3D tasks.
In this paper, we propose solutions for weak 3D local spatial object perception, poor text-based geometric numerical output, and inability to handle camera focal variations.
We employ parameter-efficient fine-tuning for a pre-trained MLLM and develop LLMI3D, a powerful 3D perception MLLM.
arXiv Detail & Related papers (2024-08-14T10:00:16Z)
- 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models (LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, decomposing procedural 3D modeling tasks into manageable segments and assigning the appropriate agent to each.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results, but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
- Pushing the Limits of 3D Shape Generation at Scale [65.24420181727615]
We present a significant breakthrough in 3D shape generation by scaling it to unprecedented dimensions.
We develop Argus-3D, a model with 3.6 billion trainable parameters, the largest 3D shape generation model to date.
arXiv Detail & Related papers (2023-06-20T13:01:19Z)
- GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures.
GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)
- Generative VoxelNet: Learning Energy-Based Models for 3D Shape Synthesis and Analysis [143.22192229456306]
This paper proposes a deep 3D energy-based model to represent volumetric shapes.
The benefits of the proposed model are six-fold.
Experiments demonstrate that the proposed model can generate high-quality 3D shape patterns.
arXiv Detail & Related papers (2020-12-25T06:09:36Z)