CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation
- URL: http://arxiv.org/abs/2505.04481v2
- Date: Tue, 10 Jun 2025 13:44:51 GMT
- Title: CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation
- Authors: Jiahao Li, Weijian Ma, Xueyang Li, Yunzhong Lou, Guichun Zhou, Xiangdong Zhou
- Abstract summary: This study investigates the generation of parametric sequences for computer-aided design (CAD) models using Large Language Models (LLMs). We present CAD-Llama, a framework designed to enhance pretrained LLMs for generating parametric 3D CAD models.
- Score: 16.212242362122947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, Large Language Models (LLMs) have achieved significant success, prompting increased interest in expanding their generative capabilities beyond general text into domain-specific areas. This study investigates the generation of parametric sequences for computer-aided design (CAD) models using LLMs. This endeavor represents an initial step towards creating parametric 3D shapes with LLMs, as CAD model parameters directly correlate with shapes in three-dimensional space. Despite the formidable generative capacities of LLMs, this task remains challenging, as these models neither encounter parametric sequences during their pretraining phase nor possess direct awareness of 3D structures. To address this, we present CAD-Llama, a framework designed to enhance pretrained LLMs for generating parametric 3D CAD models. Specifically, we develop a hierarchical annotation pipeline and a code-like format to translate parametric 3D CAD command sequences into Structured Parametric CAD Code (SPCC), incorporating hierarchical semantic descriptions. Furthermore, we propose an adaptive pretraining approach utilizing SPCC, followed by an instruction tuning process aligned with CAD-specific guidelines. This methodology aims to equip LLMs with the spatial knowledge inherent in parametric sequences. Experimental results demonstrate that our framework significantly outperforms prior autoregressive methods and existing LLM baselines.
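The abstract describes translating parametric CAD command sequences into a code-like format (SPCC) that an LLM can model as text. As a minimal illustration of what such a code-like parametric representation might look like, here is a hypothetical Python sketch; the command names (`sketch_rectangle`, `extrude`) and the serialization are illustrative assumptions, not the paper's actual SPCC grammar.

```python
# Hypothetical code-like parametric CAD representation, loosely in the
# spirit of SPCC. Command names and serialization are assumptions.

def sketch_rectangle(cx, cy, w, h):
    """Return a parametric 2D sketch command as a structured record."""
    return {"cmd": "sketch_rectangle", "params": [cx, cy, w, h]}

def extrude(sketch, depth):
    """Lift a 2D sketch into a 3D solid by extruding it by `depth`."""
    return {"cmd": "extrude", "sketch": sketch, "params": [depth]}

def serialize(ops):
    """Flatten a command list into a code-like text sequence an LLM could model."""
    lines = []
    for op in ops:
        args = ", ".join(str(p) for p in op["params"])
        lines.append(f'{op["cmd"]}({args})')
    return "\n".join(lines)

# A simple plate: a rectangle sketch followed by an extrusion.
plate_sketch = sketch_rectangle(0.0, 0.0, 40.0, 20.0)
model = [plate_sketch, extrude(plate_sketch, 5.0)]
print(serialize(model))
```

The point of such a format is that the geometry becomes a short, syntactically regular text program, so an LLM pretrained on code can be adapted to it.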
Related papers
- MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh [79.20802127426003]
MeshLLM is a framework that leverages large language models (LLMs) to understand and generate text-serialized 3D meshes. We introduce a Primitive-Mesh decomposition strategy, which divides 3D meshes into structurally meaningful subunits. Experiments show that MeshLLM outperforms the state-of-the-art LLaMA-Mesh in both mesh generation quality and shape understanding.
arXiv Detail & Related papers (2025-08-02T07:37:37Z)
- SeqAffordSplat: Scene-level Sequential Affordance Reasoning on 3D Gaussian Splatting [85.87902260102652]
We introduce the novel task of Sequential 3D Gaussian Affordance Reasoning. We then propose SeqSplatNet, an end-to-end framework that directly maps an instruction to a sequence of 3D affordance masks. Our method sets a new state-of-the-art on our challenging benchmark, effectively advancing affordance reasoning from single-step interactions to complex, sequential tasks at the scene level.
arXiv Detail & Related papers (2025-07-31T17:56:55Z)
- CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning [50.867869718716555]
We introduce CReFT-CAD, a two-stage fine-tuning paradigm that first employs a curriculum-driven reinforcement learning stage with difficulty-aware rewards to build reasoning ability steadily. We release TriView2CAD, the first large-scale, open-source benchmark for orthographic projection reasoning.
arXiv Detail & Related papers (2025-05-31T13:52:56Z)
- Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek [19.441404313543227]
This study is the first investigation to incorporate both visual and Chain-of-Thought (CoT) feedback within the self-refinement mechanism for generating CAD models. We present an innovative 3D CAD model dataset structured around the SSR (Sketch, Sketch-based feature, and Refinements) triple design paradigm.
arXiv Detail & Related papers (2025-05-23T10:11:19Z)
- CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
- Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models [8.216545561416416]
We introduce CADFusion, a framework that uses Large Language Models (LLMs) as the backbone and alternates between two training stages. Experiments demonstrate that CADFusion significantly improves performance, both qualitatively and quantitatively.
arXiv Detail & Related papers (2025-01-31T11:28:16Z)
- CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs [15.505120320280007]
This work introduces CAD-GPT, a CAD synthesis method with a spatial reasoning-enhanced MLLM. It maps 3D spatial positions and 3D sketch plane rotation angles into a 1D linguistic feature space using a specialized spatial unfolding mechanism. It also discretizes 2D sketch coordinates into an appropriate planar space to enable precise determination of spatial starting position, sketch orientation, and 2D sketch coordinate translations.
arXiv Detail & Related papers (2024-12-27T14:19:36Z)
- CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM [39.113795259823476]
We introduce CAD-MLLM, the first system capable of generating parametric CAD models conditioned on multimodal input. We use advanced large language models (LLMs) to align the feature space across diverse multimodal data and CAD models' vectorized representations. Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains a textual description, multi-view images, points, and a command sequence for each CAD model.
arXiv Detail & Related papers (2024-11-07T18:31:08Z)
- Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z)
- 3D-PreMise: Can Large Language Models Generate 3D Shapes with Sharp Features and Parametric Control? [8.893200442359518]
We introduce a framework that employs Large Language Models to generate text-driven 3D shapes.
We present 3D-PreMise, a dataset specifically tailored for 3D parametric modeling of industrial shapes.
arXiv Detail & Related papers (2024-01-12T08:07:52Z)
- 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models(LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z)
- Learning Versatile 3D Shape Generation with Improved AR Models [91.87115744375052]
Auto-regressive (AR) models have achieved impressive results in 2D image generation by modeling joint distributions in the grid space.
We propose the Improved Auto-regressive Model (ImAM) for 3D shape generation, which applies discrete representation learning based on a latent vector instead of volumetric grids.
arXiv Detail & Related papers (2023-03-26T12:03:18Z)
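Several of the papers above (e.g., CAD-GPT, ImAM) rely on discretizing continuous geometry into token ids so that an autoregressive model can emit it as a sequence. A minimal sketch of such a quantization step follows; the coordinate range `[-1, 1]` and the bin count of 256 are illustrative assumptions, not values taken from any of the papers.

```python
# Illustrative coordinate discretization for token-based geometry modeling.
# Range and bin count are assumptions chosen for the example.

def discretize(value, lo=-1.0, hi=1.0, bins=256):
    """Map a continuous coordinate in [lo, hi] to an integer token id."""
    clamped = min(max(value, lo), hi)
    return int((clamped - lo) / (hi - lo) * (bins - 1))

def dediscretize(idx, lo=-1.0, hi=1.0, bins=256):
    """Recover the bin-center coordinate from a token id."""
    return lo + (idx + 0.5) * (hi - lo) / bins

token = discretize(0.5)
print(token, round(dediscretize(token), 3))
```

The round trip loses at most half a bin width of precision, which is the usual trade-off when geometry is forced into a discrete vocabulary.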
This list is automatically generated from the titles and abstracts of the papers indexed on this site. The site does not guarantee the accuracy of this information and accepts no responsibility for any consequences of its use.