A Solver-Aided Hierarchical Language for LLM-Driven CAD Design
- URL: http://arxiv.org/abs/2502.09819v1
- Date: Thu, 13 Feb 2025 23:31:30 GMT
- Title: A Solver-Aided Hierarchical Language for LLM-Driven CAD Design
- Authors: Benjamin T. Jones, Felix Hähnlein, Zihan Zhang, Maaz Ahmad, Vladimir Kim, Adriana Schulz
- Abstract summary: Large language models (LLMs) have been enormously successful in solving a wide variety of structured and unstructured generative tasks. They struggle to generate procedural geometry in Computer Aided Design (CAD). We introduce a solver-aided, hierarchical domain specific language called AIDL, which offloads the spatial reasoning requirements to a geometric constraint solver.
- Score: 18.258735692299066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have been enormously successful in solving a wide variety of structured and unstructured generative tasks, but they struggle to generate procedural geometry in Computer Aided Design (CAD). These difficulties arise from an inability to do spatial reasoning and the need to guide a model through complex, long-range planning to generate complex geometry. We enable generative CAD design with LLMs through the introduction of a solver-aided, hierarchical domain specific language (DSL) called AIDL, which offloads the spatial reasoning requirements to a geometric constraint solver. Additionally, we show that in the few-shot regime, AIDL outperforms even a language that appears in the training data (OpenSCAD), both in terms of generating visual results closer to the prompt and creating objects that are easier to post-process and reason about.
Related papers
- Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection [36.418031479264585]
Large Language Models (LLMs) have inspired LLM-based CAD generation by representing CAD as command sequences. We present Pointer-CAD, a novel LLM-based CAD generation framework that incorporates the geometric information of B-rep models into sequential modeling. Experiments demonstrate that Pointer-CAD effectively supports the generation of complex geometric structures and reduces segmentation error to an extremely low level.
arXiv Detail & Related papers (2026-03-04T17:55:01Z)
- CADKnitter: Compositional CAD Generation from Text and Geometry Guidance [8.644079160190175]
We propose CADKnitter, a compositional CAD generation framework with a geometry-guided diffusion sampling strategy. CADKnitter is able to generate a complementary CAD part that follows both the geometric constraints of the given CAD model and the semantic constraints of the desired design text prompt. We also curate a dataset, KnitCAD, containing over 310,000 samples of CAD models, along with textual prompts and assembly metadata.
arXiv Detail & Related papers (2025-12-12T01:06:38Z)
- HistCAD: Geometrically Constrained Parametric History-based CAD Dataset [7.7008607520955]
HistCAD is a large-scale dataset featuring constraint-aware modeling sequences. HistCAD provides a unified benchmark for advancing editable, constraint-aware, and semantically enriched generative CAD modeling.
arXiv Detail & Related papers (2025-12-08T05:52:14Z)
- ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models [16.220781575918256]
ReCAD is a reinforcement learning (RL) framework that bootstraps pretrained large models (PLMs) to generate precise parametric computer-aided design (CAD) models from multimodal inputs. We employ a hierarchical primitive learning process to teach structured and compositional skills under a unified reward function. ReCAD sets a new state-of-the-art in both text-to-CAD and image-to-CAD tasks, significantly improving geometric accuracy across in-distribution and out-of-distribution settings.
arXiv Detail & Related papers (2025-12-06T07:12:56Z)
- Data Dependency-Aware Code Generation from Enhanced UML Sequence Diagrams [54.528185120850274]
We propose a novel step-by-step code generation framework named API2Dep. First, we introduce an enhanced Unified Modeling Language (UML) API diagram tailored for service-oriented architectures. Second, recognizing the critical role of data flow, we introduce a dedicated data dependency inference task.
arXiv Detail & Related papers (2025-08-05T12:28:23Z)
- Generative AI for CAD Automation: Leveraging Large Language Models for 3D Modelling [31.94035963354055]
Large Language Models (LLMs) are revolutionizing industries by enhancing efficiency, scalability, and innovation. This paper investigates the potential of LLMs in automating Computer-Aided Design (CAD) by integrating FreeCAD with an LLM as a CAD design tool. We propose a framework where LLMs generate initial CAD scripts from natural language descriptions, which are then executed and refined iteratively based on error feedback.
arXiv Detail & Related papers (2025-07-05T23:30:17Z)
- "Don't Do That!": Guiding Embodied Systems through Large Language Model-based Constraint Generation [40.61171036032532]
Large language models (LLMs) have spurred interest in robotic navigation that incorporates complex constraints from natural language into the planning problem. In this paper, we propose a constraint generation framework that uses LLMs to translate constraints into Python functions. We show that these LLM-generated functions accurately describe even complex mathematical constraints, and apply them to point cloud representations with traditional search algorithms.
arXiv Detail & Related papers (2025-06-04T22:47:53Z)
- Computational Thinking Reasoning in Large Language Models [69.28428524878885]
Computational Thinking Model (CTM) is a novel framework that incorporates computational thinking paradigms into large language models (LLMs). Live code execution is seamlessly integrated into the reasoning process, allowing CTM to think by computing. CTM outperforms conventional reasoning models and tool-augmented baselines in terms of accuracy, interpretability, and generalizability.
arXiv Detail & Related papers (2025-06-03T09:11:15Z)
- CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation [16.212242362122947]
This study investigates the generation of parametric sequences for computer-aided design (CAD) models using Large Language Models (LLMs). We present CAD-Llama, a framework designed to enhance pretrained LLMs for generating parametric 3D CAD models.
arXiv Detail & Related papers (2025-05-07T14:52:02Z)
- Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration [57.95306827012784]
We propose GeoGen, a pipeline that can automatically generate step-wise reasoning paths for geometry diagrams.
By leveraging precise symbolic reasoning, GeoGen produces large-scale, high-quality question-answer pairs.
We train GeoLogic, a Large Language Model (LLM), using synthetic data generated by GeoGen.
arXiv Detail & Related papers (2025-04-17T09:13:46Z)
- Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models.
DisCIPL uses a Planner model to generate a task-specific inference program.
Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z)
- HiVeGen -- Hierarchical LLM-based Verilog Generation for Scalable Chip Design [55.54477725000291]
HiVeGen is a hierarchical Verilog generation framework that decomposes generation tasks into hierarchical submodules. It integrates automatic Design Space Exploration (DSE) into hierarchy-aware prompt generation, introducing weight-based retrieval to enhance code reuse. It also enables real-time human-computer interaction to lower error-correction cost, significantly improving the quality of generated designs.
arXiv Detail & Related papers (2024-12-06T19:37:53Z)
- Interactive and Expressive Code-Augmented Planning with Large Language Models [62.799579304821826]
Large Language Models (LLMs) demonstrate strong abilities in common-sense reasoning and interactive decision-making.
Recent techniques have sought to structure LLM outputs using control flow and other code-adjacent techniques to improve planning performance.
We propose REPL-Plan, an LLM planning approach that is fully code-expressive and dynamic.
arXiv Detail & Related papers (2024-11-21T04:23:17Z)
- Mediating Modes of Thought: LLM's for design scripting [3.196599528747484]
Large Language Models (LLMs) encode a general understanding of human context and exhibit the capacity to produce geometric logic.
This project speculates that if LLMs can effectively mediate between user intent and algorithms, they become a powerful tool to make scripting in design more widespread and fun.
We explore if such systems can interpret natural language prompts to assemble geometric operations relevant to computational design scripting.
arXiv Detail & Related papers (2024-11-20T02:49:18Z)
- CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM [39.113795259823476]
We introduce the CAD-MLLM, the first system capable of generating parametric CAD models conditioned on the multimodal input.
We use advanced large language models (LLMs) to align the feature space across diverse multi-modalities data and CAD models' vectorized representations.
Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains textual description, multi-view images, points, and command sequence for each CAD model.
arXiv Detail & Related papers (2024-11-07T18:31:08Z)
- Optimizing Token Usage on Large Language Model Conversations Using the Design Structure Matrix [49.1574468325115]
Large Language Models are becoming ubiquitous in many sectors and tasks.
There is a need to reduce token usage, overcoming challenges such as short context windows, limited output sizes, and costs associated with token intake and generation.
This work brings the Design Structure Matrix from the engineering design discipline into LLM conversation optimization.
arXiv Detail & Related papers (2024-10-01T14:38:36Z)
- GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors [3.796768352477804]
The creation of manufacturable and editable 3D shapes through Computer-Aided Design (CAD) remains a highly manual and time-consuming task.
This paper introduces GenCAD, a generative model that employs autoregressive transformers with a contrastive learning framework and latent diffusion models to transform image inputs into parametric CAD command sequences.
arXiv Detail & Related papers (2024-09-08T23:49:11Z)
- Diagram Formalization Enhanced Multi-Modal Geometry Problem Solver [11.69164802295844]
We introduce a new framework that integrates visual features, geometric formal language, and natural language representations.
We propose a novel synthetic data approach and create a large-scale geometric dataset, SynthGeo228K, annotated with both formal and natural language captions.
Our framework improves MLLMs' ability to process geometric diagrams and extends their application to open-ended tasks on the formalgeo7k dataset.
arXiv Detail & Related papers (2024-09-06T12:11:06Z)
- Nl2Hltl2Plan: Scaling Up Natural Language Understanding for Multi-Robots Through Hierarchical Temporal Logic Task Representation [8.180994118420053]
Nl2Hltl2Plan is a framework that translates natural language commands into hierarchical Linear Temporal Logic (LTL). First, an LLM transforms instructions into a Hierarchical Task Tree, capturing logical and temporal relations. Next, a fine-tuned LLM converts sub-tasks into flat formulas, which are aggregated into hierarchical specifications.
arXiv Detail & Related papers (2024-08-15T14:46:13Z)
- Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure [66.33623392497599]
We show that a structure called template-content structure (T-C structure) can reduce the possible space from exponential level to linear level.
We demonstrate that models can achieve task composition, further reducing the space needed to learn from linear to logarithmic.
arXiv Detail & Related papers (2023-10-09T06:57:45Z)
- Examining Scaling and Transfer of Language Model Architectures for Machine Translation [51.69212730675345]
Language models (LMs) process sequences in a single stack of layers, and encoder-decoder models (EncDec) utilize separate layer stacks for input and output processing.
In machine translation, EncDec has long been the favoured approach, but with few studies investigating the performance of LMs.
arXiv Detail & Related papers (2022-02-01T16:20:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.