FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
- URL: http://arxiv.org/abs/2411.05823v1
- Date: Tue, 05 Nov 2024 05:45:26 GMT
- Title: FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models
- Authors: Zhanwei Zhang, Shizhao Sun, Wenxiao Wang, Deng Cai, Jiang Bian
- Abstract summary: There is a growing interest in creating computer-aided design (CAD) models based on user intent.
Existing work offers limited controllability and needs separate models for different types of control.
We propose FlexCAD, a unified model obtained by fine-tuning large language models.
- Score: 22.010338370150738
- Abstract: Recently, there has been growing interest in creating computer-aided design (CAD) models based on user intent, a task known as controllable CAD generation. Existing work offers limited controllability and needs separate models for different types of control, reducing efficiency and practicality. To achieve controllable generation across all CAD construction hierarchies, such as sketch-extrusion, extrusion, sketch, face, loop and curve, we propose FlexCAD, a unified model obtained by fine-tuning large language models (LLMs). First, to enhance comprehension by LLMs, we represent a CAD model as structured text by abstracting each hierarchy as a sequence of text tokens. Second, to address various controllable generation tasks in a unified model, we introduce a hierarchy-aware masking strategy. Specifically, during training, we mask a hierarchy-aware field in the CAD text with a mask token. This field, composed of a sequence of tokens, can be set flexibly to represent various hierarchies. Subsequently, we ask LLMs to predict this masked field. During inference, the user intent is converted into a CAD text with a mask token replacing the part the user wants to modify, which is then fed into FlexCAD to generate new CAD models. Comprehensive experiments on a public dataset demonstrate the effectiveness of FlexCAD in both generation quality and controllability. Code will be available at https://github.com/microsoft/CADGeneration/FlexCAD.
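The hierarchy-aware masking described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `[mask]` string, the `<..._end>` delimiters, the curve and extrusion token formats, and the helper names `tokenize` and `mask_field`. The sketch only shows the idea of replacing one field, chosen at any hierarchy level, with a single mask token and keeping the removed tokens as the prediction target.

```python
import random

MASK = "[mask]"  # assumed mask token; the source does not name the real one
LEVELS = ("sketch-extrusion", "extrusion", "sketch", "face", "loop", "curve")


def tokenize(model):
    """Flatten a nested CAD model into tokens and record each field's span.

    `model` is a list of (sketch, extrusion) pairs. A sketch is a list of
    faces, a face is a list of loops, and a loop is a list of curve tokens.
    The delimiter tokens are hypothetical, chosen only for this example.
    """
    tokens = []
    spans = {level: [] for level in LEVELS}
    for sketch, extrusion in model:
        se_start = len(tokens)
        for face in sketch:
            face_start = len(tokens)
            for loop in face:
                loop_start = len(tokens)
                for curve in loop:
                    spans["curve"].append((len(tokens), len(tokens) + 1))
                    tokens.append(curve)
                tokens.append("<loop_end>")
                spans["loop"].append((loop_start, len(tokens)))
            tokens.append("<face_end>")
            spans["face"].append((face_start, len(tokens)))
        tokens.append("<sketch_end>")
        spans["sketch"].append((se_start, len(tokens)))
        ext_start = len(tokens)
        tokens += [extrusion, "<extrusion_end>"]
        spans["extrusion"].append((ext_start, len(tokens)))
        spans["sketch-extrusion"].append((se_start, len(tokens)))
    return tokens, spans


def mask_field(model, level, rng):
    """Replace one randomly chosen field at `level` with MASK.

    Returns (masked_text, target_text): the model input and the tokens
    the model would be asked to predict.
    """
    tokens, spans = tokenize(model)
    start, end = rng.choice(spans[level])
    masked = tokens[:start] + [MASK] + tokens[end:]
    return " ".join(masked), " ".join(tokens[start:end])


# Toy model: one sketch-extrusion whose single face has a rectangular
# outer loop and a circular inner loop.
model = [
    ([[["line,0,0", "line,4,0", "line,4,3", "line,0,3"], ["circle,2,1,1"]]],
     "extrude,5"),
]

rng = random.Random(0)
for level in LEVELS:
    prompt, target = mask_field(model, level, rng)
    print(f"{level:>16}: {prompt}  ->  {target}")
```

Under these assumptions, a training pair is the masked text together with its target, and an inference request is built the same way, with the mask placed over the part the user wants regenerated.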
Related papers
- Text2CAD: Text to 3D CAD Generation via Technical Drawings [45.3611544056261]
Text2CAD is a novel framework that employs stable diffusion models tailored to automate the generation process.
We show that Text2CAD effectively generates technical drawings that are accurately translated into high-quality 3D CAD models.
arXiv Detail & Related papers (2024-11-09T15:12:06Z)
- CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM [39.113795259823476]
We introduce CAD-MLLM, the first system capable of generating parametric CAD models conditioned on multimodal input.
We use advanced large language models (LLMs) to align the feature space across diverse multimodal data and the vectorized representations of CAD models.
Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains textual description, multi-view images, points, and command sequence for each CAD model.
arXiv Detail & Related papers (2024-11-07T18:31:08Z)
- Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts [12.63158811936688]
We propose Text2CAD, the first AI framework for generating text-to-parametric CAD models.
Our proposed framework shows great potential in AI-aided design applications.
arXiv Detail & Related papers (2024-09-25T17:19:33Z)
- PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction [86.726941702182]
We introduce geometric guidance into the reconstruction network PS-CAD.
We provide the geometry of surfaces where the current reconstruction differs from the complete model as a point cloud.
Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces.
arXiv Detail & Related papers (2024-05-24T03:43:55Z)
- ContrastCAD: Contrastive Learning-based Representation Learning for Computer-Aided Design Models [0.7373617024876725]
We propose a contrastive learning-based approach to learning CAD models, named ContrastCAD.
ContrastCAD effectively captures semantic information within the construction sequences of the CAD model.
We also propose a new CAD data augmentation method, called a Random Replace and Extrude (RRE) method, to enhance the learning performance of the model.
arXiv Detail & Related papers (2024-04-02T05:30:39Z)
- Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training [105.3421541518582]
Current successful methods of 3D scene perception rely on large-scale annotated point clouds.
We propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages.
Model2Scene yields impressive label-free 3D object salient detection with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2023-09-29T03:51:26Z)
- Hierarchical Neural Coding for Controllable CAD Model Generation [34.14256897199849]
This paper presents a novel generative model for Computer Aided Design (CAD).
It represents high-level design concepts of a CAD model as a three-level hierarchical tree of neural codes.
It controls the generation or completion of CAD models by specifying the target design using a code tree.
arXiv Detail & Related papers (2023-06-30T21:49:41Z)
- AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning [70.70393006697383]
We present AutoCAD, a fully automatic and task-agnostic CAD generation framework.
arXiv Detail & Related papers (2022-11-29T13:39:53Z)
- Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z)
- SkexGen: Autoregressive Generation of CAD Construction Sequences with Disentangled Codebooks [37.33746656109331]
We present SkexGen, a novel autoregressive generative model for computer-aided design (CAD) construction sequences.
Autoregressive Transformer decoders generate CAD construction sequences sharing certain properties specified by the codebook vectors.
arXiv Detail & Related papers (2022-07-11T05:10:51Z)