Related papers: Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection

URL: http://arxiv.org/abs/2603.04337v1
Date: Wed, 04 Mar 2026 17:55:01 GMT
Title: Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
Authors: Dacheng Qi, Chenyu Wang, Jingwei Xu, Tianzhe Chu, Zibo Zhao, Wen Liu, Wenrui Ding, Yi Ma, Shenghua Gao
Abstract summary: Large Language Models (LLMs) have inspired LLM-based CAD generation by representing CAD as command sequences. We present Pointer-CAD, a novel LLM-based CAD generation framework that incorporates the geometric information of B-rep models into sequential modeling. Experiments demonstrate that Pointer-CAD effectively supports the generation of complex geometric structures and reduces segmentation error to an extremely low level.
Score: 36.418031479264585
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Constructing computer-aided design (CAD) models is labor-intensive but essential for engineering and manufacturing. Recent advances in Large Language Models (LLMs) have inspired LLM-based CAD generation by representing CAD as command sequences. However, these methods struggle in practical scenarios because the command sequence representation does not support entity selection (e.g., of faces or edges), which limits its ability to support complex editing operations such as chamfer or fillet. Further, discretizing continuous variables during sketch and extrude operations may result in topological errors. To address these limitations, we present Pointer-CAD, a novel LLM-based CAD generation framework that leverages a pointer-based command sequence representation to explicitly incorporate the geometric information of B-rep models into sequential modeling. In particular, Pointer-CAD decomposes CAD model generation into steps, conditioning the generation of each subsequent step on both the textual description and the B-rep generated from previous steps. Whenever an operation requires the selection of a specific geometric entity, the LLM predicts a Pointer that selects the most feature-consistent candidate from the available set. Such a selection operation also reduces the quantization error in the command sequence-based representation. To support the training of Pointer-CAD, we develop a data annotation pipeline that produces expert-level natural language descriptions and apply it to build a dataset of approximately 575K CAD models. Extensive experimental results demonstrate that Pointer-CAD effectively supports the generation of complex geometric structures and reduces segmentation error to an extremely low level, achieving a significant improvement over prior command sequence methods and mitigating the topological inaccuracies introduced by quantization error.
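The pointer-based selection described in the abstract can be illustrated with a minimal sketch. The abstract does not say how "feature-consistent" is scored, so the cosine-similarity scoring, the function names (`cosine`, `select_by_pointer`), and the toy 3-D feature vectors below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_by_pointer(pointer, candidates):
    """Return the index of the candidate feature most similar to the pointer.

    Assumption: similarity is cosine similarity; the paper may score
    candidates differently.
    """
    if not candidates:
        raise ValueError("no candidate entities to select from")
    scores = [cosine(pointer, feat) for feat in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical features for three edges of the B-rep built so far.
edge_features = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical pointer vector predicted for an operation such as a fillet.
pointer = [0.1, 0.9, 0.2]

selected = select_by_pointer(pointer, edge_features)
print(selected)
```

Because the result is an index into entities that already exist in the B-rep, the selected edge or face is referenced exactly rather than re-described by quantized coordinates, which is the intuition behind the abstract's claim about reduced quantization error.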

Related papers

BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning [4.393837288225634]
We propose BrepCoder, a unified Multimodal Large Language Model (MLLM) that performs diverse CAD tasks from B-rep inputs. By leveraging the code generation capabilities of LLMs, we convert CAD modeling sequences into Python-like code and align them with B-rep.
arXiv Detail & Related papers (2026-02-25T12:44:28Z)
HistCAD: Geometrically Constrained Parametric History-based CAD Dataset [7.7008607520955]
HistCAD is a large-scale dataset featuring constraint-aware modeling sequences. HistCAD provides a unified benchmark for advancing editable, constraint-aware, and semantically enriched generative CAD modeling.
arXiv Detail & Related papers (2025-12-08T05:52:14Z)
ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models [16.220781575918256]
ReCAD is a reinforcement learning (RL) framework that bootstraps pretrained large models (PLMs) to generate precise parametric computer-aided design (CAD) models from multimodal inputs. We employ a hierarchical primitive learning process to teach structured and compositional skills under a unified reward function. ReCAD sets a new state-of-the-art in both text-to-CAD and image-to-CAD tasks, significantly improving geometric accuracy across in-distribution and out-of-distribution settings.
arXiv Detail & Related papers (2025-12-06T07:12:56Z)
BrepGPT: Autoregressive B-rep Generation with Voronoi Half-Patch [61.20046418942948]
Boundary representation (B-rep) is the de facto standard for CAD model representation in modern industrial design. We present BrepGPT, a single-stage autoregressive framework for B-rep generation.
arXiv Detail & Related papers (2025-11-27T07:16:53Z)
CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization [16.26305802216836]
CAD-Tokenizer represents CAD data with modality-specific tokens using a sequence-based VQ-VAE with primitive-level pooling and constrained decoding. This design produces compact, primitive-aware representations that align with CAD's structural nature.
arXiv Detail & Related papers (2025-09-25T13:38:36Z)
From Intent to Execution: Multimodal Chain-of-Thought Reinforcement Learning for Precise CAD Code Generation [47.67703214044401]
We propose CAD-RL, a multimodal Chain-of-Thought guided reinforcement learning framework for CAD modeling code generation. Our method combines Cold Start with goal-driven reinforcement learning post-training using three task-specific rewards. Experiments demonstrate that CAD-RL achieves significant improvements in reasoning quality, output precision, and code executability.
arXiv Detail & Related papers (2025-08-13T18:30:49Z)
CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning [31.342222156939403]
We introduce CReFT-CAD, a two-stage fine-tuning paradigm that first employs a curriculum-driven reinforcement learning stage with difficulty-aware rewards to build reasoning ability steadily. We release TriView2CAD, the first large-scale, open-source benchmark for orthographic projection reasoning.
arXiv Detail & Related papers (2025-05-31T13:52:56Z)
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM [39.113795259823476]
We introduce CAD-MLLM, the first system capable of generating parametric CAD models conditioned on multimodal input. We use advanced large language models (LLMs) to align the feature space across diverse multimodal data and CAD models' vectorized representations. Our resulting dataset, named Omni-CAD, is the first multimodal CAD dataset that contains a textual description, multi-view images, points, and a command sequence for each CAD model.
arXiv Detail & Related papers (2024-11-07T18:31:08Z)
PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction [86.726941702182]
We introduce geometric guidance into the reconstruction network PS-CAD. First, we provide the geometry of surfaces where the current reconstruction differs from the complete model as a point cloud. Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces.
arXiv Detail & Related papers (2024-05-24T03:43:55Z)
AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning [70.70393006697383]
We present AutoCAD, a fully automatic and task-agnostic CAD generation framework.
arXiv Detail & Related papers (2022-11-29T13:39:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.