Related papers: OpenECAD: An Efficient Visual Language Model for Editable 3D-CAD Design

OpenECAD: An Efficient Visual Language Model for Editable 3D-CAD Design

URL: http://arxiv.org/abs/2406.09913v3
Date: Tue, 6 Aug 2024 08:27:55 GMT
Title: OpenECAD: An Efficient Visual Language Model for Editable 3D-CAD Design
Authors: Zhe Yuan, Jianqi Shi, Yanhong Huang,
Abstract summary: We fine-tuned pre-trained models to create OpenECAD models (0.55B, 0.89B, 2.4B and 3.1B) OpenECAD models can process images of 3D designs as input and generate highly structured 2D sketches and 3D construction commands. These outputs can be directly used with existing CAD tools' APIs to generate project files.
Score: 1.481550828146527
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Computer-aided design (CAD) tools are utilized in the manufacturing industry for modeling everything from cups to spacecraft. These programs are complex to use and typically require years of training and experience to master. Structured and well-constrained 2D sketches and 3D constructions are crucial components of CAD modeling. A well-executed CAD model can be seamlessly integrated into the manufacturing process, thereby enhancing production efficiency. Deep generative models of 3D shapes and 3D object reconstruction models have garnered significant research interest. However, most of these models produce discrete forms of 3D objects that are not editable. Moreover, the few models based on CAD operations often have substantial input restrictions. In this work, we fine-tuned pre-trained models to create OpenECAD models (0.55B, 0.89B, 2.4B and 3.1B), leveraging the visual, logical, coding, and general capabilities of visual language models. OpenECAD models can process images of 3D designs as input and generate highly structured 2D sketches and 3D construction commands, ensuring that the designs are editable. These outputs can be directly used with existing CAD tools' APIs to generate project files. To train our network, we created a series of OpenECAD datasets. These datasets are derived from existing public CAD datasets, adjusted and augmented to meet the specific requirements of vision language model (VLM) training. Additionally, we have introduced an approach that utilizes dependency relationships to define and generate sketches, further enriching the content and functionality of the datasets.

Related papers

CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design [10.105055422074734]
We introduce a new large-scale pipeline of more than 170k CAD models annotated with human-like descriptions.<n>Our experiments and ablation studies on both synthetic and human-annotated data demonstrate that CADmium is able to automate CAD design.
arXiv Detail & Related papers (2025-07-13T21:11:53Z)
RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base [112.72361202480154]
We present RAG-6DPose, a retrieval-augmented approach that leverages 3D CAD models as a knowledge base.<n> Experimental results on standard benchmarks and real-world robotic tasks demonstrate the effectiveness and robustness of our approach.
arXiv Detail & Related papers (2025-06-23T17:19:41Z)
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
Cube: A Roblox View of 3D Intelligence [67.43543266278154]
Foundation models trained on vast amounts of data have demonstrated remarkable reasoning and generation capabilities. We show how our tokenization scheme can be used in applications for text-to-shape generation, shape-to-text generation and text-to-scene generation. We conclude with a discussion outlining our path to building a fully unified foundation model for 3D intelligence.
arXiv Detail & Related papers (2025-03-19T17:52:17Z)
Image2CADSeq: Computer-Aided Design Sequence and Knowledge Inference from Product Images [0.7673339435080445]
In scenarios where digital CAD files are not accessible, reverse engineering (RE) has been used to reconstruct 3D CAD models. Recent advances have seen the rise of data-driven approaches for RE, with a primary focus on converting 3D data, such as point clouds, into 3D models in boundary representation (B-rep) format. Our research introduces a novel data-driven approach with an Image2CADSeq neural network model.
arXiv Detail & Related papers (2025-01-09T02:36:21Z)
BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement [45.19076032719869]
We present BlenderLLM, a framework for training Large Language Models (LLMs) in Computer-Aided Design (CAD) Our results reveal that existing models demonstrate significant limitations in generating accurate CAD scripts. Through minimal instruction-based fine-tuning and iterative self-improvement, BlenderLLM significantly surpasses these models in both functionality and accuracy of CAD script generation.
arXiv Detail & Related papers (2024-12-16T14:34:02Z)
UrbanCAD: Towards Highly Controllable and Photorealistic 3D Vehicles for Urban Scene Simulation [46.47972242593905]
UrbanCAD is a framework that generates highly controllable and photorealistic 3D vehicle digital twins from a single urban image. This enables vehicles' realistic 360-degree rendering, background insertion, material transfer, relighting, and component manipulation.
arXiv Detail & Related papers (2024-11-28T17:51:08Z)
Text2CAD: Text to 3D CAD Generation via Technical Drawings [45.3611544056261]
Text2CAD is a novel framework that employs stable diffusion models tailored to automate the generation process. We show that Text2CAD effectively generates technical drawings that are accurately translated into high-quality 3D CAD models.
arXiv Detail & Related papers (2024-11-09T15:12:06Z)
Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry [12.265852643914439]
We present Img2CAD, the first knowledge that uses 2D image inputs to generate editable parameters. Img2CAD enables seamless integration between AI 3D reconstruction and CAD representation.
arXiv Detail & Related papers (2024-10-04T13:27:52Z)
GenCAD: Image-Conditioned Computer-Aided Design Generation with Transformer-Based Contrastive Representation and Diffusion Priors [3.796768352477804]
The creation of manufacturable and editable 3D shapes through Computer-Aided Design (CAD) remains a highly manual and time-consuming task. This paper introduces GenCAD, a generative model that employs autoregressive transformers with a contrastive learning framework and latent diffusion models to transform image inputs into parametric CAD command sequences.
arXiv Detail & Related papers (2024-09-08T23:49:11Z)
Geometric Deep Learning for Computer-Aided Design: A Survey [85.79012726689511]
This survey offers a comprehensive overview of learning-based methods in computer-aided design. It includes similarity analysis and retrieval, 2D and 3D CAD model synthesis, and CAD generation from point clouds. It provides a complete list of benchmark datasets and their characteristics, along with open-source codes that have propelled research in this domain.
arXiv Detail & Related papers (2024-02-27T17:11:35Z)
Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training [105.3421541518582]
Current successful methods of 3D scene perception rely on the large-scale annotated point cloud. We propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages. Model2Scene yields impressive label-free 3D object salient detection with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2023-09-29T03:51:26Z)
SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations [21.000539206470897]
SECAD-Net is an end-to-end neural network aimed at reconstructing compact and easy-to-edit CAD models. We show superiority over state-of-the-art alternatives including the closely related method for supervised CAD reconstruction.
arXiv Detail & Related papers (2023-03-19T09:26:03Z)
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures. GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)
'CADSketchNet' -- An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks [0.8155575318208631]
The research work presented in this paper aims at developing a dataset suitable for building a retrieval system for 3D CAD models based on deep learning. The paper also aims at evaluating the performance of various retrieval system or a search engine for 3D CAD models that accepts a sketch image as the input query.
arXiv Detail & Related papers (2021-07-13T16:10:16Z)
DeepCAD: A Deep Generative Network for Computer-Aided Design Models [37.655225142981564]
We present the first 3D generative model for a drastically different shape representation -- describing a shape as a sequence of computer-aided design (CAD) operations. Drawing an analogy between CAD operations and natural language, we propose a CAD generative network based on the Transformer.
arXiv Detail & Related papers (2021-05-20T03:29:18Z)
Generative VoxelNet: Learning Energy-Based Models for 3D Shape Synthesis and Analysis [143.22192229456306]
This paper proposes a deep 3D energy-based model to represent volumetric shapes. The benefits of the proposed model are six-fold. Experiments demonstrate that the proposed model can generate high-quality 3D shape patterns.
arXiv Detail & Related papers (2020-12-25T06:09:36Z)
Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image. We present Mask2CAD, which jointly detects objects in real-world images and for each detected object, optimize for the most similar CAD model and its pose. This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.