Related papers: CADSpotting: Robust Panoptic Symbol Spotting on Large-Scale CAD Drawings

URL: http://arxiv.org/abs/2412.07377v2
Date: Wed, 11 Dec 2024 03:27:12 GMT
Title: CADSpotting: Robust Panoptic Symbol Spotting on Large-Scale CAD Drawings
Authors: Jiazuo Mu, Fuyi Yang, Yanshun Zhang, Junxiong Zhang, Yongjian Luo, Lan Xu, Yujiao Shi, Jingyi Yu, Yingliang Zhang
Abstract summary: We introduce CADSpotting, an efficient method for panoptic symbol spotting in large-scale architectural CAD drawings. Building upon a unified 3D point cloud model for joint semantic, instance, and panoptic segmentation, CADSpotting learns robust feature representations. We introduce a large-scale CAD dataset named LS-CAD to support our experiments.
Score: 42.08585210828114
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce CADSpotting, an efficient method for panoptic symbol spotting in large-scale architectural CAD drawings. Existing approaches struggle with the diversity of symbols, scale variations, and overlapping elements in CAD designs. CADSpotting overcomes these challenges by representing each primitive with dense points instead of a single primitive point, described by essential attributes like coordinates and color. Building upon a unified 3D point cloud model for joint semantic, instance, and panoptic segmentation, CADSpotting learns robust feature representations. To enable accurate segmentation in large, complex drawings, we further propose a novel Sliding Window Aggregation (SWA) technique, combining weighted voting and Non-Maximum Suppression (NMS). Moreover, we introduce a large-scale CAD dataset named LS-CAD to support our experiments. Each floorplan in LS-CAD has an average coverage of 1,000 square meters (versus 100 square meters in the existing dataset), providing a valuable benchmark for symbol spotting research. Experimental results on the FloorPlanCAD and LS-CAD datasets demonstrate that CADSpotting outperforms existing methods, showcasing its robustness and scalability for real-world CAD applications.
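
Illustrative sketch (not the authors' implementation): the abstract names two ideas, sampling dense points along each primitive and merging overlapping sliding-window predictions by weighted voting. The Python below shows one plausible reading of both for a single line segment. The function names `sample_segment` and `aggregate_votes`, the per-window weights, and the labels are all hypothetical; the paper's actual sampling density, weighting scheme, and NMS step are not given in the abstract and are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def sample_segment(p0: Point, p1: Point, n: int) -> List[Point]:
    """Return n evenly spaced points on the segment p0 -> p1 (requires n >= 2)."""
    (x0, y0), (x1, y1) = p0, p1
    return [
        (x0 + (x1 - x0) * i / (n - 1), y0 + (y1 - y0) * i / (n - 1))
        for i in range(n)
    ]


def aggregate_votes(
    window_preds: List[Dict[int, Tuple[str, float]]]
) -> Dict[int, str]:
    """Merge per-window predictions into one label per point id.

    Each window maps point id -> (label, weight). The weight is an
    assumed confidence value; the label with the largest summed weight wins.
    """
    totals: Dict[int, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for preds in window_preds:
        for pid, (label, weight) in preds.items():
            totals[pid][label] += weight
    return {pid: max(votes, key=votes.get) for pid, votes in totals.items()}


# One line primitive becomes five points instead of a single point.
points = sample_segment((0.0, 0.0), (4.0, 0.0), 5)

# Three overlapping windows; points 2 and 4 receive conflicting votes.
windows = [
    {0: ("wall", 0.9), 1: ("wall", 0.6), 2: ("door", 0.4)},
    {2: ("wall", 0.7), 3: ("wall", 0.8), 4: ("door", 0.3)},
    {4: ("wall", 0.5)},
]
labels = aggregate_votes(windows)
```

In this toy input the conflicting votes on points 2 and 4 are resolved toward the label with the higher summed weight, so overlapping windows can correct each other's boundary errors. Instance-level merging with NMS would be a separate step after this per-point vote.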

Related papers

RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base [112.72361202480154]
We present RAG-6DPose, a retrieval-augmented approach that leverages 3D CAD models as a knowledge base. Experimental results on standard benchmarks and real-world robotic tasks demonstrate the effectiveness and robustness of our approach.
arXiv Detail & Related papers (2025-06-23T17:19:41Z)
CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning [50.867869718716555]
We introduce CReFT-CAD, a two-stage fine-tuning paradigm that first employs a curriculum-driven reinforcement learning stage with difficulty-aware rewards to build reasoning ability steadily. We release TriView2CAD, the first large-scale, open-source benchmark for orthographic projection reasoning.
arXiv Detail & Related papers (2025-05-31T13:52:56Z)
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting [31.42708936135226]
ArchCAD-400K is a large-scale CAD dataset consisting of 413,062 chunks from 5,538 highly standardized drawings. We present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS).
arXiv Detail & Related papers (2025-03-28T11:40:53Z)
PHT-CAD: Efficient CAD Parametric Primitive Analysis with Progressive Hierarchical Tuning [52.681829043446044]
ParaCAD comprises over 10 million annotated drawings for training and 3,000 real-world industrial drawings with complex topological structures and physical constraints for test. PHT-CAD is a novel 2D PPA framework that harnesses the modality alignment and reasoning capabilities of Vision-Language Models.
arXiv Detail & Related papers (2025-03-23T17:24:32Z)
CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs [15.505120320280007]
This work introduces CAD-GPT, a CAD synthesis method with spatial reasoning-enhanced MLLM. It maps 3D spatial positions and 3D sketch plane rotation angles into a 1D linguistic feature space using a specialized spatial unfolding mechanism. It also discretizes 2D sketch coordinates into an appropriate planar space to enable precise determination of spatial starting position, sketch orientation, and 2D sketch coordinate translations.
arXiv Detail & Related papers (2024-12-27T14:19:36Z)
PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction [86.726941702182]
We introduce geometric guidance into the reconstruction network PS-CAD. We provide the geometry of surfaces where the current reconstruction differs from the complete model as a point cloud. Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces.
arXiv Detail & Related papers (2024-05-24T03:43:55Z)
Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images [1.5736099356327244]
We propose to label and spot symbols from CAD images that are converted from CAD drawings. The advantage of spotting symbols from CAD images lies in the low requirement of labelers and the low-cost annotation. Based on the keypoints detection, we propose a symbol grouping method to redraw the rectangle symbols in CAD images.
arXiv Detail & Related papers (2024-04-17T01:35:52Z)
Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space. We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z)
Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds [26.10631058349939]
We propose a hybrid analytic-neural reconstruction scheme that bridges the gap between segmented point clouds and structured CAD models. We also propose a novel implicit neural representation of freeform surfaces, driving up the performance of our overall CAD reconstruction scheme.
arXiv Detail & Related papers (2023-12-07T08:23:44Z)
SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [87.8761593366609]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets. We benchmark models using monocular, trinocular, and cloud input to assess the performance gap. We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z)
Circular Accessible Depth: A Robust Traversability Representation for UGV Navigation [21.559882149457895]
Circular Accessible Depth (CAD) is a robust traversability representation for an unmanned ground vehicle (UGV). We propose a neural network, namely CADNet, with an attention-based multi-frame point cloud fusion module to encode the spatial features from point clouds captured by LiDAR.
arXiv Detail & Related papers (2022-12-28T03:13:32Z)
Weakly-Supervised End-to-End CAD Retrieval to Scan Objects [25.41908065938424]
We propose a new weakly-supervised approach to retrieve semantically and structurally similar CAD models to a query 3D scanned scene. Our approach leverages a fully-differentiable top-$k$ retrieval layer, enabling end-to-end training guided by geometric and perceptual similarity of the top retrieved CAD models to the scan queries.
arXiv Detail & Related papers (2022-03-24T06:30:47Z)
Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion. Our approach is more robust than state of the art in real-world scenarios without any exact CAD matches.
arXiv Detail & Related papers (2021-08-20T20:58:52Z)
FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting [38.987494792258694]
We present FloorPlanCAD, a large-scale real-world CAD drawing dataset containing over 10,000 floor plans. We propose a novel method by combining Graph Convolutional Networks (GCNs) with Convolutional Neural Networks (CNNs). The proposed CNN-GCN method achieved state-of-the-art (SOTA) performance on the task of semantic symbol spotting.
arXiv Detail & Related papers (2021-05-15T06:01:11Z)
Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image. We present Mask2CAD, which jointly detects objects in real-world images and for each detected object, optimize for the most similar CAD model and its pose. This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)
CAD-Deform: Deformable Fitting of CAD Models to 3D Scans [30.451330075135076]
We introduce CAD-Deform, a method which obtains more accurate CAD-to-scan fits by non-rigidly deforming retrieved CAD models. A series of experiments demonstrate that our method achieves significantly tighter scan-to-CAD fits, allowing a more accurate digital replica of the scanned real-world environment.
arXiv Detail & Related papers (2020-07-23T12:30:20Z)
Symbol Spotting on Digital Architectural Floor Plans Using a Deep Learning-based Framework [76.70609932823149]
This paper focuses on symbol spotting on real-world digital architectural floor plans with a deep learning (DL)-based framework. We propose a training strategy based on tiles, avoiding many issues particular to DL-based object detection networks. Experiments on real-world floor plans demonstrate that our method successfully detects architectural symbols with low intra-class similarity and of variable graphical complexity.
arXiv Detail & Related papers (2020-06-01T03:16:05Z)
SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans [24.06640371472068]
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene.
arXiv Detail & Related papers (2020-03-27T20:17:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.