CADSpotting: Robust Panoptic Symbol Spotting on Large-Scale CAD Drawings
- URL: http://arxiv.org/abs/2412.07377v3
- Date: Thu, 13 Mar 2025 07:41:50 GMT
- Title: CADSpotting: Robust Panoptic Symbol Spotting on Large-Scale CAD Drawings
- Authors: Jiazuo Mu, Fuyi Yang, Yanshun Zhang, Mingqian Zhang, Junxiong Zhang, Yongjian Luo, Lan Xu, Yujiao Shi, Yingliang Zhang
- Abstract summary: CADSpotting represents primitives through densely sampled points with attributes like coordinates and colors. We propose a novel Sliding Window Aggregation (SWA) technique, combining weighted voting and Non-Maximum Suppression (NMS). Experiments on FloorPlanCAD and LS-CAD datasets show that CADSpotting significantly outperforms existing methods.
- Score: 21.512025767558498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce CADSpotting, an effective method for panoptic symbol spotting in large-scale architectural CAD drawings. Existing approaches struggle with symbol diversity, scale variations, and overlapping elements in CAD designs. CADSpotting overcomes these challenges by representing primitives through densely sampled points with attributes like coordinates and colors, using a unified 3D point cloud model for robust feature learning. To enable accurate segmentation in large, complex drawings, we further propose a novel Sliding Window Aggregation (SWA) technique, combining weighted voting and Non-Maximum Suppression (NMS). Moreover, we introduce LS-CAD, a new large-scale CAD dataset to support our experiments, with each floorplan covering around 1,000 square meters, significantly larger than previous benchmarks. Experiments on FloorPlanCAD and LS-CAD datasets show that CADSpotting significantly outperforms existing methods. We also demonstrate its practical value by automating parametric 3D reconstruction, enabling interior modeling directly from raw CAD inputs.
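The abstract describes two concrete steps that a short sketch can illustrate: turning vector primitives into attributed points, and fusing overlapping sliding-window predictions by weighted voting. The sketch below is a minimal illustration under assumptions, not the authors' implementation; the function names (`sample_line`, `aggregate_votes`), the sampling spacing, and the voting scheme are hypothetical, and the instance-level merging via NMS mentioned above is omitted.

```python
# Minimal sketch (assumptions, not the authors' code) of two ideas from the
# abstract: (1) densely sampling a CAD primitive into points carrying
# coordinate and color attributes, and (2) fusing per-window predictions
# from overlapping sliding windows by score-weighted voting.
import numpy as np

def sample_line(p0, p1, color, spacing=0.05):
    """Sample points along a line primitive; each row is (x, y, r, g, b)."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    n = max(int(np.linalg.norm(p1 - p0) / spacing), 2)
    t = np.linspace(0.0, 1.0, n)[:, None]
    xy = (1.0 - t) * p0 + t * p1          # evenly spaced points on the segment
    rgb = np.tile(np.asarray(color, float), (n, 1))
    return np.hstack([xy, rgb])           # shape (n, 5)

def aggregate_votes(window_preds):
    """Weighted voting over overlapping windows.

    window_preds: iterable of (point_ids, labels, scores) triples, one per
    window. Each point accumulates score-weighted votes for every label it
    receives; the label with the largest total wins.
    """
    votes = {}                             # point id -> {label: summed score}
    for ids, labels, scores in window_preds:
        for i, lab, s in zip(ids, labels, scores):
            d = votes.setdefault(i, {})
            d[lab] = d.get(lab, 0.0) + s
    return {i: max(v, key=v.get) for i, v in votes.items()}

if __name__ == "__main__":
    pts = sample_line((0.0, 0.0), (1.0, 0.0), color=(255, 0, 0))
    n = len(pts)
    # Two hypothetical windows cover the same points with conflicting labels.
    preds = [
        (range(n), ["wall"] * n, [0.6] * n),
        (range(n), ["door"] * n, [0.9] * n),
    ]
    fused = aggregate_votes(preds)
    print(n, fused[0])                     # the higher-confidence label wins
```

In the full method, the per-window labels would presumably come from the point-cloud segmentation model, and overlapping instance proposals would additionally be pruned with NMS; the sketch only shows the voting half of the aggregation.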
Related papers
- CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data.
We introduce a geometry encoder to accurately capture diverse geometric features.
Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z) - ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting [31.42708936135226]
ArchCAD-400K is a large-scale CAD dataset consisting of 413,062 chunks from 5538 highly standardized drawings.
We present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS).
arXiv Detail & Related papers (2025-03-28T11:40:53Z) - PHT-CAD: Efficient CAD Parametric Primitive Analysis with Progressive Hierarchical Tuning [52.681829043446044]
ParaCAD comprises over 10 million annotated drawings for training and 3,000 real-world industrial drawings with complex topological structures and physical constraints for testing.
PHT-CAD is a novel 2D PPA framework that harnesses the modality alignment and reasoning capabilities of Vision-Language Models.
arXiv Detail & Related papers (2025-03-23T17:24:32Z) - CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs [15.505120320280007]
This work introduces CAD-GPT, a CAD synthesis method with spatial reasoning-enhanced MLLM.
It maps 3D spatial positions and 3D sketch plane rotation angles into a 1D linguistic feature space using a specialized spatial unfolding mechanism.
It also discretizes 2D sketch coordinates into an appropriate planar space to enable precise determination of spatial starting position, sketch orientation, and 2D sketch coordinate translations.
arXiv Detail & Related papers (2024-12-27T14:19:36Z) - PS-CAD: Local Geometry Guidance via Prompting and Selection for CAD Reconstruction [86.726941702182]
We introduce geometric guidance into the reconstruction network PS-CAD.
First, we provide, as a point cloud, the geometry of the surfaces where the current reconstruction differs from the complete model.
Second, we use geometric analysis to extract a set of planar prompts that correspond to candidate surfaces.
arXiv Detail & Related papers (2024-05-24T03:43:55Z) - Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images [1.5736099356327244]
We propose to label and spot symbols from CAD images that are converted from CAD drawings.
The advantage of spotting symbols from CAD images lies in the low requirement on labelers and the low annotation cost.
Based on keypoint detection, we propose a symbol grouping method to redraw the rectangle symbols in CAD images.
arXiv Detail & Related papers (2024-04-17T01:35:52Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains, and seek a stronger ability of 3D shape generation by improving auto-regressive models at capacity and scalability simultaneously.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - Point2CAD: Reverse Engineering CAD Models from 3D Point Clouds [26.10631058349939]
We propose a hybrid analytic-neural reconstruction scheme that bridges the gap between segmented point clouds and structured CAD models.
We also propose a novel implicit neural representation of freeform surfaces, driving up the performance of our overall CAD reconstruction scheme.
arXiv Detail & Related papers (2023-12-07T08:23:44Z) - SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [87.8761593366609]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and point cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z) - Circular Accessible Depth: A Robust Traversability Representation for UGV Navigation [21.559882149457895]
Circular Accessible Depth (CAD) is a robust traversability representation for an unmanned ground vehicle (UGV).
We propose a neural network, namely CADNet, with an attention-based multi-frame point cloud fusion module to encode the spatial features from point clouds captured by LiDAR.
arXiv Detail & Related papers (2022-12-28T03:13:32Z) - Weakly-Supervised End-to-End CAD Retrieval to Scan Objects [25.41908065938424]
We propose a new weakly-supervised approach to retrieve semantically and structurally similar CAD models to a query 3D scanned scene.
Our approach leverages a fully-differentiable top-$k$ retrieval layer, enabling end-to-end training guided by geometric and perceptual similarity of the top retrieved CAD models to the scan queries.
arXiv Detail & Related papers (2022-03-24T06:30:47Z) - Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion.
Our approach is more robust than the state of the art in real-world scenarios without any exact CAD matches.
arXiv Detail & Related papers (2021-08-20T20:58:52Z) - FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting [38.987494792258694]
We present FloorPlanCAD, a large-scale real-world CAD drawing dataset containing over 10,000 floor plans.
We propose a novel method combining Graph Convolutional Networks (GCNs) with Convolutional Neural Networks (CNNs).
The proposed CNN-GCN method achieved state-of-the-art (SOTA) performance on the task of semantic symbol spotting.
arXiv Detail & Related papers (2021-05-15T06:01:11Z) - Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image.
We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose.
This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z) - CAD-Deform: Deformable Fitting of CAD Models to 3D Scans [30.451330075135076]
We introduce CAD-Deform, a method which obtains more accurate CAD-to-scan fits by non-rigidly deforming retrieved CAD models.
A series of experiments demonstrate that our method achieves significantly tighter scan-to-CAD fits, allowing a more accurate digital replica of the scanned real-world environment.
arXiv Detail & Related papers (2020-07-23T12:30:20Z) - Symbol Spotting on Digital Architectural Floor Plans Using a Deep Learning-based Framework [76.70609932823149]
This paper focuses on symbol spotting on real-world digital architectural floor plans with a deep learning (DL)-based framework.
We propose a training strategy based on tiles, avoiding many issues particular to DL-based object detection networks.
Experiments on real-world floor plans demonstrate that our method successfully detects architectural symbols with low intra-class similarity and of variable graphical complexity.
arXiv Detail & Related papers (2020-06-01T03:16:05Z) - SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans [24.06640371472068]
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors.
Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene.
arXiv Detail & Related papers (2020-03-27T20:17:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.