Related papers: Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings

URL: http://arxiv.org/abs/2505.23395v1
Date: Thu, 29 May 2025 12:33:11 GMT
Title: Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings
Authors: Xingguang Wei, Haomin Wang, Shenglong Ye, Ruifeng Luo, Yanting Zhang, Lixin Gu, Jifeng Dai, Yu Qiao, Wenhai Wang, Hongjie Zhang
Abstract summary: We study the task of panoptic symbol spotting in computer-aided design (CAD) drawings composed of vector graphical primitives. Existing methods typically rely on image rasterization, graph construction, or point-based representation. We propose VecFormer, a novel method that addresses these challenges through line-based representation of primitives.
Score: 45.116136045440584
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the task of panoptic symbol spotting, which involves identifying both individual instances of countable things and the semantic regions of uncountable stuff in computer-aided design (CAD) drawings composed of vector graphical primitives. Existing methods typically rely on image rasterization, graph construction, or point-based representation, but these approaches often suffer from high computational costs, limited generality, and loss of geometric structural information. In this paper, we propose VecFormer, a novel method that addresses these challenges through line-based representation of primitives. This design preserves the geometric continuity of the original primitive, enabling more accurate shape representation while maintaining a computation-friendly structure, making it well-suited for vector graphic understanding tasks. To further enhance prediction reliability, we introduce a Branch Fusion Refinement module that effectively integrates instance and semantic predictions, resolving their inconsistencies for more coherent panoptic outputs. Extensive experiments demonstrate that our method establishes a new state-of-the-art, achieving 91.1 PQ, with Stuff-PQ improved by 9.6 and 21.2 points over the second-best results under settings with and without prior information, respectively, highlighting the strong potential of line-based representation as a foundation for vector graphic understanding.

Related papers

"Principal Components" Enable A New Language of Images [79.45806370905775]
We introduce a novel visual tokenization framework that embeds a provable PCA-like structure into the latent token space. Our approach achieves state-of-the-art reconstruction performance and enables better interpretability to align with the human vision system.
arXiv Detail & Related papers (2025-03-11T17:59:41Z)
Enhancing Polygonal Building Segmentation via Oriented Corners [0.3749861135832072]
This paper introduces a novel deep convolutional neural network named OriCornerNet, which directly extracts delineated building polygons from input images. Our approach involves a deep model that predicts building footprint masks, corners, and orientation vectors that indicate directions toward adjacent corners. Performance evaluations conducted on SpaceNet Vegas and CrowdAI-small datasets demonstrate the competitive efficacy of our approach.
arXiv Detail & Related papers (2024-07-17T01:59:06Z)
Graph-level Representation Learning with Joint-Embedding Predictive Architectures [43.89120279424267]
Joint-Embedding Predictive Architectures (JEPAs) have emerged as a novel and powerful technique for self-supervised representation learning. We show that graph-level representations can be effectively modeled using this paradigm by proposing a Graph Joint-Embedding Predictive Architecture (Graph-JEPA). In particular, we employ masked modeling and focus on predicting the latent representations of masked subgraphs starting from the latent representation of a context subgraph.
arXiv Detail & Related papers (2023-09-27T20:42:02Z)
GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton. The model we propose tackles the problem of generating the data features constrained by the specific graph structure of each data point by splitting the task into two phases. In the first it models the distribution of features associated with the nodes of the given graph, in the second it complements the edge features conditionally on the node features.
arXiv Detail & Related papers (2022-12-01T11:49:07Z)
Template based Graph Neural Network with Optimal Transport Distances [11.56532171513328]
Current Graph Neural Networks (GNN) architectures rely on two important components: node features embedding through message passing, and aggregation with a specialized form of pooling. We propose in this work a novel point of view, which places distances to some learnable graph templates at the core of the graph representation. This distance embedding is constructed thanks to an optimal transport distance: the Fused Gromov-Wasserstein (FGW) distance.
arXiv Detail & Related papers (2022-05-31T12:24:01Z)
Harnessing spectral representations for subgraph alignment [15.86857474914914]
We propose a spectral representation for maps that is compact, easy to compute, robust to topological changes, easy to plug into existing pipelines, and is especially effective for subgraph alignment problems. We report for the first time a surprising phenomenon where the partiality arising in the subgraph alignment task is manifested as a special structure of the map coefficients.
arXiv Detail & Related papers (2022-05-30T09:03:28Z)
GAT-CADNet: Graph Attention Network for Panoptic Symbol Spotting in CAD Drawings [0.0]
Spotting graphical symbols in computer-aided design (CAD) drawings is essential to many industrial applications. By treating each CAD drawing as a graph, we propose a novel graph attention network, GAT-CADNet. The proposed GAT-CADNet is intuitive yet effective and solves the panoptic symbol spotting problem in one consolidated network.
arXiv Detail & Related papers (2022-01-03T13:08:28Z)
Dual Geometric Graph Network (DG2N) -- Iterative network for deformable shape alignment [8.325327265120283]
We provide a novel approach for aligning geometric models using a dual graph structure where local features are mapping probabilities. We report state-of-the-art results on the alignment of stretchable domains, with a rapid and stable solution for meshes and point clouds.
arXiv Detail & Related papers (2020-11-30T12:03:28Z)
Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning. Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector. We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
Primal-Dual Mesh Convolutional Neural Networks [62.165239866312334]
We apply a primal-dual framework drawn from the graph-neural-network literature to triangle meshes. Our method takes features for both edges and faces of a 3D mesh as input and dynamically aggregates them. We provide theoretical insights into our approach using tools from the mesh-simplification literature.
arXiv Detail & Related papers (2020-10-23T14:49:02Z)
Quiver Signal Processing (QSP) [145.6921439353007]
We state the basics for a signal processing framework on quiver representations. We propose a signal processing framework that allows us to handle heterogeneous multidimensional information in networks.
arXiv Detail & Related papers (2020-10-22T08:40:15Z)
