Related papers: Symbol as Points: Panoptic Symbol Spotting via Point-based Representation

URL: http://arxiv.org/abs/2401.10556v1
Date: Fri, 19 Jan 2024 08:44:52 GMT
Title: Symbol as Points: Panoptic Symbol Spotting via Point-based Representation
Authors: Wenlong Liu, Tianyu Yang, Yuhan Wang, Qizhi Yu, Lei Zhang
Abstract summary: This work studies the problem of panoptic symbol spotting in computer-aided design (CAD) drawings. We take a different approach, which treats graphic primitives as a set of 2D points that are locally connected. Specifically, we utilize a point transformer to extract the primitive features and append a mask2former-like spotting head to predict the final output.
Score: 18.61469313164712
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This work studies the problem of panoptic symbol spotting, which is to spot and parse both countable object instances (windows, doors, tables, etc.) and uncountable stuff (wall, railing, etc.) from computer-aided design (CAD) drawings. Existing methods typically involve either rasterizing the vector graphics into images and using image-based methods for symbol spotting, or directly building graphs and using graph neural networks for symbol recognition. In this paper, we take a different approach, which treats graphic primitives as a set of 2D points that are locally connected and uses point cloud segmentation methods to tackle the problem. Specifically, we utilize a point transformer to extract the primitive features and append a mask2former-like spotting head to predict the final output. To better use the local connection information of primitives and enhance their discriminability, we further propose the attention with connection module (ACM) and the contrastive connection learning scheme (CCL). Finally, we propose a KNN interpolation mechanism for the mask attention module of the spotting head to better handle primitive mask downsampling, which operates at the primitive level rather than the pixel level used for images. Our approach, named SymPoint, is simple yet effective, outperforming the recent state-of-the-art method GAT-CADNet by an absolute increase of 9.6% PQ and 10.4% RQ on the FloorPlanCAD dataset. The source code and models will be available at https://github.com/nicehuster/SymPoint.
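
For readers unfamiliar with the PQ and RQ figures quoted in the abstract, the following is a minimal sketch of how panoptic quality is commonly computed in the image panoptic segmentation literature: PQ is the product of segmentation quality (SQ, mean IoU over matched pairs) and recognition quality (RQ, an F1-style detection score). The function name `panoptic_quality`, its arguments, and the 0.5 IoU threshold are assumptions made for illustration. The FloorPlanCAD benchmark defines its own primitive-level IoU, which is not reproduced here, and this is not the SymPoint evaluation code.

```python
def panoptic_quality(pair_ious, num_pred, num_gt, iou_threshold=0.5):
    """Compute PQ, SQ and RQ for one class (illustrative sketch).

    pair_ious: IoU of each candidate one-to-one (prediction, ground truth) pair.
    num_pred:  total number of predicted instances.
    num_gt:    total number of ground-truth instances.
    """
    # A pair counts as a true positive only if its IoU exceeds the threshold.
    tp_ious = [iou for iou in pair_ious if iou > iou_threshold]
    tp = len(tp_ious)
    fp = num_pred - tp  # predictions without a matching ground truth
    fn = num_gt - tp    # ground-truth instances that were missed

    denom = tp + 0.5 * fp + 0.5 * fn
    if denom == 0:
        return {"PQ": 0.0, "SQ": 0.0, "RQ": 0.0}

    rq = tp / denom                        # recognition quality
    sq = sum(tp_ious) / tp if tp else 0.0  # segmentation quality
    return {"PQ": sq * rq, "SQ": sq, "RQ": rq}


if __name__ == "__main__":
    # Hypothetical numbers: 4 predictions, 3 ground-truth symbols, 3 candidate pairs.
    print(panoptic_quality([0.9, 0.8, 0.3], num_pred=4, num_gt=3))
```

Because PQ factors into SQ and RQ, a gain in RQ such as the one reported above indicates that more symbols were correctly detected, independently of how tightly each matched symbol was segmented.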

Related papers

Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection [57.883265488038134]
We propose a hierarchical graph interaction network termed HGINet for camouflaged object detection. The network is capable of discovering imperceptible objects via effective graph interaction among the hierarchical tokenized features. Our experiments demonstrate the superior performance of HGINet compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-08-27T12:53:25Z)
Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images [1.5736099356327244]
We propose to label and spot symbols from CAD images that are converted from CAD drawings. The advantage of spotting symbols from CAD images lies in the low demands placed on labelers and the low annotation cost. Based on keypoint detection, we propose a symbol grouping method to redraw the rectangle symbols in CAD images.
arXiv Detail & Related papers (2024-04-17T01:35:52Z)
Surface Reconstruction from Point Clouds via Grid-based Intersection Prediction [12.329450385760051]
We propose a novel approach that directly predicts the intersection points between the line segments of point pairs and implicit surfaces. Our approach demonstrates state-of-the-art performance on three datasets: ShapeNet, MGN, and ScanNet.
arXiv Detail & Related papers (2024-03-21T02:31:17Z)
Efficient Encoding of Graphics Primitives with Simplex-based Structures [0.8158530638728501]
We propose a simplex-based approach for encoding graphics primitives. In the 2D image fitting task, the proposed method fits an image in 9.4% less time than the baseline method.
arXiv Detail & Related papers (2023-11-26T21:53:22Z)
What Can Human Sketches Do for Object Detection? [127.67444974452411]
Sketches are highly expressive, inherently capturing subjective and fine-grained visual cues. A sketch-enabled object detection framework detects based on what you sketch -- that "zebra". We show an intuitive synergy between foundation models (e.g., CLIP) and existing sketch models built for sketch-based image retrieval (SBIR). In particular, we first perform independent prompting on both sketch branches of an encoder model to build highly generalisable sketch and photo encoders.
arXiv Detail & Related papers (2023-03-27T12:33:23Z)
Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision. A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive. We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
GAT-CADNet: Graph Attention Network for Panoptic Symbol Spotting in CAD Drawings [0.0]
Spotting graphical symbols from the computer-aided design (CAD) drawings is essential to many industrial applications. By treating each CAD drawing as a graph, we propose a novel graph attention network GAT-CADNet. The proposed GAT-CADNet is intuitive yet effective and manages to solve the panoptic symbol spotting problem in one consolidated network.
arXiv Detail & Related papers (2022-01-03T13:08:28Z)
Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping. In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN). The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
Spatiotemporal Graph Neural Network based Mask Reconstruction for Video Object Segmentation [70.97625552643493]
This paper addresses the task of segmenting class-agnostic objects in a semi-supervised setting. We propose a novel graph neural network (TG-Net) which captures the local contexts by utilizing all proposals.
arXiv Detail & Related papers (2020-12-10T07:57:44Z)
LCD -- Line Clustering and Description for Place Recognition [29.053923938306323]
We introduce a novel learning-based approach to place recognition, using RGB-D cameras and line clusters as visual and geometric features. We present a neural network architecture based on the attention mechanism for frame-wise line clustering. A similar neural network is used for the description of these clusters with a compact embedding of 128 floating point numbers.
arXiv Detail & Related papers (2020-10-21T09:52:47Z)
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment. Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.