Related papers: Co-PLNet: A Collaborative Point-Line Network for Prompt-Guided Wireframe Parsing

URL: http://arxiv.org/abs/2601.18252v1
Date: Mon, 26 Jan 2026 08:16:02 GMT
Title: Co-PLNet: A Collaborative Point-Line Network for Prompt-Guided Wireframe Parsing
Authors: Chao Wang, Xuanying Li, Cheng Dai, Jinglei Feng, Yuxiang Luo, Yuqi Ouyang, Hao Qin,
Abstract summary: Existing methods predict lines and junctions separately and reconcile them post-hoc, causing mismatches and reduced robustness. We present Co-PLNet, a point-line collaborative framework that exchanges spatial cues between the two tasks. Experiments on Wireframe and YorkUrban show consistent improvements in accuracy and robustness, together with favorable real-time efficiency.
Score: 5.175452394220525
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Wireframe parsing aims to recover line segments and their junctions to form a structured geometric representation useful for downstream tasks such as Simultaneous Localization and Mapping (SLAM). Existing methods predict lines and junctions separately and reconcile them post-hoc, causing mismatches and reduced robustness. We present Co-PLNet, a point-line collaborative framework that exchanges spatial cues between the two tasks. Early detections are converted into spatial prompts via a Point-Line Prompt Encoder (PLP-Encoder), which encodes geometric attributes into compact and spatially aligned maps. A Cross-Guidance Line Decoder (CGL-Decoder) then refines predictions with sparse attention conditioned on complementary prompts, enforcing point-line consistency and efficiency. Experiments on Wireframe and YorkUrban show consistent improvements in accuracy and robustness, together with favorable real-time efficiency, demonstrating the method's effectiveness for structured geometry perception.
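
The abstract describes converting early detections into compact, spatially aligned prompt maps but gives no implementation details. The sketch below is only an illustration of that general idea, not the paper's PLP-Encoder: the function name `rasterize_prompts`, the three-channel layout, and the doubled-angle orientation encoding are all assumptions made for this example.

```python
import numpy as np


def rasterize_prompts(junctions, lines, height, width):
    """Hypothetical helper: rasterize detections into spatially aligned maps.

    junctions: iterable of (x, y) pixel coordinates.
    lines: iterable of (x1, y1, x2, y2) segment endpoints in pixels.
    Returns a float32 array of shape (3, height, width):
      channel 0    = junction occupancy,
      channels 1-2 = cos/sin of the doubled line angle along each segment.
    This layout is an assumption for illustration, not the paper's design.
    """
    prompt = np.zeros((3, height, width), dtype=np.float32)

    # Mark each junction that falls inside the map.
    for x, y in junctions:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:
            prompt[0, row, col] = 1.0

    # Paint each segment's orientation onto the pixels it crosses.
    for x1, y1, x2, y2 in lines:
        angle = np.arctan2(y2 - y1, x2 - x1)
        # Doubling the angle makes the encoding independent of endpoint order.
        cos2, sin2 = np.cos(2 * angle), np.sin(2 * angle)
        n_samples = int(max(abs(x2 - x1), abs(y2 - y1))) + 1
        for t in np.linspace(0.0, 1.0, n_samples):
            col = int(round(x1 + t * (x2 - x1)))
            row = int(round(y1 + t * (y2 - y1)))
            if 0 <= row < height and 0 <= col < width:
                prompt[1, row, col] = cos2
                prompt[2, row, col] = sin2
    return prompt


prompt = rasterize_prompts([(2, 3)], [(0, 0, 4, 0)], height=8, width=8)
```

In a design like the one the abstract outlines, maps of this kind would be concatenated with or attended over image features so that junction evidence can guide line refinement and vice versa; how Co-PLNet actually does this is not specified in the text above.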

Related papers

ICP-4D: Bridging Iterative Closest Point and LiDAR Panoptic Segmentation [44.68614934602709]
ICP-4D is a training-free framework that unifies spatial and temporal reasoning through geometric relations among instance-level point sets. To stabilize association under noisy instance predictions, we introduce a Sinkhorn-based soft matching. Our experiments across both SemanticKITTI and panoptic nuScenes demonstrate that our method consistently outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2025-12-22T03:13:08Z)
PVINet: Point-Voxel Interlaced Network for Point Cloud Compression [83.74785652597248]
In point cloud compression, the quality of a reconstructed point cloud relies on both the global structure and the local context. We propose a point-voxel interlaced network (PVINet), which captures global structural features and local contextual features in parallel. PVINet delivers competitive performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-09-01T03:37:32Z)
Enhancing point cloud analysis via neighbor aggregation correction based on cross-stage structure correlation [22.48120946682699]
Point cloud analysis is a cornerstone of many downstream tasks, among which aggregating local structures is the basis for understanding point cloud data. We propose the Point Distribution Set Abstraction module (PDSA) that utilizes the correlation in the high-dimensional space to correct the feature distribution during aggregation. PDSA distinguishes the point correlation based on a lightweight cross-stage structural descriptor, and enhances structural homogeneity.
arXiv Detail & Related papers (2025-06-18T06:08:17Z)
Hierarchical Attention Networks for Lossless Point Cloud Attribute Compression [22.234604407822673]
We propose a deep hierarchical attention context model for attribute compression of point clouds. A simple and effective Level of Detail (LoD) structure is introduced to yield a coarse-to-fine representation. Points within the same refinement level are encoded in parallel, sharing a common context point group.
arXiv Detail & Related papers (2025-04-01T07:14:10Z)
Fully-Geometric Cross-Attention for Point Cloud Registration [51.865371511201765]
Point cloud registration approaches often fail when the overlap between point clouds is low due to noisy point correspondences. This work introduces a novel cross-attention mechanism tailored for Transformer-based architectures that tackles this problem. We integrate the Gromov-Wasserstein distance into the cross-attention formulation to jointly compute distances between points across different point clouds. At the point level, we also devise a self-attention mechanism that aggregates the local geometric structure information into point features for fine matching.
arXiv Detail & Related papers (2025-02-12T10:44:36Z)
Local All-Pair Correspondence for Point Tracking [59.76186266230608]
We introduce LocoTrack, a highly accurate and efficient model designed for the task of tracking any point (TAP) across video sequences. LocoTrack achieves unmatched accuracy on all TAP-Vid benchmarks and operates at a speed almost 6 times faster than the current state-of-the-art.
arXiv Detail & Related papers (2024-07-22T06:49:56Z)
COMO: Compact Mapping and Odometry [17.71754144808295]
We present COMO, a real-time monocular mapping and odometry system that encodes dense geometry via a compact set of 3D anchor points. The representation enables joint optimization of camera poses and dense geometry, intrinsic 3D consistency, and efficient second-order inference.
arXiv Detail & Related papers (2024-04-04T15:35:43Z)
PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds [79.99653758293277]
PCAM is a neural network whose key element is a pointwise product of cross-attention matrices. We show that PCAM achieves state-of-the-art results among methods which, like us, solve steps (a) and (b) jointly via deepnets.
arXiv Detail & Related papers (2021-10-04T09:23:27Z)
ELSD: Efficient Line Segment Detector and Descriptor [9.64386089593887]
We present the novel Efficient Line Segment Detector and Descriptor (ELSD) to simultaneously detect line segments and extract their descriptors in an image. ELSD provides essential line features to higher-level tasks such as SLAM and image matching in real time. In the experiments, the proposed ELSD achieves state-of-the-art performance on the Wireframe and YorkUrban datasets.
arXiv Detail & Related papers (2021-04-29T08:53:03Z)
SOLD2: Self-supervised Occlusion-aware Line Description and Detection [95.8719432775724]
We introduce the first joint detection and description of line segments in a single deep network. Our method does not require any annotated line labels and can therefore generalize to any dataset. We evaluate our approach against previous line detection and description methods on several multi-view datasets.
arXiv Detail & Related papers (2021-04-07T19:27:17Z)
Holistically-Attracted Wireframe Parsing [123.58263152571952]
This paper presents a fast and parsimonious parsing method to detect a vectorized wireframe in an input image with a single forward pass. The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification.
arXiv Detail & Related papers (2020-03-03T17:43:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.