FloorSAM: SAM-Guided Floorplan Reconstruction with Semantic-Geometric Fusion
- URL: http://arxiv.org/abs/2509.15750v1
- Date: Fri, 19 Sep 2025 08:27:10 GMT
- Title: FloorSAM: SAM-Guided Floorplan Reconstruction with Semantic-Geometric Fusion
- Authors: Han Ye, Haofu Wang, Yunchi Zhang, Jiangjian Xiao, Yuqiang Jin, Jinyuan Liu, Wen-An Zhang, Uladzislau Sychou, Alexander Tuzikov, Vladislav Sobolevskii, Valerii Zakharov, Boris Sokolov, Minglei Fu
- Abstract summary: We propose FloorSAM, a framework that integrates point cloud density maps with the Segment Anything Model (SAM) for accurate floor plan reconstruction from LiDAR data. Using grid-based filtering, adaptive resolution projection, and image enhancement, we create robust top-down density maps. Tests on Giblayout and ISPRS datasets show better accuracy, recall, and robustness than traditional methods, especially in noisy and complex settings.
- Score: 38.44214429214816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing building floor plans from point cloud data is key for indoor navigation, BIM, and precise measurements. Traditional methods like geometric algorithms and Mask R-CNN-based deep learning often face issues with noise, limited generalization, and loss of geometric details. We propose FloorSAM, a framework that integrates point cloud density maps with the Segment Anything Model (SAM) for accurate floor plan reconstruction from LiDAR data. Using grid-based filtering, adaptive resolution projection, and image enhancement, we create robust top-down density maps. FloorSAM uses SAM's zero-shot learning for precise room segmentation, improving reconstruction across diverse layouts. Room masks are generated via adaptive prompt points and multistage filtering, followed by joint mask and point cloud analysis for contour extraction and regularization. This produces accurate floor plans and recovers room topological relationships. Tests on Giblayout and ISPRS datasets show better accuracy, recall, and robustness than traditional methods, especially in noisy and complex settings. Code and materials: github.com/Silentbarber/FloorSAM.
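The density-map step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple occupancy-count projection onto the XY plane; the names `density_map`, `adaptive_cell_size`, `cell_size`, `min_points_per_cell`, and `target_pixels` are hypothetical and are not taken from the FloorSAM code, whose actual filtering, resolution selection, and enhancement steps may differ.

```python
import numpy as np


def adaptive_cell_size(points, target_pixels=1024):
    """Pick a cell size so the longer XY side spans about target_pixels cells.

    Assumption: this is one plausible reading of "adaptive resolution",
    not the rule used by the paper.
    """
    xy = np.asarray(points, dtype=float)[:, :2]
    extent = xy.max(axis=0) - xy.min(axis=0)
    return max(float(extent.max()) / target_pixels, 1e-6)


def density_map(points, cell_size=0.05, min_points_per_cell=1):
    """Project an (N, 3) point cloud to a top-down 8-bit density image."""
    xy = np.asarray(points, dtype=float)[:, :2]
    origin = xy.min(axis=0)
    # Integer grid index (column, row) of every point.
    idx = np.floor((xy - origin) / cell_size).astype(int)
    width, height = idx.max(axis=0) + 1
    counts = np.zeros((height, width), dtype=np.int64)
    # Unbuffered accumulation so repeated indices are all counted.
    np.add.at(counts, (idx[:, 1], idx[:, 0]), 1)
    # Grid-based filtering: drop sparsely populated cells as noise.
    counts[counts < min_points_per_cell] = 0
    peak = counts.max()
    if peak == 0:
        return np.zeros_like(counts, dtype=np.uint8)
    # Scale counts to the 0..255 range expected by image models.
    return (255 * counts / peak).astype(np.uint8)
```

An image produced this way could then be passed to a segmentation model such as SAM together with prompt points; that stage is outside the scope of this sketch.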
Related papers
- Towards Localizing Structural Elements: Merging Geometrical Detection with Semantic Verification in RGB-D Data [0.0]
This paper presents a real-time pipeline for localizing building components, including wall and ground surfaces, by integrating geometric calculations for pure 3D plane detection.
It has a parallel multi-thread architecture to precisely estimate poses and equations of all the planes detected in the environment, filters the ones forming the map structure using a panoptic segmentation validation, and keeps only the validated building components.
It can also ensure (re-)association of these detected components into a unified 3D scene graph, bridging the gap between geometric accuracy and semantic understanding.
arXiv Detail & Related papers (2024-09-10T16:28:09Z)
- Unsupervised 3D Point Cloud Completion via Multi-view Adversarial Learning [61.14132533712537]
We propose MAL-UPC, a framework that effectively leverages both region-level and category-specific geometric similarities to complete missing structures. Our MAL-UPC does not require any 3D complete supervision and only necessitates single-view partial observations in the training set.
arXiv Detail & Related papers (2024-07-13T06:53:39Z)
- A Hybrid Semantic-Geometric Approach for Clutter-Resistant Floorplan Generation from Building Point Clouds [2.0859227544921874]
This research proposes a hybrid semantic-geometric approach for clutter-resistant floorplan generation from laser-scanned building point clouds.
The proposed method is evaluated using the metrics of precision, recall, Intersection-over-Union (IOU), Betti error, and warping error.
arXiv Detail & Related papers (2023-05-15T20:08:43Z)
- Structure PLP-SLAM: Efficient Sparse Mapping and Localization using Point, Line and Plane for Monocular, RGB-D and Stereo Cameras [13.693353009049773]
This paper demonstrates a visual SLAM system that uses point and line clouds for robust camera localization, together with an embedded piece-wise planar reconstruction (PPR) module.
We address the challenge of reconstructing geometric primitives with scale ambiguity by proposing several run-time optimizations on the reconstructed lines and planes.
The results show that our proposed SLAM tightly incorporates the semantic features to boost both tracking as well as backend optimization.
arXiv Detail & Related papers (2022-07-13T09:05:35Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Efficient 3D Deep LiDAR Odometry [16.388259779644553]
An efficient 3D point cloud learning architecture, named PWCLO-Net, is first proposed in this paper.
The entire architecture is holistically optimized end-to-end to achieve adaptive learning of cost volume and mask.
arXiv Detail & Related papers (2021-11-03T11:09:49Z)
- SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z)
- PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization [17.90299648470637]
A novel 3D point cloud learning model for deep LiDAR odometry, named PWCLO-Net, is proposed in this paper.
In this model, the Pyramid, Warping, and Cost volume structure for the LiDAR odometry task is built to refine the estimated pose in a coarse-to-fine approach hierarchically.
Our method outperforms all recent learning-based methods and outperforms the geometry-based approach, LOAM with mapping optimization, on most sequences of KITTI odometry dataset.
arXiv Detail & Related papers (2020-12-02T05:23:41Z)
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking [54.58791377183574]
Our novel hybrid recurrent multi-view stereo net consists of two core modules: 1) a light DRENet (Dense Reception Expanded) module to extract dense feature maps of original size with multi-scale context information, and 2) a HU-LSTM (Hybrid U-LSTM) to regularize the 3D matching volume into a predicted depth map.
Our method exhibits competitive performance to the state-of-the-art method while dramatically reducing memory consumption, requiring only 19.4% of the memory used by R-MVSNet.
arXiv Detail & Related papers (2020-07-21T14:59:59Z)
- Refined Plane Segmentation for Cuboid-Shaped Objects by Leveraging Edge Detection [63.942632088208505]
We propose a post-processing algorithm to align the segmented plane masks with edges detected in the image.
This allows us to increase the accuracy of state-of-the-art approaches, while limiting ourselves to cuboid-shaped objects.
arXiv Detail & Related papers (2020-03-28T18:51:43Z)