Related papers: 3D Can Be Explored In 2D: Pseudo-Label Generation for LiDAR Point Clouds Using Sensor-Intensity-Based 2D Semantic Segmentation

3D Can Be Explored In 2D: Pseudo-Label Generation for LiDAR Point Clouds Using Sensor-Intensity-Based 2D Semantic Segmentation

URL: http://arxiv.org/abs/2505.03300v1
Date: Tue, 06 May 2025 08:31:32 GMT
Title: 3D Can Be Explored In 2D: Pseudo-Label Generation for LiDAR Point Clouds Using Sensor-Intensity-Based 2D Semantic Segmentation
Authors: Andrew Caunes, Thierry Chateau, Vincent Frémont,
Abstract summary: We introduce a new 3D semantic segmentation pipeline that leverages aligned scenes and state-of-the-art 2D segmentation methods.<n>Our approach generates 2D views from LiDAR scans colored by sensor intensity and applies 2D semantic segmentation to these views.<n>The segmented 2D outputs are then back-projected onto the 3D points, with a simple voting-based estimator.
Score: 3.192308005611312
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Semantic segmentation of 3D LiDAR point clouds, essential for autonomous driving and infrastructure management, is best achieved by supervised learning, which demands extensive annotated datasets and faces the problem of domain shifts. We introduce a new 3D semantic segmentation pipeline that leverages aligned scenes and state-of-the-art 2D segmentation methods, avoiding the need for direct 3D annotation or reliance on additional modalities such as camera images at inference time. Our approach generates 2D views from LiDAR scans colored by sensor intensity and applies 2D semantic segmentation to these views using a camera-domain pretrained model. The segmented 2D outputs are then back-projected onto the 3D points, with a simple voting-based estimator that merges the labels associated to each 3D point. Our main contribution is a global pipeline for 3D semantic segmentation requiring no prior 3D annotation and not other modality for inference, which can be used for pseudo-label generation. We conduct a thorough ablation study and demonstrate the potential of the generated pseudo-labels for the Unsupervised Domain Adaptation task.

Related papers

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance [72.6809373191638]
We propose a framework to study how to leverage constraints between 2D and 3D domains without requiring any 3D labels. Specifically, we design a feature-level constraint to align LiDAR and image features based on object-aware regions. Second, the output-level constraint is developed to enforce the overlap between 2D and projected 3D box estimations. Third, the training-level constraint is utilized by producing accurate and consistent 3D pseudo-labels that align with the visual data.
arXiv Detail & Related papers (2023-12-12T18:57:25Z)
DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations. We leverage the strong semantic prior within a 3D generative model to train a semantic decoder. Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
Cross-modal & Cross-domain Learning for Unsupervised LiDAR Semantic Segmentation [82.47872784972861]
Cross-modal domain adaptation has been studied on the paired 2D image and 3D LiDAR data to ease the labeling costs for 3D LiDAR semantic segmentation (3DLSS) in the target domain. This paper studies a new 3DLSS setting where a 2D dataset with semantic annotations and a paired but unannotated 2D image and 3D LiDAR data (target) are available. To achieve 3DLSS in this scenario, we propose Cross-Modal and Cross-Domain Learning (CoMoDaL)
arXiv Detail & Related papers (2023-08-05T14:00:05Z)
Multi-View Representation is What You Need for Point-Cloud Pre-Training [22.55455166875263]
This paper proposes a novel approach to point-cloud pre-training that learns 3D representations by leveraging pre-trained 2D networks. We train the 3D feature extraction network with the help of the novel 2D knowledge transfer loss. Experimental results demonstrate that our pre-trained model can be successfully transferred to various downstream tasks.
arXiv Detail & Related papers (2023-06-05T03:14:54Z)
Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch from two aspects. First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation. Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance (LWSIS) LWSIS uses the off-the-shelf 3D data, i.e., Point Cloud, together with the 3D boxes, as natural weak supervisions for training the 2D image instance segmentation models. Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the cost of the dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z)
Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation [48.677336052620895]
We present a novel 3D-to-2D label transfer method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and instance labels. By inferring in 3D space and rendering to 2D labels, our 2D semantic and instance labels are multi-view consistent by design.
arXiv Detail & Related papers (2022-03-29T04:16:40Z)
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution. A natural remedy is to utilize the 3D voxelization and 3D convolution network. We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
FGR: Frustum-Aware Geometric Reasoning for Weakly Supervised 3D Vehicle Detection [81.79171905308827]
We propose frustum-aware geometric reasoning (FGR) to detect vehicles in point clouds without any 3D annotations. Our method consists of two stages: coarse 3D segmentation and 3D bounding box estimation. It is able to accurately detect objects in 3D space with only 2D bounding boxes and sparse point clouds.
arXiv Detail & Related papers (2021-05-17T07:29:55Z)
3D Guided Weakly Supervised Semantic Segmentation [27.269847900950943]
We propose a weakly supervised 2D semantic segmentation model by incorporating sparse bounding box labels with available 3D information. We manually labeled a subset of the 2D-3D Semantics(2D-3D-S) dataset with bounding boxes, and introduce our 2D-3D inference module to generate accurate pixel-wise segment proposal masks.
arXiv Detail & Related papers (2020-12-01T03:34:15Z)
Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space. A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space. We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.