SynthmanticLiDAR: A Synthetic Dataset for Semantic Segmentation on LiDAR Imaging
- URL: http://arxiv.org/abs/2501.19035v1
- Date: Fri, 31 Jan 2025 11:09:10 GMT
- Title: SynthmanticLiDAR: A Synthetic Dataset for Semantic Segmentation on LiDAR Imaging
- Authors: Javier Montalvo, Pablo Carballeira, Álvaro García-Martín
- Abstract summary: We present a modified CARLA simulator designed with LiDAR semantic segmentation in mind. We have generated SynthmanticLiDAR, a synthetic dataset for semantic segmentation on LiDAR imaging. Our results show that incorporating SynthmanticLiDAR into the training process improves the overall performance of tested algorithms.
- Score: 8.193070135759717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation on LiDAR imaging is increasingly gaining attention, as it can provide useful knowledge for perception systems and potential for autonomous driving. However, collecting and labeling real LiDAR data is an expensive and time-consuming task. While datasets such as SemanticKITTI have been manually collected and labeled, the introduction of simulation tools such as CARLA has enabled the creation of synthetic datasets on demand. In this work, we present a modified CARLA simulator designed with LiDAR semantic segmentation in mind, with new classes, object labeling that is more consistent with real datasets such as SemanticKITTI, and the possibility to adjust the object class distribution. Using this tool, we have generated SynthmanticLiDAR, a synthetic dataset for semantic segmentation on LiDAR imaging, designed to be similar to SemanticKITTI, and we evaluate its contribution to the training process of different semantic segmentation algorithms by using a naive transfer learning approach. Our results show that incorporating SynthmanticLiDAR into the training process improves the overall performance of tested algorithms, proving the usefulness of our dataset and, therefore, of our adapted CARLA simulator. The dataset and simulator are available at https://github.com/vpulab/SynthmanticLiDAR.
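The abstract mentions adjusting the object class distribution so that the synthetic data resembles SemanticKITTI. As a rough illustration only (this is not the authors' tooling), the sketch below compares the per-class point distribution of two label sets. It assumes SemanticKITTI-style packed labels, where the lower 16 bits of each 32-bit value hold the semantic class and the upper 16 bits hold the instance id; the function names and the toy label values are hypothetical.

```python
from collections import Counter


def class_distribution(packed_labels):
    """Return {semantic_class_id: fraction_of_points}.

    Assumes SemanticKITTI-style packing: lower 16 bits are the semantic
    class, upper 16 bits are the instance id (which is ignored here).
    """
    counts = Counter(label & 0xFFFF for label in packed_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in sorted(counts.items())}


def distribution_gap(dist_a, dist_b):
    """Total variation distance between two class distributions.

    0.0 means identical distributions, 1.0 means no overlap at all.
    """
    classes = set(dist_a) | set(dist_b)
    return 0.5 * sum(
        abs(dist_a.get(c, 0.0) - dist_b.get(c, 0.0)) for c in classes
    )


# Toy labels, made up for illustration (not taken from either dataset).
synthetic_labels = [10, 10, 40, (3 << 16) | 10]  # last point: instance 3, class 10
real_labels = [10, 40, 40, 40]

synthetic_dist = class_distribution(synthetic_labels)
real_dist = class_distribution(real_labels)
gap = distribution_gap(synthetic_dist, real_dist)
```

A smaller gap would suggest the synthetic class mix is closer to the real one; how the simulator actually adjusts its distribution is described in the paper and repository, not here.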
Related papers
- Private Training & Data Generation by Clustering Embeddings [74.00687214400021]
Differential privacy (DP) provides a robust framework for protecting individual data. We introduce a novel principled method for DP synthetic image embedding generation. Empirically, a simple two-layer neural network trained on synthetically generated embeddings achieves state-of-the-art (SOTA) classification accuracy.
arXiv Detail & Related papers (2025-06-20T00:17:14Z) - Enhancing Generalization via Sharpness-Aware Trajectory Matching for Dataset Condensation [37.77454972709646]
We introduce Sharpness-Aware Trajectory Matching (SATM), which enhances the generalization capability of learned synthetic datasets.
Our approach is mathematically well-supported and straightforward to implement along with controllable computational overhead.
arXiv Detail & Related papers (2025-02-03T22:30:06Z) - LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes [55.33167217384738]
LiMoE is a framework that integrates the Mixture of Experts (MoE) paradigm into LiDAR data representation learning. Our approach consists of three stages: Image-to-LiDAR Pretraining, Contrastive Mixture Learning (CML), and Semantic Mixture Supervision (SMS).
arXiv Detail & Related papers (2025-01-07T18:59:58Z) - EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models [39.347666307218006]
Large language models (LLMs) have demonstrated remarkable in-context learning capabilities across diverse applications. We introduce EPIC, a novel approach that leverages balanced, grouped data samples and consistent formatting with unique variable mapping to guide LLMs in generating accurate synthetic data across all classes, even for imbalanced datasets.
arXiv Detail & Related papers (2024-04-15T17:49:16Z) - Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection? [12.987587227876565]
We investigate the effectiveness of synthetic data in enhancing egocentric hand-object interaction detection.
By leveraging only 10% of real labeled data, we achieve improvements in Overall AP compared to baselines trained exclusively on real data.
arXiv Detail & Related papers (2023-12-05T11:29:00Z) - Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR).
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z) - Extracting Semantic Knowledge from GANs with Unsupervised Learning [65.32631025780631]
Generative Adversarial Networks (GANs) encode semantics in feature maps in a linearly separable form.
We propose a novel clustering algorithm, named KLiSH, which leverages the linear separability to cluster GAN's features.
KLiSH succeeds in extracting fine-grained semantics of GANs trained on datasets of various objects.
arXiv Detail & Related papers (2022-11-30T03:18:16Z) - Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - Validation of Simulation-Based Testing: Bypassing Domain Shift with Label-to-Image Synthesis [9.531148049378672]
We propose a novel framework consisting of a generative label-to-image synthesis model together with different transferability measures.
We validate our approach empirically on a semantic segmentation task on driving scenes.
Although the latter can distinguish between real-life and synthetic tests, in the former we observe surprisingly strong correlations of 0.7 for both cars and pedestrians.
arXiv Detail & Related papers (2021-06-10T07:23:58Z) - Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation [28.874162427052905]
We investigate the effectiveness of "arbitrary transfer sets" such as random noise, publicly available synthetic, and natural datasets.
We find that using arbitrary data to conduct knowledge distillation is surprisingly effective when the dataset is "target-class balanced".
arXiv Detail & Related papers (2020-11-18T06:33:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.