Text2LiDAR: Text-guided LiDAR Point Cloud Generation via Equirectangular Transformer
- URL: http://arxiv.org/abs/2407.19628v1
- Date: Mon, 29 Jul 2024 01:18:47 GMT
- Title: Text2LiDAR: Text-guided LiDAR Point Cloud Generation via Equirectangular Transformer
- Authors: Yang Wu, Kaihua Zhang, Jianjun Qian, Jin Xie, Jian Yang
- Abstract summary: We propose Text2LiDAR, the first efficient, diverse, and text-controllable LiDAR data generation model.
We design an equirectangular transformer architecture, utilizing the designed equirectangular attention to capture LiDAR features.
We construct nuLiDARtext which offers diverse text descriptors for 34,149 LiDAR point clouds from 850 scenes.
- Score: 38.18396501696647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The complex traffic environment and varied weather conditions make LiDAR data collection expensive and challenging, so high-quality, controllable LiDAR data generation is urgently needed. Controlling generation with text is common practice in other domains, yet it remains little explored for LiDAR. To this end, we propose Text2LiDAR, the first efficient, diverse, and text-controllable LiDAR data generation model. Specifically, we design an equirectangular transformer architecture that uses the designed equirectangular attention to capture LiDAR features in a manner suited to the data's characteristics. We then design a control-signal embedding injector to efficiently integrate control signals through a global-to-focused attention mechanism. Additionally, we devise a frequency modulator that helps the model recover high-frequency details, ensuring the clarity of the generated point clouds. To foster development in the field and optimize text-controlled generation performance, we construct nuLiDARtext, which offers diverse text descriptors for 34,149 LiDAR point clouds from 850 scenes. Experiments on uncontrolled and text-controlled generation in various forms on the KITTI-360 and nuScenes datasets demonstrate the superiority of our approach.
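The abstract names an equirectangular attention mechanism but includes no code. As an illustration only, the sketch below shows one way attention over an equirectangular range image could respect the 360-degree wrap-around of the azimuth (width) axis; the function names, shapes, and the distance-based bias are assumptions for the sketch, not the paper's actual design.

```python
# Hypothetical sketch: attention over one row (elevation ring) of an
# equirectangular LiDAR range image, with a bias that treats the azimuth
# axis as circular. All names and shapes are assumptions.
import numpy as np

def circular_azimuth_bias(w, scale=1.0):
    """Relative-position bias respecting the 360-degree wrap-around of
    the azimuth (width) axis: column 0 and column w-1 are neighbors."""
    idx = np.arange(w)
    diff = np.abs(idx[:, None] - idx[None, :])
    diff = np.minimum(diff, w - diff)   # circular distance
    return -scale * diff                # nearer columns attend more

def equirect_row_attention(x, scale=0.1):
    """Self-attention along the azimuth axis of a single ring.
    x: (W, C) features; returns (W, C)."""
    w, c = x.shape
    q, k, v = x, x, x                   # untrained sketch: shared projection
    logits = q @ k.T / np.sqrt(c) + circular_azimuth_bias(w, scale)
    logits -= logits.max(axis=-1, keepdims=True)   # numerically stable softmax
    attn = np.exp(logits)
    attn /= attn.sum(axis=-1, keepdims=True)
    return attn @ v
```

Because of the circular bias, features at the left and right edges of the range image attend to each other as immediate neighbors, matching the physical continuity of a 360-degree scan.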
Related papers
- LiDARDraft: Generating LiDAR Point Cloud from Versatile Inputs [16.062937048950946]
We propose LiDARDraft to generate realistic and diverse LiDAR point clouds.
The 3D layout can be trivially generated from various user inputs.
We employ a rangemap-based ControlNet to guide LiDAR point cloud generation.
arXiv Detail & Related papers (2025-12-23T07:03:31Z)
- A Self-Conditioned Representation Guided Diffusion Model for Realistic Text-to-LiDAR Scene Generation [41.43267776407459]
Text-to-LiDAR generation can customize 3D data with rich structures and diverse scenes for downstream tasks.
However, the scarcity of text-LiDAR pairs often causes insufficient training priors, generating overly smooth 3D scenes.
We propose a text-to-LiDAR diffusion model for scene generation, named T2LDM, with Self-Conditioned Representation Guidance (SCRG).
arXiv Detail & Related papers (2025-11-24T11:32:15Z)
- Learning to Generate 4D LiDAR Sequences [28.411253849111755]
We present LiDARCrafter, a unified framework that converts free-form language into editable LiDAR sequences.
LiDARCrafter achieves state-of-the-art fidelity, controllability, and temporal consistency, offering a foundation for LiDAR-based simulation and data augmentation.
arXiv Detail & Related papers (2025-09-15T14:14:48Z)
- LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences [10.426609103049572]
LiDARCrafter is a unified framework for 4D LiDAR generation and editing.
It achieves state-of-the-art performance in fidelity, controllability, and temporal consistency across all levels.
The code and benchmark are released to the community.
arXiv Detail & Related papers (2025-08-05T17:59:56Z)
- La La LiDAR: Large-Scale Layout Generation from LiDAR Data [45.5317990948996]
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics.
We propose the Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework.
La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks.
arXiv Detail & Related papers (2025-08-05T17:59:55Z)
- WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion [39.36578688743474]
3D scene perception demands a large amount of adverse-weather LiDAR data.
Yet, the cost of LiDAR data collection presents a significant scaling-up challenge.
This paper presents WeatherGen, the first unified diverse-weather LiDAR data diffusion generation framework.
arXiv Detail & Related papers (2025-04-18T09:01:07Z)
- GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting [3.376357029373187]
GS-LiDAR is a novel framework for generating realistic LiDAR point clouds with panoramic Gaussian splatting.
We introduce a novel panoramic rendering technique with explicit ray-splat intersection, guided by panoramic LiDAR supervision.
arXiv Detail & Related papers (2025-01-22T11:21:20Z)
- Fast LiDAR Data Generation with Rectified Flows [3.297182592932918]
We present R2Flow, a fast and high-fidelity generative model for LiDAR data.
Our method is based on rectified flows that learn straight trajectories.
We also propose an efficient Transformer-based model architecture for processing the image representation of LiDAR range and reflectance measurements.
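Rectified flows, the technique R2Flow builds on, learn a velocity field whose trajectories from noise to data are nearly straight, which is why very few integration steps suffice for fast generation. A minimal sampling loop, with `velocity_fn` standing in for a trained network (a hypothetical name, not R2Flow's actual API), might look like:

```python
# Illustrative rectified-flow sampling, not R2Flow's actual code.
# A rectified flow learns v(x, t) so that samples travel along (nearly)
# straight paths from noise x0 (t=0) to data x1 (t=1):
#   x_t = (1 - t) * x0 + t * x1,   dx/dt = v(x_t, t) ~ x1 - x0.
import numpy as np

def sample_rectified_flow(velocity_fn, x0, num_steps=4):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps.
    Straight trajectories make a handful of steps sufficient."""
    x, dt = x0, 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)  # one Euler step along the flow
    return x
```

With a perfectly straight field (velocity constant along the path), Euler integration is exact regardless of step count, which is the intuition behind few-step LiDAR generation.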
arXiv Detail & Related papers (2024-12-03T08:10:53Z)
- LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes.
The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large-scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
- How Control Information Influences Multilingual Text Image Generation and Editing? [28.999640376365335]
We investigate the role of control information in generating high-quality text.
We propose TextGen, a novel framework designed to enhance generation quality by optimizing control information.
Our method achieves state-of-the-art performance in both Chinese and English text generation.
arXiv Detail & Related papers (2024-07-16T08:40:21Z)
- Generative AI Empowered LiDAR Point Cloud Generation with Multimodal Transformer [10.728362890819392]
Integrated sensing and communications is a key enabler for 6G wireless communication systems.
This paper proposes a novel approach to enhancing wireless communication systems by synthesizing LiDAR point clouds from images and RADAR data.
arXiv Detail & Related papers (2024-05-20T04:15:08Z)
- Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving [58.16024314532443]
We introduce LaserMix++, a framework that integrates laser beam manipulations from disparate LiDAR scans and incorporates LiDAR-camera correspondences to assist data-efficient learning.
Results demonstrate that LaserMix++ outperforms fully supervised alternatives, achieving comparable accuracy with five times fewer annotations.
This substantial advancement underscores the potential of semi-supervised approaches in reducing the reliance on extensive labeled data in LiDAR-based 3D scene understanding systems.
arXiv Detail & Related papers (2024-05-08T17:59:53Z)
- LiFi: Lightweight Controlled Text Generation with Fine-Grained Control Codes [46.74968005604948]
We present LIFI, which offers a lightweight approach with fine-grained control for controlled text generation.
We evaluate LIFI on two conventional tasks -- sentiment control and topic control -- and one newly proposed task -- stylistic novel writing.
arXiv Detail & Related papers (2024-02-10T11:53:48Z)
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- Fine-grained Controllable Video Generation via Object Appearance and Context [74.23066823064575]
We propose fine-grained controllable video generation (FACTOR) to achieve detailed control.
FACTOR aims to control objects' appearances and context, including their location and category.
Our method achieves controllability of object appearances without finetuning, which reduces the per-subject optimization efforts for the users.
arXiv Detail & Related papers (2023-12-05T17:47:33Z)
- UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation [51.443788294845845]
We present UltraLiDAR, a data-driven framework for scene-level LiDAR completion, LiDAR generation, and LiDAR manipulation.
We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds.
By learning a prior over the discrete codebook, we can generate diverse, realistic LiDAR point clouds for self-driving.
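The compact representation described above rests on vector quantization: continuous features are snapped to their nearest entries in a learned codebook, and a prior is then fit over the resulting discrete indices. A minimal sketch of the quantization step, with assumed shapes and names (not UltraLiDAR's actual implementation):

```python
# Hypothetical sketch of codebook lookup for discrete LiDAR representations.
import numpy as np

def vector_quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.
    features: (N, D), codebook: (K, D)
    Returns (indices of shape (N,), quantized vectors of shape (N, D))."""
    # Squared distances via the expansion ||f - c||^2 = ||f||^2 - 2 f.c + ||c||^2
    d = (features ** 2).sum(axis=1, keepdims=True) \
        - 2.0 * features @ codebook.T \
        + (codebook ** 2).sum(axis=1)
    idx = d.argmin(axis=1)                 # token index per feature
    return idx, codebook[idx]              # discrete codes and their embeddings
```

Generation then amounts to sampling index sequences from a learned prior and decoding the corresponding codebook embeddings back into a point cloud.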
arXiv Detail & Related papers (2023-11-02T17:57:03Z)
- NeRF-LiDAR: Generating Realistic LiDAR Point Clouds with Neural Radiance Fields [20.887421720818892]
We present NeRF-LiDAR, a novel LiDAR simulation method that leverages real-world information to generate realistic LiDAR point clouds.
We verify the effectiveness of our NeRF-LiDAR by training different 3D segmentation models on the generated LiDAR point clouds.
arXiv Detail & Related papers (2023-04-28T12:41:28Z)
- LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields [112.62936571539232]
We introduce a new task, novel view synthesis for LiDAR sensors.
Traditional model-based LiDAR simulators with style-transfer neural networks can be applied to render novel views.
We use a neural radiance field (NeRF) to facilitate the joint learning of geometry and the attributes of 3D points.
arXiv Detail & Related papers (2023-04-20T15:44:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed above and is not responsible for any consequences of its use.