LOGen: Toward Lidar Object Generation by Point Diffusion
- URL: http://arxiv.org/abs/2412.07385v2
- Date: Mon, 10 Mar 2025 13:15:45 GMT
- Title: LOGen: Toward Lidar Object Generation by Point Diffusion
- Authors: Ellington Kirby, Mickael Chen, Renaud Marlet, Nermin Samet
- Abstract summary: We introduce a novel task: LiDAR object generation, requiring models to produce 3D objects as viewed by a LiDAR scan. We introduce a novel diffusion-based model to produce LiDAR point clouds of dataset objects, including intensity, and with extensive control over the generation via conditioning information. Our experiments on nuScenes show the quality of our generations, measured with new 3D metrics developed to suit LiDAR objects.
- Score: 10.002129602976085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generation of LiDAR scans is a growing area of research with diverse applications to autonomous driving. However, scan generation remains challenging, especially when compared to the rapid advancement of 2D and 3D object generation. We introduce a novel task: LiDAR object generation, requiring models to produce 3D objects as viewed by a LiDAR scan. This task focuses LiDAR scan generation on the most interesting aspect of scenes, the objects, while also benefiting from advancements in 3D object generative methods. We introduce a novel diffusion-based model to produce LiDAR point clouds of dataset objects, including intensity, and with extensive control over the generation via conditioning information. Our experiments on nuScenes show the quality of our generations, measured with new 3D metrics developed to suit LiDAR objects.
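As a rough illustration of the kind of conditional denoising loop such a diffusion model runs at sampling time, here is a minimal DDPM-style sketch over LiDAR object points carrying (x, y, z, intensity). This is a generic sketch, not the authors' implementation; the `denoiser` network and the `cond` conditioning vector are hypothetical stand-ins.

```python
import numpy as np

def reverse_diffusion(denoiser, cond, num_points=512, steps=50, seed=0):
    """Minimal DDPM-style sampler for a LiDAR object: each point carries
    (x, y, z, intensity). `denoiser(x_t, t, cond)` predicts the noise; `cond`
    stands in for conditioning information (e.g. class or viewing angle)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)      # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal((num_points, 4))    # start from pure noise
    for t in reversed(range(steps)):
        eps = denoiser(x, t, cond)              # predicted noise at step t
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:                               # no noise at the final step
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

# Toy denoiser that predicts zeros, standing in for a trained network.
points = reverse_diffusion(lambda x, t, c: np.zeros_like(x), cond=None)
print(points.shape)  # (512, 4)
```

With a trained network in place of the lambda, the loop would progressively denoise the random cloud into an object-shaped point set with per-point intensity.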
Related papers
- La La LiDAR: Large-Scale Layout Generation from LiDAR Data [45.5317990948996]
Controllable generation of realistic LiDAR scenes is crucial for applications such as autonomous driving and robotics. We propose the Large-scale Layout-guided LiDAR generation model ("La La LiDAR"), a novel layout-guided generative framework. La La LiDAR achieves state-of-the-art performance in both LiDAR generation and downstream perception tasks.
arXiv Detail & Related papers (2025-08-05T17:59:55Z) - DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation [49.32104127246474]
DriveGEN is a training-free, controllable text-to-image diffusion generation method.
It consistently preserves objects with precise 3D geometry across diverse Out-of-Distribution generations.
arXiv Detail & Related papers (2025-03-14T06:35:38Z) - OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving [74.06413946934002]
We introduce OLiDM, a novel framework capable of generating high-fidelity LiDAR data at both the object and the scene levels.
OLiDM consists of two pivotal components: the Object-Scene Progressive Generation (OPG) module and the Object Semantic Alignment (OSA) module.
OPG adapts to user-specific prompts to generate desired foreground objects, which are subsequently employed as conditions in scene generation.
OSA aims to rectify the misalignment between foreground objects and background scenes, enhancing the overall quality of the generated objects.
arXiv Detail & Related papers (2024-12-23T02:43:29Z) - Simultaneous Diffusion Sampling for Conditional LiDAR Generation [24.429704313319398]
This paper proposes a novel simultaneous diffusion sampling methodology to generate point clouds conditioned on the 3D structure of the scene.
Our method can produce accurate and geometrically consistent enhancements to point cloud scans, allowing it to outperform existing methods by a large margin in a variety of benchmarks.
arXiv Detail & Related papers (2024-10-15T14:15:04Z) - VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection [80.62052650370416]
Monocular 3D object detection holds significant importance across various applications, including autonomous driving and robotics.
In this paper, we present VFMM3D, an innovative framework that leverages the capabilities of Vision Foundation Models (VFMs) to accurately transform single-view images into LiDAR point cloud representations.
arXiv Detail & Related papers (2024-04-15T03:12:12Z) - Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange [50.45953583802282]
We introduce a novel self-supervised learning (SSL) strategy for point cloud scene understanding.
Our approach leverages both object patterns and contextual cues to produce robust features.
Our experiments demonstrate the superiority of our method over existing SSL techniques.
arXiv Detail & Related papers (2024-04-11T06:39:53Z) - Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem [12.26293873825084]
We propose to leverage pseudo-LiDAR point clouds generated from videos capturing a surround view of miniatures or real-world objects of minor classes.
Our method, called Pseudo Ground Truth Augmentation (PGT-Aug), consists of three main steps: (i) volumetric 3D instance reconstruction using a 2D-to-3D view synthesis model, (ii) object-level domain alignment with LiDAR intensity estimation, and (iii) a hybrid context-aware placement method from ground and map information.
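The three PGT-Aug steps read like a data-augmentation pipeline; the following is a schematic sketch of that flow, assuming a simple ground-plane placement. Every helper and the random "reconstruction" here are hypothetical stand-ins, not the authors' code.

```python
import numpy as np

def pgt_aug(video_frames, lidar_scan, ground_plane_z=0.0, rng=None):
    """Schematic of the three PGT-Aug steps; all internals are stand-ins."""
    rng = rng or np.random.default_rng(0)

    # (i) Volumetric 3D instance reconstruction from surround-view frames.
    # Stand-in: a random blob of points instead of a 2D-to-3D synthesis model.
    instance = rng.standard_normal((256, 3)) * 0.5

    # (ii) Object-level domain alignment with LiDAR intensity estimation.
    intensity = np.clip(rng.normal(0.3, 0.1, (256, 1)), 0.0, 1.0)
    instance = np.hstack([instance, intensity])

    # (iii) Context-aware placement: rest the object on the ground plane.
    instance[:, 2] += ground_plane_z - instance[:, 2].min()
    return np.vstack([lidar_scan, instance])

scan = np.zeros((1000, 4))           # toy (x, y, z, intensity) scan
augmented = pgt_aug(None, scan)
print(augmented.shape)  # (1256, 4)
```

In the paper, step (iii) additionally uses map information to choose plausible placements; the sketch keeps only the ground-plane alignment.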
arXiv Detail & Related papers (2024-03-18T08:50:04Z) - Advances in 3D Generation: A Survey [54.95024616672868]
The field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models.
Specifically, we introduce the 3D representations that serve as the backbone for 3D generation.
We provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms.
arXiv Detail & Related papers (2024-01-31T13:06:48Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
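PCA-based localization of this kind can be sketched on a per-pixel feature map: project each pixel's feature onto the first principal component and threshold. This is a generic sketch of the idea, not the paper's exact procedure; the feature map and the zero threshold are assumptions.

```python
import numpy as np

def pca_localize(features):
    """Localize the salient region in an (H, W, C) feature map by projecting
    each pixel's feature onto the first principal component and thresholding
    at zero. Generic PCA-localization sketch, not the paper's exact method."""
    h, w, c = features.shape
    flat = features.reshape(-1, c)
    flat = flat - flat.mean(axis=0)          # center the features
    # First principal component via SVD of the centered feature matrix.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    proj = flat @ vt[0]                      # projection onto PC1
    mask = (proj > 0).reshape(h, w)          # two-way split of the pixels
    if mask.mean() > 0.5:                    # treat the smaller region as
        mask = ~mask                         # the foreground object
    return mask

rng = np.random.default_rng(1)
feats = rng.standard_normal((8, 8, 16))
feats[2:5, 2:5] += 3.0                       # inject a salient "object"
mask = pca_localize(feats)
print(mask.shape, mask.sum())
```

The sign of a principal component is arbitrary, which is why the smaller of the two regions is taken as the object.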
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - 3D Object Detection in LiDAR Point Clouds using Graph Neural Networks [1.8369974607582582]
This research proposes Graph Neural Network (GNN) based framework to learn and identify the objects in the 3D LiDAR point clouds.
GNNs are a class of deep learning models that learn patterns and objects based on the principle of graph learning.
arXiv Detail & Related papers (2023-01-29T19:23:01Z) - Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection [85.170578641966]
We propose an object-level point augmentor (OPA) that performs local transformations for semi-supervised 3D object detection.
In this way, the resultant augmentor is derived to emphasize object instances rather than irrelevant backgrounds.
Experiments on the ScanNet and SUN RGB-D datasets show that the proposed OPA performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2022-12-19T06:56:14Z) - SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition [35.4163266882568]
We introduce Self-Supervised Learning Over Sets (SOS) to pre-train a generic Objects In Contact (OIC) representation model.
Our OIC significantly boosts the performance of multiple state-of-the-art video classification models.
arXiv Detail & Related papers (2022-04-10T23:27:19Z) - LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection [96.63947479020631]
In many real-world applications, the LiDAR points used by mass-produced robots and vehicles usually have fewer beams than those in large-scale public datasets.
We propose the LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.
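The beam-induced gap the paper targets can be reproduced by reducing a high-beam scan to fewer beams. Below is a generic beam-downsampling sketch (binning points by elevation angle and keeping every other beam); it illustrates the domain shift itself, not the paper's distillation procedure, and the bin edges are assumptions.

```python
import numpy as np

def downsample_beams(points, src_beams=64, dst_beams=32):
    """Reduce an (N, 4) LiDAR scan from src_beams to dst_beams by binning
    points into beams by elevation angle and keeping every stride-th beam.
    Generic sketch of beam reduction, not the paper's method."""
    xyz = points[:, :3]
    elev = np.arctan2(xyz[:, 2], np.linalg.norm(xyz[:, :2], axis=1))
    edges = np.linspace(elev.min(), elev.max() + 1e-6, src_beams + 1)
    beam_id = np.digitize(elev, edges) - 1     # beam index per point
    stride = src_beams // dst_beams
    keep = (beam_id % stride) == 0             # keep every stride-th beam
    return points[keep]

rng = np.random.default_rng(0)
scan64 = rng.standard_normal((1000, 4))        # toy 64-beam-style scan
scan32 = downsample_beams(scan64)
print(scan64.shape, scan32.shape)
```

Real sensors space beams by fixed elevation angles rather than uniform bins over the observed range, so a faithful reduction would use the sensor's calibration table instead.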
arXiv Detail & Related papers (2022-03-28T17:59:02Z) - Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects [70.49392581592089]
We tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images.
We follow a retrieval-based strategy and prevent the network from learning object-specific features.
Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works.
arXiv Detail & Related papers (2022-03-16T08:53:00Z) - Diversity in deep generative models and generative AI [0.0]
We introduce a kernel-based measure quantization method that can produce new objects from a given target measure by approximating it as a whole.
This ensures a better diversity of the produced objects.
The method is tested on classic machine learning benchmarks.
arXiv Detail & Related papers (2022-02-19T10:52:52Z) - Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
To validate the effectiveness of the proposed method, extensive experiments on PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to the new state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z) - Cirrus: A Long-range Bi-pattern LiDAR Dataset [35.87501129332217]
We introduce Cirrus, a new long-range bi-pattern LiDAR public dataset for autonomous driving tasks.
Our platform is equipped with a high-resolution video camera and a pair of LiDAR sensors with a 250-meter effective range.
In Cirrus, eight categories of objects are exhaustively annotated in the LiDAR point clouds for the entire effective range.
arXiv Detail & Related papers (2020-12-05T03:18:31Z) - Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to improved layout fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited for multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
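The quantity underlying both FID and SceneFID is the Fréchet distance between two Gaussians fit to feature sets; SceneFID's change is to compute the features on object crops rather than whole images (that cropping step is only described in words here). A sketch of the distance itself, with toy features in place of Inception activations:

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fit to two feature sets:
    |mu_a - mu_b|^2 + Tr(C_a + C_b - 2 (C_a^1/2 C_b C_a^1/2)^1/2)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Symmetric PSD square root of cov_a via eigendecomposition.
    w, v = np.linalg.eigh(cov_a)
    sqrt_a = (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T
    # Tr((cov_a^1/2 cov_b cov_a^1/2)^1/2) from the inner matrix's eigenvalues.
    inner_w = np.linalg.eigvalsh(sqrt_a @ cov_b @ sqrt_a)
    tr_sqrt = np.sqrt(np.clip(inner_w, 0.0, None)).sum()
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.standard_normal((500, 8))
fake = rng.standard_normal((500, 8)) + 0.5   # shifted "generated" features
print(frechet_distance(real, real))          # ~0 for identical sets
print(frechet_distance(real, fake) > 0.0)
```

FID applies this with 2048-dimensional Inception-v3 pool features; the 8-dimensional toy features above only exercise the arithmetic.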
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.