RaLD: Generating High-Resolution 3D Radar Point Clouds with Latent Diffusion
- URL: http://arxiv.org/abs/2511.07067v1
- Date: Mon, 10 Nov 2025 13:03:58 GMT
- Title: RaLD: Generating High-Resolution 3D Radar Point Clouds with Latent Diffusion
- Authors: Ruijie Zhang, Bixin Zeng, Shengpeng Wang, Fuhui Zhou, Wei Wang,
- Abstract summary: Millimeter-wave radar offers a promising sensing modality for autonomous systems thanks to its robustness in adverse conditions and low cost. Its utility is significantly limited by the sparsity and low resolution of radar point clouds, which poses challenges for tasks requiring dense and accurate 3D perception. We introduce RaLD, a framework that bridges this gap by integrating scene-level frustum-based LiDAR autoencoding, order-invariant latent representations, and direct radar spectrum conditioning.
- Score: 17.986411213116828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Millimeter-wave radar offers a promising sensing modality for autonomous systems thanks to its robustness in adverse conditions and low cost. However, its utility is significantly limited by the sparsity and low resolution of radar point clouds, which poses challenges for tasks requiring dense and accurate 3D perception. Although recent efforts have shown great potential by exploring generative approaches to address this issue, they often rely on dense voxel representations that are inefficient and struggle to preserve structural detail. To fill this gap, we make the key observation that latent diffusion models (LDMs), though successful in other modalities, have not been effectively leveraged for radar-based 3D generation due to a lack of compatible representations and conditioning strategies. We introduce RaLD, a framework that bridges this gap by integrating scene-level frustum-based LiDAR autoencoding, order-invariant latent representations, and direct radar spectrum conditioning. These insights lead to a more compact and expressive generation process. Experiments show that RaLD produces dense and accurate 3D point clouds from raw radar spectra, offering a promising solution for robust perception in challenging environments.
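For concreteness, the sketch below shows the overall shape of such a pipeline: a spectrum encoder turns the raw radar measurement into a conditioning vector, and a latent-space denoiser is trained with the standard DDPM epsilon-prediction objective. All module names, layer choices, and the timestep embedding are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a RaLD-style pipeline: a LiDAR autoencoder yields a
# compact, order-invariant latent z, and a diffusion model denoises z
# conditioned on an embedding of the raw radar spectrum. All internals here
# are assumptions for exposition, not the paper's code.
import torch
import torch.nn as nn

class RadarConditionedLDM(nn.Module):
    def __init__(self, latent_dim=256, spec_channels=64):
        super().__init__()
        # Encodes a radar spectrum (e.g., range-Doppler-azimuth) into a vector.
        self.spec_encoder = nn.Sequential(
            nn.Conv2d(spec_channels, 128, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Predicts the noise added to the latent, given timestep + condition.
        self.denoiser = nn.Sequential(
            nn.Linear(latent_dim + 256 + 1, 512), nn.SiLU(),
            nn.Linear(512, 512), nn.SiLU(),
            nn.Linear(512, latent_dim),
        )

    def forward(self, z_noisy, t, spectrum):
        cond = self.spec_encoder(spectrum)              # (B, 256)
        t_feat = t.float().unsqueeze(-1) / 1000.0       # crude timestep embedding
        return self.denoiser(torch.cat([z_noisy, cond, t_feat], dim=-1))

# Standard DDPM training step in latent space (epsilon prediction).
def diffusion_loss(model, z0, spectrum, alphas_cumprod):
    t = torch.randint(0, len(alphas_cumprod), (z0.shape[0],))
    a = alphas_cumprod[t].unsqueeze(-1)
    eps = torch.randn_like(z0)
    z_t = a.sqrt() * z0 + (1 - a).sqrt() * eps
    return nn.functional.mse_loss(model(z_t, t, spectrum), eps)
```

At inference time, a latent would be sampled by iteratively denoising Gaussian noise under the same radar conditioning and decoded into a point cloud by the frozen LiDAR autoencoder's decoder.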
Related papers
- A Lightweight Model-Driven 4D Radar Framework for Pervasive Human Detection in Harsh Conditions [11.183137089778468]
This paper presents a fully model-driven 4D radar perception framework designed for real-time execution on embedded edge hardware. The framework is evaluated in a dust-filled enclosed trailer and in real underground mining tunnels; in the tested scenarios, the radar-based detector maintains stable pedestrian identification while camera and LiDAR modalities fail under severe visibility degradation.
arXiv Detail & Related papers (2026-01-19T20:18:51Z)
- R$^3$D: Regional-guided Residual Radar Diffusion [8.828730946638103]
We propose R$^3$D, a regional-guided residual radar diffusion framework. Experiments on the ColoRadar dataset demonstrate that R$^3$D outperforms state-of-the-art methods.
arXiv Detail & Related papers (2026-01-10T07:23:44Z)
- RadarGen: Automotive Radar Point Cloud Generation from Cameras [64.69976771710057]
We present RadarGen, a diffusion model for synthesizing realistic automotive radar point clouds from multi-view camera imagery. RadarGen adapts efficient image-latent diffusion to the radar domain by representing radar measurements in bird's-eye-view form. We show that RadarGen captures characteristic radar measurement distributions and reduces the gap to perception models trained on real data.
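Representing sparse radar returns in bird's-eye-view form is what lets image-latent diffusion machinery apply unchanged. A minimal rasterization sketch follows; the grid extents, resolution, and channel layout (a point-count channel plus max-pooled attributes such as RCS and Doppler) are chosen purely for illustration.

```python
# Minimal sketch of BEV rasterization: scatter radar points (x, y, plus
# per-point attributes) into a fixed-resolution grid so image-based
# diffusion models can operate on it. Extents/channels are assumptions.
import numpy as np

def radar_to_bev(points, attrs, x_range=(-50, 50), y_range=(0, 100), res=0.5):
    """points: (N, 2) x/y in meters; attrs: (N, C) per-point features."""
    W = int((x_range[1] - x_range[0]) / res)
    H = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((attrs.shape[1] + 1, H, W), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / res).astype(int)
    iy = ((points[:, 1] - y_range[0]) / res).astype(int)
    ok = (ix >= 0) & (ix < W) & (iy >= 0) & (iy < H)
    for x, y, a in zip(ix[ok], iy[ok], attrs[ok]):
        bev[0, y, x] += 1.0                           # occupancy / count channel
        bev[1:, y, x] = np.maximum(bev[1:, y, x], a)  # max-pool attributes
    return bev
```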
arXiv Detail & Related papers (2025-12-19T18:57:33Z)
- REOcc: Camera-Radar Fusion with Radar Feature Enrichment for 3D Occupancy Prediction [18.57887643050248]
REOcc is a camera-radar fusion network designed to enrich radar feature representations for 3D occupancy prediction. Our approach introduces two main components, a Radar Densifier and a Radar Amplifier, which refine radar features by integrating spatial and contextual information.
arXiv Detail & Related papers (2025-11-10T03:23:52Z)
- D$^2$GS: Depth-and-Density Guided Gaussian Splatting for Stable and Accurate Sparse-View Reconstruction [73.61056394880733]
3D Gaussian Splatting (3DGS) enables real-time, high-fidelity novel view synthesis (NVS) with explicit 3D representations. We identify two key failure modes under sparse-view conditions: overfitting in regions with excessive Gaussian density near the camera, and underfitting in distant areas with insufficient Gaussian coverage. We propose a unified framework, D$^2$GS, comprising two key components: a Depth-and-Density Guided Dropout strategy and a Distance-Aware Fidelity Enhancement module.
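A minimal sketch of what a depth-and-density guided dropout could look like follows: local density is proxied by inverse k-NN distance and combined with camera depth so that near, over-dense Gaussians are dropped preferentially. The density proxy, the cue weighting, and `drop_scale` are assumptions for exposition, not the paper's exact rule.

```python
# Illustrative dropout in the spirit of D$^2$GS: Gaussians that are both
# close to the camera and in locally over-dense regions are dropped with
# higher probability. Density proxy and weights are assumptions.
import torch

def density_guided_dropout(means, cam_pos, k=8, drop_scale=0.5):
    """means: (N, 3) Gaussian centers; cam_pos: (3,) camera position."""
    d2 = torch.cdist(means, means)                      # (N, N) pairwise distances
    knn = d2.topk(k + 1, largest=False).values[:, 1:]   # skip self-distance
    density = 1.0 / (knn.mean(dim=1) + 1e-6)            # high = crowded region
    depth = (means - cam_pos).norm(dim=1)
    # Normalize both cues to [0, 1] and combine: near + dense => likely drop.
    score = (density / density.max()) * (1.0 - depth / depth.max())
    keep = torch.rand(means.shape[0]) > drop_scale * score
    return keep  # boolean mask over Gaussians
```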
arXiv Detail & Related papers (2025-10-09T17:59:49Z)
- Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection [53.2590751089607]
Real-IAD D3 is a high-precision multimodal dataset that incorporates an additional pseudo-3D modality generated through photometric stereo. We introduce an effective approach that integrates RGB, point cloud, and pseudo-3D depth information to leverage the complementary strengths of each modality. Our experiments highlight the importance of these modalities in boosting detection robustness and overall IAD performance.
arXiv Detail & Related papers (2025-04-19T08:05:47Z)
- RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection [68.99784784185019]
Poor lighting or adverse weather conditions degrade camera performance, while radar suffers from noise and positional ambiguity. We propose RobuRCDet, a robust object detection model in BEV.
arXiv Detail & Related papers (2025-02-18T17:17:38Z)
- UniBEVFusion: Unified Radar-Vision BEVFusion for 3D Object Detection [2.123197540438989]
Many radar-vision fusion models treat radar as a sparse LiDAR, underutilizing radar-specific information.
We propose the Radar Depth Lift-Splat-Shoot (RDL) module, which integrates radar-specific data into the depth prediction process.
We also introduce a Unified Feature Fusion (UFF) approach that extracts BEV features across different modalities.
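As a rough illustration of injecting radar into the depth step of a Lift-Splat-style pipeline, the sketch below biases the per-pixel depth distribution with a radar-derived prior before the standard "lift" outer product. Fusing by simple logit addition is an assumption for illustration, not necessarily the RDL design.

```python
# Hedged sketch of a Lift-Splat "lift" where a radar-derived depth prior
# sharpens the camera depth distribution, loosely following the RDL idea.
import torch
import torch.nn.functional as F

def lift_with_radar(img_feat, depth_logits, radar_depth_prior):
    """
    img_feat:           (B, C, H, W) image features
    depth_logits:       (B, D, H, W) camera depth distribution logits
    radar_depth_prior:  (B, D, H, W) radar evidence along depth bins
    returns:            (B, C, D, H, W) frustum features to splat to BEV
    """
    fused = depth_logits + radar_depth_prior       # radar boosts likely bins
    depth_prob = F.softmax(fused, dim=1)           # (B, D, H, W)
    # Outer product: every depth bin receives the image feature, weighted
    # by its probability (the "lift" step of Lift-Splat-Shoot).
    return img_feat.unsqueeze(2) * depth_prob.unsqueeze(1)
```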
arXiv Detail & Related papers (2024-09-23T06:57:27Z)
- RangeLDM: Fast Realistic LiDAR Point Cloud Generation [12.868053836790194]
We introduce RangeLDM, a novel approach for rapidly generating high-quality range-view LiDAR point clouds.
We achieve this by correcting the range-view data distribution for accurate projection from point clouds to range images via Hough voting.
We instruct the model to preserve 3D structural fidelity by devising a range-guided discriminator.
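For reference, the plain spherical projection that maps a LiDAR point cloud to a range image is sketched below. The paper's Hough-voting step corrects the projection parameters themselves; here, fixed vertical field-of-view bounds are assumed for simplicity.

```python
# Standard spherical projection from a point cloud to a range image,
# the representation RangeLDM generates in. FOV bounds are assumptions.
import numpy as np

def to_range_image(points, H=32, W=1024, fov_up=10.0, fov_down=-30.0):
    """points: (N, 3) LiDAR xyz; returns (H, W) range image in meters."""
    r = np.linalg.norm(points, axis=1) + 1e-8
    yaw = np.arctan2(points[:, 1], points[:, 0])   # azimuth angle
    pitch = np.arcsin(points[:, 2] / r)            # elevation angle
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = ((yaw + np.pi) / (2 * np.pi) * W).astype(int) % W
    v = ((fov_up_r - pitch) / (fov_up_r - fov_down_r) * H).astype(int)
    img = np.zeros((H, W), dtype=np.float32)
    ok = (v >= 0) & (v < H)
    img[v[ok], u[ok]] = r[ok]   # later points overwrite earlier ones
    return img
```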
arXiv Detail & Related papers (2024-03-15T08:19:57Z)
- Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection [78.59426158981108]
We introduce a bi-directional LiDAR-Radar fusion framework, termed Bi-LRFusion, to tackle the challenges and improve 3D detection for dynamic objects.
We conduct extensive experiments on nuScenes and ORR datasets, and show that our Bi-LRFusion achieves state-of-the-art performance for detecting dynamic objects.
arXiv Detail & Related papers (2023-06-02T10:57:41Z)
- On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z)
- Pointillism: Accurate 3D bounding box estimation with multi-radars [6.59119432945925]
We introduce Pointillism, a system that combines data from multiple spatially separated radars, placed at an optimal separation, to mitigate the sparsity and noise inherent in single-radar point clouds.
We present the design of RP-net, a novel deep learning architecture, designed explicitly for radar's sparse data distribution, to enable accurate 3D bounding box estimation.
arXiv Detail & Related papers (2022-03-08T23:09:58Z)
- Reconfigurable Voxels: A New Representation for LiDAR-Based Point Clouds [76.52448276587707]
We propose Reconfigurable Voxels, a new approach to constructing representations from 3D point clouds.
Specifically, we devise a biased random walk scheme, which adaptively covers each neighborhood with a fixed number of voxels.
We find that this approach effectively improves the stability of voxel features, especially for sparse regions.
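An illustrative sketch of the biased-random-walk idea: starting from a seed voxel, repeatedly step to a random 6-neighbor, with occupied voxels weighted more heavily, until a fixed number of voxels is collected. The bias weight and stopping rule are assumptions, not the paper's exact scheme.

```python
# Biased random walk over a voxel grid in the spirit of Reconfigurable
# Voxels: each seed adaptively gathers a fixed-size neighborhood that
# prefers occupied voxels, helping cover sparse regions.
import random

NEIGHBORS = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]

def biased_walk(seed, occupied, n_voxels=8, occ_bias=4.0):
    """seed: (i, j, k) voxel index; occupied: set of occupied voxel indices."""
    picked, cur = [seed], seed
    while len(picked) < n_voxels:
        steps = [tuple(c + d for c, d in zip(cur, off)) for off in NEIGHBORS]
        # Occupied neighbors are occ_bias times more likely to be chosen.
        weights = [occ_bias if s in occupied else 1.0 for s in steps]
        cur = random.choices(steps, weights=weights, k=1)[0]
        if cur not in picked:
            picked.append(cur)
    return picked
```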
arXiv Detail & Related papers (2020-04-06T15:07:16Z)