Related papers: Adaptive LiDAR Scanning: Harnessing Temporal Cues for Efficient 3D Object Detection via Multi-Modal Fusion

URL: http://arxiv.org/abs/2508.01562v1
Date: Sun, 03 Aug 2025 03:20:36 GMT
Title: Adaptive LiDAR Scanning: Harnessing Temporal Cues for Efficient 3D Object Detection via Multi-Modal Fusion
Authors: Sara Shoouri, Morteza Tavakoli Taba, Hun-Seok Kim
Abstract summary: Conventional LiDAR sensors perform dense, stateless scans, ignoring the strong temporal continuity in real-world scenes. We propose a predictive, history-aware adaptive scanning framework that anticipates informative regions of interest based on past observations. Our method significantly reduces unnecessary data acquisition by concentrating dense LiDAR scanning only within these ROIs and sparsely sampling elsewhere.
Score: 11.351728925952193
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-sensor fusion using LiDAR and RGB cameras significantly enhances 3D object detection. However, conventional LiDAR sensors perform dense, stateless scans, ignoring the strong temporal continuity in real-world scenes. This leads to substantial sensing redundancy and excessive power consumption, limiting their practicality on resource-constrained platforms. To address this inefficiency, we propose a predictive, history-aware adaptive scanning framework that anticipates informative regions of interest (ROIs) based on past observations. Our approach introduces a lightweight predictor network that distills historical spatial and temporal contexts into refined query embeddings. These embeddings guide a differentiable Mask Generator network, which leverages Gumbel-Softmax sampling to produce binary masks identifying critical ROIs for the upcoming frame. Our method significantly reduces unnecessary data acquisition by concentrating dense LiDAR scanning only within these ROIs and sparsely sampling elsewhere. Experiments on the nuScenes and Lyft benchmarks demonstrate that our adaptive scanning strategy reduces LiDAR energy consumption by over 65% while maintaining competitive or even superior 3D object detection performance compared to traditional LiDAR-camera fusion methods with dense LiDAR scanning.
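
The Gumbel-Softmax masking step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the two-class [skip, scan] logit layout, the temperature `tau`, and the toy cost model with its `sparse_ratio` parameter are all assumptions made for illustration. It shows only the forward pass; in a trainable model the hard mask would typically be paired with the soft probabilities through a straight-through estimator.

```python
import numpy as np

def gumbel_softmax_mask(logits, tau=1.0, rng=None):
    """Sample a binary ROI mask from per-region [skip, scan] logits.

    logits: array of shape (..., 2); index 1 is the hypothetical "scan densely" class.
    Returns (soft, hard): relaxed probabilities and the discretized 0/1 mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via the inverse-CDF trick; u stays strictly inside (0, 1).
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    # Temperature-scaled softmax over the last axis (shifted for numerical stability).
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)
    soft = np.exp(y) / np.exp(y).sum(axis=-1, keepdims=True)
    # Hard mask: 1 where the "scan" class wins, 0 elsewhere.
    hard = (soft.argmax(axis=-1) == 1).astype(np.float32)
    return soft, hard

def relative_scan_cost(mask, sparse_ratio=0.1):
    """Toy cost model (an assumption, not from the paper): regions inside the
    ROI mask cost 1 unit, regions outside cost `sparse_ratio` units."""
    roi_fraction = float(mask.mean())
    return roi_fraction + (1.0 - roi_fraction) * sparse_ratio

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(16, 32, 2))  # hypothetical 16 x 32 grid of scan regions
    soft, hard = gumbel_softmax_mask(logits, tau=0.5, rng=rng)
    print("ROI fraction:", hard.mean())
    print("Relative scan cost:", relative_scan_cost(hard))
```

Lowering `tau` pushes the soft probabilities toward one-hot vectors, which brings the relaxed output closer to the binary mask at the price of noisier gradients.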

Related papers

Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB [12.38882701862349]
3D surface reconstruction is essential across applications of virtual reality, robotics, and mobile scanning. RGB-based reconstruction often fails in low-texture, low-light, and low-albedo scenes. We propose using an alternative class of "blurred" LiDAR that emits a diffuse flash.
arXiv Detail & Related papers (2024-11-29T05:01:23Z)
Gait Sequence Upsampling using Diffusion Models for Single LiDAR Sensors [1.0485739694839664]
LidarGSU is designed to improve the generalization capability of existing identification models. In this work, we leverage DPMs on sparse sequential pedestrian point clouds as conditional masks in a video-to-video translation approach. We conduct extensive experiments on the SUSTech1K dataset to evaluate the generative quality and recognition performance of the proposed method.
arXiv Detail & Related papers (2024-10-11T10:11:21Z)
LiDAR-GS: Real-time LiDAR Re-Simulation using Gaussian Splatting [50.808933338389686]
We present LiDAR-GS, a real-time, high-fidelity re-simulation of LiDAR scans in public urban road scenes. The method achieves state-of-the-art results in both rendering frame rate and quality on publicly available large scene datasets.
arXiv Detail & Related papers (2024-10-07T15:07:56Z)
VaLID: Verification as Late Integration of Detections for LiDAR-Camera Fusion [2.503388496100123]
Vehicle object detection benefits from both LiDAR and camera data. We propose a model-adaptive late-fusion method, VaLID, which validates whether each predicted bounding box is acceptable. Our approach is model-adaptive and demonstrates state-of-the-art competitive performance even when using generic camera detectors.
arXiv Detail & Related papers (2024-09-23T20:27:10Z)
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving [58.16024314532443]
We introduce LaserMix++, a framework that integrates laser beam manipulations from disparate LiDAR scans and incorporates LiDAR-camera correspondences to assist data-efficient learning. Results demonstrate that LaserMix++ outperforms fully supervised alternatives, achieving comparable accuracy with five times fewer annotations. This substantial advancement underscores the potential of semi-supervised approaches in reducing the reliance on extensive labeled data in LiDAR-based 3D scene understanding systems.
arXiv Detail & Related papers (2024-05-08T17:59:53Z)
Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images. In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data. We show consistent and significant performance gain across multiple state-of-the-art models and datasets with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
TimePillars: Temporally-Recurrent 3D LiDAR Object Detection [8.955064958311517]
TimePillars is a temporally-recurrent object detection pipeline. It exploits the pillar representation of LiDAR data across time. We show how basic building blocks are enough to achieve robust and efficient results.
arXiv Detail & Related papers (2023-12-22T10:25:27Z)
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector. The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference. Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection [96.63947479020631]
In many real-world applications, the LiDAR sensors used by mass-produced robots and vehicles usually have fewer beams than those in large-scale public datasets. We propose LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.
arXiv Detail & Related papers (2022-03-28T17:59:02Z)
LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and cannot yet meet the needs of long-range applications. We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation. Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.