Related papers: Enhanced 3D Object Detection via Diverse Feature Representations of 4D Radar Tensor

Enhanced 3D Object Detection via Diverse Feature Representations of 4D Radar Tensor

URL: http://arxiv.org/abs/2502.06114v3
Date: Fri, 23 May 2025 06:17:06 GMT
Title: Enhanced 3D Object Detection via Diverse Feature Representations of 4D Radar Tensor
Authors: Seung-Hyun Song, Dong-Hee Paek, Minh-Quan Dao, Ezio Malis, Seung-Hyun Kong,
Abstract summary: Raw 4D Radar (4DRT) offers richer spatial and Doppler information than conventional point clouds.<n>We propose a novel 3D object detection framework that maximizes the utility of 4DRT while preserving efficiency.<n>We show that our framework achieves improvements of 7.3% in AP_3D and 9.5% in AP_BEV over the baseline RTNH model when using extremely sparse inputs.
Score: 5.038148262901536
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in automotive four-dimensional (4D) Radar have enabled access to raw 4D Radar Tensor (4DRT), offering richer spatial and Doppler information than conventional point clouds. While most existing methods rely on heavily pre-processed, sparse Radar data, recent attempts to leverage raw 4DRT face high computational costs and limited scalability. To address these limitations, we propose a novel three-dimensional (3D) object detection framework that maximizes the utility of 4DRT while preserving efficiency. Our method introduces a multi-teacher knowledge distillation (KD), where multiple teacher models are trained on point clouds derived from diverse 4DRT pre-processing techniques, each capturing complementary signal characteristics. These teacher representations are fused via a dedicated aggregation module and distilled into a lightweight student model that operates solely on a sparse Radar input. Experimental results on the K-Radar dataset demonstrate that our framework achieves improvements of 7.3% in AP_3D and 9.5% in AP_BEV over the baseline RTNH model when using extremely sparse inputs. Furthermore, it attains comparable performance to denser-input baselines while significantly reducing the input data size by about 90 times, confirming the scalability and efficiency of our approach.

Related papers

4D-ROLLS: 4D Radar Occupancy Learning via LiDAR Supervision [7.099012213719072]
We propose 4D-ROLLS, the first weakly supervised occupancy estimation method for 4D radar using the LiDAR point cloud as the supervisory signal.<n>The model is seamlessly transferred to downstream tasks BEV segmentation and point cloud occupancy prediction.<n>The lightweight network enables 4D-ROLLS model to achieve fast inference speeds at about 30 Hz on a 4060 GPU.
arXiv Detail & Related papers (2025-05-20T04:12:44Z)
TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion [54.46664104437454]
We propose TacoDepth, an efficient and accurate Radar-Camera depth estimation model with one-stage fusion. Specifically, the graph-based Radar structure extractor and the pyramid-based Radar fusion module are designed. Compared with the previous state-of-the-art approach, TacoDepth improves depth accuracy and processing speed by 12.8% and 91.8%.
arXiv Detail & Related papers (2025-04-16T05:25:04Z)
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training [48.87063562819018]
We introduce Easi3R, a simple yet efficient training-free method for 4D reconstruction.<n>Our approach applies attention adaptation during inference, eliminating the need for from-scratch pre-training or network fine-tuning.<n>Our experiments on real-world dynamic videos demonstrate that our lightweight attention adaptation significantly outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2025-03-31T17:59:58Z)
RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection [68.99784784185019]
Poor lighting or adverse weather conditions degrade camera performance. Radar suffers from noise and positional ambiguity. We propose RobuRCDet, a robust object detection model in BEV.
arXiv Detail & Related papers (2025-02-18T17:17:38Z)
TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection [6.163747364795787]
We present TransRAD, a novel 3D radar object detection model. We propose Location-Aware NMS to mitigate the common issue of duplicate bounding boxes in deep radar object detection. Results demonstrate that TransRAD outperforms state-of-the-art methods in both 2D and 3D radar detection tasks.
arXiv Detail & Related papers (2025-01-29T20:21:41Z)
SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection [16.127926058992237]
We propose a novel Semi-supervised Cross-modality Knowledge Distillation (SCKD) method for 4D radar-based 3D object detection.<n>It characterizes the capability of learning the feature from a Lidar-radar-fused teacher network with semi-supervised distillation.<n>With the same network structure, our radar-only student trained by SCKD boosts the mAP by 10.38% over the baseline.
arXiv Detail & Related papers (2024-12-19T06:42:25Z)
RadarPillars: Efficient Object Detection from 4D Radar Point Clouds [42.9356088038035]
We present RadarPillars, a pillar-based object detection network. By decomposing radial velocity data, RadarPillars significantly outperform state-of-the-art detection results on the View-of-Delft dataset. This comes at a significantly reduced parameter count, surpassing existing methods in terms of efficiency and enabling real-time performance on edge devices.
arXiv Detail & Related papers (2024-08-09T12:13:38Z)
RadarOcc: Robust 3D Occupancy Prediction with 4D Imaging Radar [15.776076554141687]
3D occupancy-based perception pipeline has significantly advanced autonomous driving. Current methods rely on LiDAR or camera inputs for 3D occupancy prediction. We introduce a novel approach that utilizes 4D imaging radar sensors for 3D occupancy prediction.
arXiv Detail & Related papers (2024-05-22T21:48:17Z)
Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar [62.51065633674272]
We introduce Radar Fields - a neural scene reconstruction method designed for active radar imagers. Our approach unites an explicit, physics-informed sensor model with an implicit neural geometry and reflectance model to directly synthesize raw radar measurements. We validate the effectiveness of the method across diverse outdoor scenarios, including urban scenes with dense vehicles and infrastructure.
arXiv Detail & Related papers (2024-05-07T20:44:48Z)
Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data [68.18735997052265]
We propose a balanced approach that combines the advantages of monocular and point cloud-based 3D detection. Our method requires only a small number of 3D points, that can be obtained from a low-cost, low-resolution sensor. The accuracy of 3D detection improves by 20% compared to the state-of-the-art monocular detection methods.
arXiv Detail & Related papers (2024-04-10T03:54:53Z)
MVFAN: Multi-View Feature Assisted Network for 4D Radar Object Detection [15.925365473140479]
4D radar is recognized for its resilience and cost-effectiveness under adverse weather conditions. Unlike LiDAR and cameras, radar remains unimpaired by harsh weather conditions. We propose a framework for developing radar-based 3D object detection for autonomous vehicles.
arXiv Detail & Related papers (2023-10-25T06:10:07Z)
4DRVO-Net: Deep 4D Radar-Visual Odometry Using Multi-Modal and Multi-Scale Adaptive Fusion [2.911052912709637]
Four-dimensional (4D) radar--visual odometry (4DRVO) integrates complementary information from 4D radar and cameras. 4DRVO may exhibit significant tracking errors owing to sparsity of 4D radar point clouds. We present 4DRVO-Net, which is a method for 4D radar--visual odometry.
arXiv Detail & Related papers (2023-08-12T14:00:09Z)
RadarFormer: Lightweight and Accurate Real-Time Radar Object Detection Model [13.214257841152033]
Radar-centric data sets do not get a lot of attention in the development of deep learning techniques for radar perception. We propose a transformers-based model, named RadarFormer, that utilizes state-of-the-art developments in vision deep learning. Our model also introduces a channel-chirp-time merging module that reduces the size and complexity of our models by more than 10 times without compromising accuracy.
arXiv Detail & Related papers (2023-04-17T17:07:35Z)
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector. The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference. Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
EGFN: Efficient Geometry Feature Network for Fast Stereo 3D Object Detection [51.52496693690059]
Fast stereo based 3D object detectors lag far behind high-precision oriented methods in accuracy. We argue that the main reason is the missing or poor 3D geometry feature representation in fast stereo based methods. The proposed EGFN outperforms YOLOStsereo3D, the advanced fast method, by 5.16% on mAP$_3d$ at the cost of merely additional 12 ms.
arXiv Detail & Related papers (2021-11-28T05:25:36Z)
R4Dyn: Exploring Radar for Self-Supervised Monocular Depth Estimation of Dynamic Scenes [69.6715406227469]
Self-supervised monocular depth estimation in driving scenarios has achieved comparable performance to supervised approaches. We present R4Dyn, a novel set of techniques to use cost-efficient radar data on top of a self-supervised depth estimation framework.
arXiv Detail & Related papers (2021-08-10T17:57:03Z)
Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection [59.765645791588454]
Recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space for the intermediate representation of object without depth supervision. We propose a shape prior non-uniform sampling strategy that performs dense sampling in outer region and sparse sampling in inner region. Our proposed method has 2.57% improvement on AP3d almost without extra network parameters.
arXiv Detail & Related papers (2021-06-18T09:14:55Z)
LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion [52.59664614744447]
We present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps. automotive radar provides rich, complementary information, allowing for longer range vehicle detection as well as instantaneous velocity measurements.
arXiv Detail & Related papers (2020-10-02T00:13:00Z)
RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects [73.80316195652493]
We tackle the problem of exploiting Radar for perception in the context of self-driving cars. We propose a new solution that exploits both LiDAR and Radar sensors for perception. Our approach, dubbed RadarNet, features a voxel-based early fusion and an attention-based late fusion.
arXiv Detail & Related papers (2020-07-28T17:15:02Z)
Towards Reading Beyond Faces for Sparsity-Aware 4D Affect Recognition [55.15661254072032]
We present a sparsity-aware deep network for automatic 4D facial expression recognition (FER) We first propose a novel augmentation method to combat the data limitation problem for deep learning. We then present a sparsity-aware deep network to compute the sparse representations of convolutional features over multi-views.
arXiv Detail & Related papers (2020-02-08T13:09:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.