Perspective-Invariant 3D Object Detection
- URL: http://arxiv.org/abs/2507.17665v1
- Date: Wed, 23 Jul 2025 16:29:57 GMT
- Title: Perspective-Invariant 3D Object Detection
- Authors: Ao Liang, Lingdong Kong, Dongyue Lu, Youquan Liu, Jian Fang, Huaici Zhao, Wei Tsang Ooi
- Abstract summary: We introduce Pi3DET, the first benchmark featuring LiDAR data and 3D bounding box annotations collected from multiple platforms. We propose a novel cross-platform adaptation framework that transfers knowledge from the well-studied vehicle platform to other platforms. We establish a benchmark to evaluate the resilience and robustness of current 3D detectors in cross-platform scenarios.
- Score: 6.7936328711743
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With the rise of robotics, LiDAR-based 3D object detection has garnered significant attention in both academia and industry. However, existing datasets and methods predominantly focus on vehicle-mounted platforms, leaving other autonomous platforms underexplored. To bridge this gap, we introduce Pi3DET, the first benchmark featuring LiDAR data and 3D bounding box annotations collected from multiple platforms: vehicle, quadruped, and drone, thereby facilitating research in 3D object detection for non-vehicle platforms as well as cross-platform 3D detection. Based on Pi3DET, we propose a novel cross-platform adaptation framework that transfers knowledge from the well-studied vehicle platform to other platforms. This framework achieves perspective-invariant 3D detection through robust alignment at both geometric and feature levels. Additionally, we establish a benchmark to evaluate the resilience and robustness of current 3D detectors in cross-platform scenarios, providing valuable insights for developing adaptive 3D perception systems. Extensive experiments validate the effectiveness of our approach on challenging cross-platform tasks, demonstrating substantial gains over existing adaptation methods. We hope this work paves the way for generalizable and unified 3D perception systems across diverse and complex environments. Our Pi3DET dataset, cross-platform benchmark suite, and annotation toolkit have been made publicly available.
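The abstract describes alignment at both the geometric and feature levels, but gives no implementation details. The following is a minimal, illustrative sketch of what such a two-level alignment could look like: a ground-plane normalization that reduces pose and height differences between vehicle, quadruped, and drone viewpoints, plus a CORAL-style feature-statistics loss as a generic stand-in for feature-level alignment. All function names and design choices here are assumptions, not the Pi3DET implementation.

```python
# Illustrative sketch only; names and choices are hypothetical, not Pi3DET code.
import numpy as np

def estimate_ground_normal(points, z_quantile=0.05):
    """Rough ground estimate: least-squares plane through the lowest points."""
    ground = points[points[:, 2] <= np.quantile(points[:, 2], z_quantile)]
    A = np.c_[ground[:, 0], ground[:, 1], np.ones(len(ground))]
    (a, b, _), *_ = np.linalg.lstsq(A, ground[:, 2], rcond=None)
    n = np.array([-a, -b, 1.0])
    return n / np.linalg.norm(n)

def geometric_align(points):
    """Rotate so the ground normal matches +z and shift the ground near z = 0,
    reducing pose/height differences between vehicle, quadruped, and drone scans."""
    normal = estimate_ground_normal(points)
    z_axis = np.array([0.0, 0.0, 1.0])
    v = np.cross(normal, z_axis)
    s, c = np.linalg.norm(v), np.dot(normal, z_axis)
    if s < 1e-8:
        R = np.eye(3)
    else:
        vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
        R = np.eye(3) + vx + vx @ vx * ((1 - c) / s ** 2)  # Rodrigues formula
    aligned = points @ R.T
    aligned[:, 2] -= np.quantile(aligned[:, 2], 0.05)  # put the ground near z = 0
    return aligned

def coral_loss(feat_src, feat_tgt):
    """Feature-level alignment stand-in: match feature covariances (CORAL-style)."""
    def cov(f):
        f = f - f.mean(axis=0, keepdims=True)
        return f.T @ f / (len(f) - 1)
    d = feat_src.shape[1]
    return np.sum((cov(feat_src) - cov(feat_tgt)) ** 2) / (4 * d * d)
```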
Related papers
- 3EED: Ground Everything Everywhere in 3D [64.28938045702077]
3EED is a multi-platform, multi-modal 3D grounding benchmark featuring RGB and LiDAR data from vehicle, drone, and quadruped platforms. We provide over 128,000 objects and 22,000 validated referring expressions across diverse outdoor scenes -- 10x larger than existing datasets.
arXiv Detail & Related papers (2025-11-03T17:05:22Z)
- HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective [11.841338298700421]
We propose a novel 3D object detection framework integrating Spatial Former and Voxel Pooling Former to enhance 2D-to-3D projection based on height estimation.
Experiments were conducted on the Rope3D and DAIR-V2X-I datasets, and the results demonstrate that the proposed algorithm outperforms existing methods in detecting both vehicles and cyclists.
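The blurb mentions height-based 2D-to-3D projection from a roadside camera. Below is a minimal sketch of the underlying geometric idea only, assuming a pinhole camera at known height above a flat ground plane with zero roll and pitch; it does not reproduce the paper's Spatial Former or Voxel Pooling Former modules, and K and cam_height are assumed inputs.

```python
# Minimal geometric sketch, not the paper's modules.
import numpy as np

def lift_ground_pixel(u, v, K, cam_height):
    """Back-project an image point assumed to lie on the ground plane.

    Camera frame: x right, y down, z forward; the ground plane is y = cam_height.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    if v <= cy:
        raise ValueError("Point is at or above the horizon; no ground intersection.")
    z = fy * cam_height / (v - cy)   # depth from similar triangles
    x = (u - cx) * z / fx
    y = cam_height                   # on the ground plane
    return np.array([x, y, z])

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
print(lift_ground_pixel(1100.0, 700.0, K, cam_height=5.0))
```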
arXiv Detail & Related papers (2024-10-10T09:37:33Z)
- CatFree3D: Category-agnostic 3D Object Detection with Diffusion [63.75470913278591]
We introduce a novel pipeline that decouples 3D detection from 2D detection and depth prediction.
We also introduce the Normalised Hungarian Distance (NHD) metric for an accurate evaluation of 3D detection results.
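The exact Normalised Hungarian Distance formulation is not given in this summary. As an illustration of the matching step only, the sketch below pairs predictions with ground truth via the Hungarian algorithm (scipy's linear_sum_assignment) and normalizes center distances by the ground-truth box diagonal; that normalization is an assumption, not the paper's definition.

```python
# Illustrative matching sketch; the normalization choice is an assumption.
import numpy as np
from scipy.optimize import linear_sum_assignment

def normalized_matching_distance(pred_centers, gt_centers, gt_sizes):
    diag = np.linalg.norm(gt_sizes, axis=1)                                   # (M,)
    dists = np.linalg.norm(pred_centers[:, None, :] - gt_centers[None, :, :], axis=-1)
    cost = dists / diag[None, :]                                              # (N, M)
    rows, cols = linear_sum_assignment(cost)                                  # optimal 1-to-1 pairing
    return cost[rows, cols].mean(), list(zip(rows.tolist(), cols.tolist()))

pred = np.array([[1.0, 2.0, 0.5], [10.0, 4.0, 0.4]])
gt = np.array([[1.2, 2.1, 0.5], [9.5, 4.2, 0.5]])
sizes = np.array([[4.0, 1.8, 1.5], [4.2, 1.9, 1.6]])
print(normalized_matching_distance(pred, gt, sizes))
```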
arXiv Detail & Related papers (2024-08-22T22:05:57Z)
- TAPVid-3D: A Benchmark for Tracking Any Point in 3D [63.060421798990845]
We introduce a new benchmark, TAPVid-3D, for evaluating the task of Tracking Any Point in 3D.
This benchmark will serve as a guidepost to improve our ability to understand precise 3D motion and surface deformation from monocular video.
arXiv Detail & Related papers (2024-07-08T13:28:47Z)
- Every Dataset Counts: Scaling up Monocular 3D Object Detection with Joint Datasets Training [9.272389295055271]
This study investigates the pipeline for training a monocular 3D object detection model on a diverse collection of 3D and 2D datasets.
The proposed framework comprises three components: (1) a robust monocular 3D model capable of functioning across various camera settings, (2) a selective-training strategy to accommodate datasets with differing class annotations, and (3) a pseudo 3D training approach using 2D labels to enhance detection performance in scenes containing only 2D labels.
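Component (3) above trains with 2D labels only. A minimal sketch of that general idea follows: project a predicted 3D box into the image and supervise its enclosing rectangle with the available 2D label. Helper names and the simplified (yaw-free) box are illustrative assumptions, not the paper's code.

```python
# Sketch of pseudo-3D supervision from 2D labels; names are illustrative.
import torch

def box3d_corners(center, dims):
    """Axis-aligned 3D box corners in the camera frame (yaw omitted for brevity)."""
    dx, dy, dz = dims / 2.0
    offsets = torch.tensor([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
                           dtype=center.dtype) * torch.stack([dx, dy, dz])
    return center + offsets                                          # (8, 3)

def project_to_2d_box(corners, K):
    uv = (K @ corners.T).T                                           # (8, 3)
    uv = uv[:, :2] / uv[:, 2:3]
    return torch.cat([uv.min(dim=0).values, uv.max(dim=0).values])   # [x1, y1, x2, y2]

def pseudo_2d_loss(center, dims, K, gt_box2d):
    """Penalize disagreement between the projected 3D box and the 2D label."""
    pred_box2d = project_to_2d_box(box3d_corners(center, dims), K)
    return torch.nn.functional.l1_loss(pred_box2d, gt_box2d)
```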
arXiv Detail & Related papers (2023-10-02T06:17:24Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
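As a rough illustration of the general idea of pairing a detection head with a semantic-occupancy head on shared BEV features (see the sketch below), here is a minimal PyTorch module; the layer sizes, head shapes, and names are hypothetical and do not reproduce the SOGDet architecture.

```python
# Hypothetical dual-head sketch; not the SOGDet architecture.
import torch
import torch.nn as nn

class DetectionWithOccupancy(nn.Module):
    def __init__(self, in_channels=256, num_classes=10, occ_classes=16, box_dim=9, num_z_bins=8):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(inplace=True))
        # Per-BEV-cell detection outputs: class scores and box regression.
        self.det_cls = nn.Conv2d(256, num_classes, 1)
        self.det_reg = nn.Conv2d(256, box_dim, 1)
        # Semantic occupancy over a height-expanded grid (occ_classes per z bin).
        self.occ = nn.Conv2d(256, occ_classes * num_z_bins, 1)

    def forward(self, bev_feats):                     # (B, C, H, W)
        x = self.shared(bev_feats)
        return {"cls": self.det_cls(x), "reg": self.det_reg(x), "occupancy": self.occ(x)}

out = DetectionWithOccupancy()(torch.randn(2, 256, 128, 128))
print({k: tuple(v.shape) for k, v in out.items()})
```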
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.493213289990962]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection. We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
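For readers unfamiliar with contrastive pre-training, the sketch below shows a generic InfoNCE-style loss over patch embeddings from two augmented views of the same point cloud; it illustrates the family of methods only, and is not the exact PatchContrast objective or architecture.

```python
# Generic InfoNCE contrastive loss over patch embeddings; not the exact PatchContrast loss.
import torch
import torch.nn.functional as F

def patch_info_nce(emb_a, emb_b, temperature=0.07):
    """emb_a, emb_b: (N, D) embeddings of the same N patches under two augmented views."""
    a = F.normalize(emb_a, dim=1)
    b = F.normalize(emb_b, dim=1)
    logits = a @ b.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(len(a))            # positives lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = patch_info_nce(torch.randn(32, 128), torch.randn(32, 128))
print(float(loss))
```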
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- High-level camera-LiDAR fusion for 3D object detection with machine learning [0.0]
This paper tackles the 3D object detection problem, which is of vital importance for applications such as autonomous driving.
It uses a Machine Learning pipeline on a combination of monocular camera and LiDAR data to detect vehicles in the surrounding 3D space of a moving platform.
Our results demonstrate efficient and accurate inference on a validation set, achieving an overall accuracy of 87.1%.
arXiv Detail & Related papers (2021-05-24T01:57:34Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
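The association step mentioned above can be illustrated with a generic tracking-by-detection sketch: combine appearance similarity and center distance into a cost matrix and solve it with the Hungarian algorithm. This is a common baseline pattern, not the paper's quasi-dense similarity learning; the weights and threshold are arbitrary illustrative values.

```python
# Generic association sketch; not the paper's quasi-dense similarity head.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_feats, track_centers, det_feats, det_centers,
              w_app=1.0, w_dist=0.1, max_cost=0.5):
    """track_feats/det_feats: L2-normalized (T, D)/(D_, D); centers: (T, 3)/(D_, 3)."""
    app = track_feats @ det_feats.T                                  # cosine similarity
    dist = np.linalg.norm(track_centers[:, None] - det_centers[None, :], axis=-1)
    cost = -w_app * app + w_dist * dist                              # lower is better
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_cost]
    unmatched_dets = set(range(len(det_feats))) - {c for _, c in matches}
    return matches, sorted(unmatched_dets)                           # leftovers seed new tracks
```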
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a new domain adaptive self-training pipeline, named ST3D, for unsupervised domain adaptation on 3D object detection from point clouds.
Our ST3D achieves state-of-the-art performance on all evaluated datasets and even surpasses fully supervised results on KITTI 3D object detection benchmark.
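To make the self-training idea concrete, here is a hedged sketch of the generic loop behind such pipelines: pseudo-label the unlabeled target domain with a confidence threshold, then retrain on the pseudo-labels. `model.predict` and `model.train_step` are hypothetical placeholders, and the actual ST3D method includes further components (e.g., pseudo-label memory updating and curriculum data augmentation) not shown here.

```python
# Hedged sketch of a generic self-training loop; model API is hypothetical.
def self_training(model, target_frames, rounds=3, score_thresh=0.6, epochs=5):
    for _ in range(rounds):
        # 1) Generate pseudo-labels on the unlabeled target domain.
        pseudo_labeled = []
        for frame in target_frames:
            boxes, scores = model.predict(frame)          # hypothetical API
            keep = scores >= score_thresh
            if keep.any():
                pseudo_labeled.append((frame, boxes[keep]))
        # 2) Retrain the detector on the pseudo-labeled target data.
        for _ in range(epochs):
            for frame, boxes in pseudo_labeled:
                model.train_step(frame, boxes)            # hypothetical API
    return model
```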
arXiv Detail & Related papers (2021-03-09T10:51:24Z)
- BirdNet+: End-to-End 3D Object Detection in LiDAR Bird's Eye View [117.44028458220427]
On-board 3D object detection in autonomous vehicles often relies on geometry information captured by LiDAR devices.
We present a fully end-to-end 3D object detection framework that can infer oriented 3D boxes solely from BEV images.
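Detecting from BEV images presupposes a rasterization of the LiDAR point cloud. The sketch below shows a common BirdNet-style encoding with max-height, mean-intensity, and density channels; the grid extents, resolution, and normalization are illustrative choices, not necessarily the paper's exact configuration.

```python
# Common BEV rasterization sketch; parameters are illustrative choices.
import numpy as np

def lidar_to_bev(points, x_range=(0.0, 80.0), y_range=(-40.0, 40.0), cell=0.1):
    """points: (N, 4) array of x, y, z, intensity -> (3, H, W) BEV image."""
    H = int((x_range[1] - x_range[0]) / cell)
    W = int((y_range[1] - y_range[0]) / cell)
    bev = np.zeros((3, H, W), dtype=np.float32)            # height, intensity, density
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]
    rows = ((pts[:, 0] - x_range[0]) / cell).astype(int)
    cols = ((pts[:, 1] - y_range[0]) / cell).astype(int)
    for r, c, z, i in zip(rows, cols, pts[:, 2], pts[:, 3]):
        bev[0, r, c] = max(bev[0, r, c], z)                 # max height (assumes z >= 0)
        bev[1, r, c] += i                                    # accumulate intensity
        bev[2, r, c] += 1.0                                  # point count
    nonzero = bev[2] > 0
    bev[1][nonzero] /= bev[2][nonzero]                       # mean intensity per cell
    bev[2] = np.minimum(1.0, np.log1p(bev[2]) / np.log(64))  # normalized density
    return bev
```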
arXiv Detail & Related papers (2020-03-09T15:08:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.