Rethinking the Encoding and Annotating of 3D Bounding Box: Corner-Aware 3D Object Detection from Point Clouds
- URL: http://arxiv.org/abs/2511.17619v1
- Date: Tue, 18 Nov 2025 13:49:30 GMT
- Title: Rethinking the Encoding and Annotating of 3D Bounding Box: Corner-Aware 3D Object Detection from Point Clouds
- Authors: Qinghao Meng, Junbo Yin, Jianbing Shen, Yunde Jia
- Abstract summary: Center-aligned regression remains dominant in LiDAR-based 3D object detection, yet it suffers from fundamental instability. We propose corner-aligned regression, which shifts the prediction target from unstable centers to geometrically informative corners. We design a simple yet effective corner-aware detection head that can be plugged into existing detectors.
- Score: 69.84768614060559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Center-aligned regression remains dominant in LiDAR-based 3D object detection, yet it suffers from fundamental instability: object centers often fall in sparse or empty regions of the bird's-eye-view (BEV) due to the front-surface-biased nature of LiDAR point clouds, leading to noisy and inaccurate bounding box predictions. To circumvent this limitation, we revisit bounding box representation and propose corner-aligned regression, which shifts the prediction target from unstable centers to geometrically informative corners that reside in dense, observable regions. Leveraging the inherent geometric constraints among corners and image 2D boxes, partial parameters of 3D bounding boxes can be recovered from corner annotations, enabling a weakly supervised paradigm without requiring complete 3D labels. We design a simple yet effective corner-aware detection head that can be plugged into existing detectors. Experiments on KITTI show our method improves performance by 3.5% AP over the center-based baseline, and achieves 83% of fully supervised accuracy using only BEV corner clicks, demonstrating the effectiveness of our corner-aware regression strategy.
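The relationship between the center-based and corner-based box encodings described in the abstract can be illustrated with a small geometric sketch. This is not the paper's detection head or its annotation pipeline; the function names (`bev_corners`, `nearest_corner`, `box_from_corners`) are hypothetical, and using the corner nearest to the sensor as a stand-in for a corner in a "dense, observable region" is an assumption made only for illustration.

```python
import math

def bev_corners(cx, cy, length, width, yaw):
    """Return the four BEV corners of a box given center, size, and yaw (radians)."""
    c, s = math.cos(yaw), math.sin(yaw)
    # Corner offsets in the box frame: x along the length, y along the width.
    local = [(length / 2, width / 2), (length / 2, -width / 2),
             (-length / 2, -width / 2), (-length / 2, width / 2)]
    # Rotate each offset by yaw and translate by the center.
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy) for dx, dy in local]

def nearest_corner(corners, sensor=(0.0, 0.0)):
    """Pick the corner closest to the sensor (assumed proxy for an observable corner)."""
    return min(corners, key=lambda p: math.hypot(p[0] - sensor[0], p[1] - sensor[1]))

def box_from_corners(corners):
    """Recover (cx, cy, length, width, yaw) from corners ordered as in bev_corners."""
    c0, c1, _, c3 = corners
    cx = sum(p[0] for p in corners) / 4.0
    cy = sum(p[1] for p in corners) / 4.0
    length = math.hypot(c0[0] - c3[0], c0[1] - c3[1])  # c0 and c3 differ along the length axis
    width = math.hypot(c0[0] - c1[0], c0[1] - c1[1])   # c0 and c1 differ along the width axis
    yaw = math.atan2(c0[1] - c3[1], c0[0] - c3[0])     # direction of the length axis
    return cx, cy, length, width, yaw

# Example: a 4 m x 2 m box centered 10 m ahead of a sensor at the origin.
corners = bev_corners(10.0, 2.0, 4.0, 2.0, 0.3)
print(nearest_corner(corners))
print(box_from_corners(corners))
```

The round trip shows that the two encodings carry the same BEV information when all four corners are known; the corner-based target differs in where the regressed point sits relative to the observed LiDAR points.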
Related papers
- SPAN: Spatial-Projection Alignment for Monocular 3D Object Detection [49.12928389918159]
Existing monocular 3D detectors typically tame the pronounced nonlinear regression of 3D bounding boxes through a decoupled prediction paradigm. We propose a novel Spatial-Projection Alignment (SPAN) with two pivotal components. SPAN enforces an explicit global spatial constraint between the predicted and ground-truth 3D bounding boxes, thereby rectifying spatial drift caused by decoupled attribute regression. 3D-2D Projection Alignment ensures that the projected 3D box is aligned tightly within its corresponding 2D detection bounding box on the image plane, mitigating projection misalignment overlooked in previous works.
arXiv Detail & Related papers (2025-11-10T04:48:48Z)
- RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection [0.4604003661048266]
Bird's-eye view (BEV)-based perception approaches have emerged as superior alternatives to perspective-based solutions. We propose Restricted Quadrilateral Representation to define 3D regression targets. RQR3D regresses the smallest horizontal bounding box encapsulating the oriented box, along with the offsets between the corners of these two boxes.
arXiv Detail & Related papers (2025-05-23T10:52:34Z)
- CornerPoint3D: Look at the Nearest Corner Instead of the Center [7.293031759018836]
3D object detection aims to predict object centers, dimensions, and rotations from LiDAR point clouds. LiDAR captures only the near side of objects, making center-based detectors prone to poor localization accuracy in cross-domain tasks. We propose a novel 3D object detector, coined as CornerPoint3D, which is built upon CenterPoint and uses heatmaps to supervise the learning and detection of the nearest corner of each object.
arXiv Detail & Related papers (2025-04-03T10:33:43Z)
- OriCon3D: Effective 3D Object Detection using Orientation and Confidence [0.0]
We propose an advanced methodology for the detection of 3D objects from a single image.
We use a deep convolutional neural network-based 3D object weighted orientation regression paradigm.
Our approach significantly improves the accuracy of 3D object pose determination, surpassing baseline methodologies.
arXiv Detail & Related papers (2023-04-27T19:52:47Z)
- OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection [51.153003057515754]
OPA-3D is a single-stage, end-to-end, Occlusion-Aware Pixel-Wise Aggregation network.
It jointly estimates dense scene depth with depth-bounding box residuals and object bounding boxes.
It outperforms state-of-the-art methods on the main Car category.
arXiv Detail & Related papers (2022-11-02T14:19:13Z)
- ImpDet: Exploring Implicit Fields for 3D Object Detection [74.63774221984725]
We introduce a new perspective that views bounding box regression as an implicit function.
This leads to our proposed framework, termed Implicit Detection or ImpDet.
Our ImpDet assigns specific values to points in different local 3D spaces, so that high-quality boundaries can be generated.
arXiv Detail & Related papers (2022-03-31T17:52:12Z)
- CG-SSD: Corner Guided Single Stage 3D Object Detection from LiDAR Point Cloud [4.110053032708927]
In a real-world scene, LiDAR can only acquire a limited point cloud of an object's surface, while the center point of the object does not exist in the data.
We propose a corner-guided anchor-free single-stage 3D object detection model (CG-SSD).
CG-SSD achieves state-of-the-art performance on the ONCE benchmark for supervised 3D object detection using single-frame point cloud data.
arXiv Detail & Related papers (2022-02-24T02:30:15Z)
- Anchor-free 3D Single Stage Detector with Mask-Guided Attention for Point Cloud [79.39041453836793]
We develop a novel single-stage 3D detector for point clouds in an anchor-free manner.
We overcome this by converting the voxel-based sparse 3D feature volumes into the sparse 2D feature maps.
We propose an IoU-based detection confidence re-calibration scheme to improve the correlation between the detection confidence score and the accuracy of the bounding box regression.
arXiv Detail & Related papers (2021-08-08T13:42:13Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.