Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement
- URL: http://arxiv.org/abs/2408.07999v2
- Date: Fri, 15 Nov 2024 04:09:34 GMT
- Title: Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement
- Authors: Wenxuan Li, Qin Zou, Chi Chen, Bo Du, Long Chen, Jian Zhou, Hongkai Yu
- Abstract summary: 3D object detection in driving scenarios faces the challenge of complex road environments.
We propose an advanced detection framework called Co-Fix3D.
Co-Fix3D integrates Local and Global Enhancement (LGE) modules to refine Bird's Eye View (BEV) features.
- Abstract: 3D object detection in driving scenarios faces the challenge of complex road environments, which can lead to the loss or incompleteness of key features, thereby affecting perception performance. To address this issue, we propose an advanced detection framework called Co-Fix3D. Co-Fix3D integrates Local and Global Enhancement (LGE) modules to refine Bird's Eye View (BEV) features. The LGE module uses Discrete Wavelet Transform (DWT) for pixel-level local optimization and incorporates an attention mechanism for global optimization. To handle varying detection difficulties, we adopt multi-head LGE modules, enabling each module to focus on targets with different levels of detection complexity, thus further enhancing overall perception capability. Experimental results show that on the nuScenes dataset's LiDAR benchmark, Co-Fix3D achieves 69.4% mAP and 73.5% NDS, while on the multimodal benchmark, it achieves 72.3% mAP and 74.7% NDS. The source code is publicly available at https://github.com/rubbish001/Co-Fix3d.
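The abstract's LGE design pairs a wavelet decomposition (local, pixel-level) with attention (global). As a rough illustrative sketch only — not the authors' implementation, and with all function names assumed — a single-level Haar DWT splits a BEV feature map into low/high-frequency sub-bands for local refinement, while a plain self-attention pass mixes information globally:

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT: split a feature map into LL/LH/HL/HH sub-bands."""
    lo = (x[0::2, :] + x[1::2, :]) / 2.0   # row-wise average (low-pass)
    hi = (x[0::2, :] - x[1::2, :]) / 2.0   # row-wise difference (high-pass)
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (exact reconstruction)."""
    H, W = ll.shape
    lo = np.zeros((H, 2 * W)); hi = np.zeros((H, 2 * W))
    lo[:, 0::2] = ll + lh; lo[:, 1::2] = ll - lh
    hi[:, 0::2] = hl + hh; hi[:, 1::2] = hl - hh
    x = np.zeros((2 * H, 2 * W))
    x[0::2, :] = lo + hi
    x[1::2, :] = lo - hi
    return x

def global_attention(x):
    """Toy single-channel self-attention over flattened BEV positions (Q = K = V)."""
    tokens = x.reshape(-1, 1)                       # (H*W, 1) tokens
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return (weights @ tokens).reshape(x.shape)      # globally mixed features
```

In practice the paper's modules would operate on multi-channel learned BEV features with trainable projections; this sketch only makes the local-frequency vs. global-mixing split concrete.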
Related papers
- GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection [36.37236815038332]
We propose a novel multi-modality 3D object detection method, named GAFusion, with LiDAR-guided global interaction and adaptive fusion.
GAFusion achieves state-of-the-art 3D object detection results with 73.6% mAP and 74.9% NDS on the nuScenes test set.
arXiv Detail & Related papers (2024-11-01T03:40:24Z) - GOOD: General Optimization-based Fusion for 3D Object Detection via LiDAR-Camera Object Candidates [10.534984939225014]
3D object detection serves as the core basis of the perception tasks in autonomous driving.
GOOD is a general optimization-based fusion framework that can achieve satisfying detection without training additional models.
Experiments on both the nuScenes and KITTI datasets show that GOOD outperforms PointPillars by 9.1% in mAP.
arXiv Detail & Related papers (2023-03-17T07:05:04Z) - SSDA3D: Semi-supervised Domain Adaptation for 3D Object Detection from Point Cloud [125.9472454212909]
We present a novel Semi-Supervised Domain Adaptation method for 3D object detection (SSDA3D).
SSDA3D includes an Inter-domain Adaptation stage and an Intra-domain Generalization stage.
Experiments show that, with only 10% labeled target data, SSDA3D can surpass the fully-supervised oracle model trained with 100% of the target labels.
arXiv Detail & Related papers (2022-12-06T09:32:44Z) - 3DLG-Detector: 3D Object Detection via Simultaneous Local-Global Feature Learning [15.995277437128452]
Capturing both local and global features of irregular point clouds is essential to 3D object detection (3OD).
This paper explores new modules to simultaneously learn local-global features of scene point clouds that serve 3OD positively.
We propose an effective 3OD network via simultaneous local-global feature learning (dubbed 3DLG-Detector).
arXiv Detail & Related papers (2022-08-31T12:23:40Z) - Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z) - Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z) - Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z) - ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection [69.68263074432224]
We present a novel framework named ZoomNet for stereo imagery-based 3D detection.
The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes.
To further exploit the abundant texture cues in RGB images for more accurate disparity estimation, we introduce a conceptually straightforward module -- adaptive zooming.
arXiv Detail & Related papers (2020-03-01T17:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.