CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection
- URL: http://arxiv.org/abs/2209.06641v1
- Date: Tue, 13 Sep 2022 05:26:09 GMT
- Title: CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection
- Authors: Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Muhammad
Anwer, Hisham Cholakkal
- Abstract summary: We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
- Score: 57.44434974289945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing deep learning-based 3D object detectors typically rely on the
appearance of individual objects and do not explicitly pay attention to the
rich contextual information of the scene. In this work, we propose the
Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D)
framework, which takes a 3D scene as input and strives to explicitly integrate
useful contextual information of the scene at multiple levels to predict a set
of object bounding-boxes along with their corresponding semantic labels. To
this end, we propose to utilize a context enhancement network that captures the
contextual information at different levels of granularity followed by a
multi-stage refinement module to progressively refine the box positions and
class predictions. Extensive experiments on the large-scale ScanNetV2 benchmark
reveal the benefits of our proposed method, leading to an absolute improvement
of 2.0% over the baseline. In addition to 3D object detection, we investigate
the effectiveness of our CMR3D framework for the problem of 3D object counting.
Our source code will be publicly released.
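The two components named in the abstract, a context enhancement network that captures scene context at multiple granularities and a multi-stage refinement module that progressively adjusts box predictions, can be illustrated with a minimal, framework-free sketch. The function names and the residual-delta formulation below are hypothetical and only show the control flow; the paper's actual modules are learned end to end and are not reproduced here:

```python
import numpy as np

def context_enhance(point_feats, num_levels=2):
    # Hypothetical context enhancement: pool features at several
    # granularities (coarser subsets of the scene) and concatenate
    # the pooled context back onto every point feature.
    feats = [point_feats]
    for level in range(num_levels):
        k = max(1, len(point_feats) // (2 ** (level + 1)))
        # coarse pooling: mean over a subset of the point set
        pooled = point_feats[:k].mean(axis=0, keepdims=True)
        feats.append(np.repeat(pooled, len(point_feats), axis=0))
    return np.concatenate(feats, axis=1)

def multi_stage_refine(boxes, stage_deltas):
    # Hypothetical progressive refinement: each stage predicts a
    # residual (dx, dy, dz, dw, dh, dd) that is added to the current
    # box estimate, so later stages correct earlier ones.
    for deltas in stage_deltas:
        boxes = boxes + deltas
    return boxes
```

In this sketch the per-stage deltas stand in for the outputs of learned refinement heads; the key idea is only that each stage starts from the previous stage's boxes rather than regressing from scratch.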
Related papers
- MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations [55.022519020409405]
This paper builds the first largest ever multi-modal 3D scene dataset and benchmark with hierarchical grounded language annotations, MMScan.
The resulting multi-modal 3D dataset encompasses 1.4M meta-annotated captions on 109k objects and 7.7k regions as well as over 3.04M diverse samples for 3D visual grounding and question-answering benchmarks.
arXiv Detail & Related papers (2024-06-13T17:59:30Z)
- Multi-View Attentive Contextualization for Multi-View 3D Object Detection [19.874148893464607]
We present Multi-View Attentive Contextualization (MvACon), a simple yet effective method for improving 2D-to-3D feature lifting in query-based multi-view 3D (MV3D) object detection.
In experiments, the proposed MvACon is thoroughly tested on the nuScenes benchmark, using both BEVFormer and its recent 3D deformable attention (DFA3D) variant, as well as PETR.
arXiv Detail & Related papers (2024-05-20T17:37:10Z)
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP).
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby aiding the development of more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
- Surface-biased Multi-Level Context 3D Object Detection [1.9723551683930771]
This work addresses the object detection task in 3D point clouds using a highly efficient, surface-biased feature extraction method (wang2022rbgnet).
We propose a 3D object detector that extracts accurate feature representations of object candidates and leverages self-attention on point patches, object candidates, and the global scene.
arXiv Detail & Related papers (2023-02-13T11:50:04Z)
- OA-BEV: Bringing Object Awareness to Bird's-Eye-View Representation for Multi-Camera 3D Object Detection [78.38062015443195]
OA-BEV is a network that can be plugged into the BEV-based 3D object detection framework.
Our method achieves consistent improvements over the BEV-based baselines in terms of both average precision and nuScenes detection score.
arXiv Detail & Related papers (2023-01-13T06:02:31Z)
- HyperDet3D: Learning a Scene-conditioned 3D Object Detector [154.84798451437032]
We propose HyperDet3D to explore scene-conditioned prior knowledge for 3D object detection.
Our HyperDet3D achieves state-of-the-art results on the 3D object detection benchmark of the ScanNet and SUN RGB-D datasets.
arXiv Detail & Related papers (2022-04-12T07:57:58Z)
- Stereo Object Matching Network [78.35697025102334]
This paper presents a stereo object matching method that exploits both 2D contextual information from images and 3D object-level information.
We present two novel strategies to handle 3D objectness in the cost volume space: selective sampling (RoISelect) and 2D-3D fusion.
arXiv Detail & Related papers (2021-03-23T12:54:43Z)
- Multi-Task Multi-Sensor Fusion for 3D Object Detection [93.68864606959251]
We present an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion.
Our experiments show that all these tasks are complementary and help the network learn better representations by fusing information at various levels.
arXiv Detail & Related papers (2020-12-22T22:49:15Z)
- Weakly Supervised 3D Object Detection from Point Clouds [27.70180601788613]
3D object detection aims to detect and localize the 3D bounding boxes of objects belonging to specific classes.
Existing 3D object detectors rely on annotated 3D bounding boxes during training, while these annotations could be expensive to obtain and only accessible in limited scenarios.
We propose VS3D, a framework for weakly supervised 3D object detection from point clouds without using any ground truth 3D bounding box for training.
arXiv Detail & Related papers (2020-07-28T03:30:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.