3D-MAN: 3D Multi-frame Attention Network for Object Detection
- URL: http://arxiv.org/abs/2103.16054v1
- Date: Tue, 30 Mar 2021 03:44:22 GMT
- Title: 3D-MAN: 3D Multi-frame Attention Network for Object Detection
- Authors: Zetong Yang, Yin Zhou, Zhifeng Chen, Jiquan Ngiam
- Abstract summary: 3D-MAN is a 3D multi-frame attention network that effectively aggregates features from multiple perspectives.
We show that 3D-MAN achieves state-of-the-art results compared to published single-frame and multi-frame methods.
- Score: 22.291051951077485
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D object detection is an important module in autonomous driving and
robotics. However, many existing methods focus on using single frames to
perform 3D detection, and do not fully utilize information from multiple
frames. In this paper, we present 3D-MAN: a 3D multi-frame attention network
that effectively aggregates features from multiple perspectives and achieves
state-of-the-art performance on Waymo Open Dataset. 3D-MAN first uses a novel
fast single-frame detector to produce box proposals. The box proposals and
their corresponding feature maps are then stored in a memory bank. We design a
multi-view alignment and aggregation module, using attention networks, to
extract and aggregate the temporal features stored in the memory bank. This
effectively combines the features coming from different perspectives of the
scene. We demonstrate the effectiveness of our approach on the large-scale
complex Waymo Open Dataset, achieving state-of-the-art results compared to
published single-frame and multi-frame methods.
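As a rough illustration of the pipeline the abstract describes, the minimal PyTorch sketch below stores per-frame proposal features in a FIFO memory bank and lets current-frame proposals cross-attend to them. All class names, the feature dimension, the bank size, and the residual fusion are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ProposalMemoryBank:
    """FIFO bank holding per-frame proposal features (hypothetical layout)."""
    def __init__(self, max_frames: int = 16):
        self.max_frames = max_frames
        self.frames = []  # each entry: (N, C) proposal features for one frame

    def push(self, proposal_feats: torch.Tensor) -> None:
        self.frames.append(proposal_feats)
        if len(self.frames) > self.max_frames:
            self.frames.pop(0)  # drop the oldest frame

    def all_features(self) -> torch.Tensor:
        return torch.cat(self.frames, dim=0)  # (sum of N over frames, C)

class MultiViewAlignAggregate(nn.Module):
    """Current-frame proposals cross-attend to features stored in the bank."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cur: torch.Tensor, bank: torch.Tensor) -> torch.Tensor:
        # cur: (N, C) current proposals; bank: (M, C) temporal features.
        q, kv = cur.unsqueeze(0), bank.unsqueeze(0)
        fused, _ = self.attn(q, kv, kv)  # each proposal looks across time
        return self.norm(cur + fused.squeeze(0))  # residual fusion

bank = ProposalMemoryBank(max_frames=16)
agg = MultiViewAlignAggregate(dim=128)
for _ in range(4):                    # pretend four past frames were processed
    bank.push(torch.randn(100, 128))  # 100 proposals, 128-d features each
fused = agg(torch.randn(100, 128), bank.all_features())
print(fused.shape)  # torch.Size([100, 128])
```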
Related papers
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- SCA-PVNet: Self-and-Cross Attention Based Aggregation of Point Cloud and Multi-View for 3D Object Retrieval [8.74845857766369]
Multi-modality 3D object retrieval has rarely been developed and analyzed on large-scale datasets.
We propose self-and-cross attention based aggregation of point cloud and multi-view images (SCA-PVNet) for 3D object retrieval.
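A minimal sketch of the self-and-cross attention aggregation idea named above: each modality first self-attends, then point-cloud tokens query the multi-view tokens. Token shapes, dimensions, and the mean-pooled retrieval descriptor are assumptions for illustration, not SCA-PVNet's actual design.

```python
import torch
import torch.nn as nn

class SelfCrossFusion(nn.Module):
    """Self-attention within each modality, then cross-attention between them."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.self_pc = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.self_mv = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, pc: torch.Tensor, mv: torch.Tensor) -> torch.Tensor:
        # pc: (B, Np, C) point-cloud tokens; mv: (B, Nv, C) multi-view tokens.
        pc = pc + self.self_pc(pc, pc, pc)[0]   # intra-modal self-attention
        mv = mv + self.self_mv(mv, mv, mv)[0]
        fused = pc + self.cross(pc, mv, mv)[0]  # point tokens query view tokens
        return fused.mean(dim=1)                # pooled descriptor for retrieval

fusion = SelfCrossFusion(dim=256)
desc = fusion(torch.randn(2, 1024, 256), torch.randn(2, 12, 256))
print(desc.shape)  # torch.Size([2, 256])
```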
arXiv Detail & Related papers (2023-07-20T05:46:32Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
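A hedged sketch of the two ingredients mentioned above: a current-frame tracklet feature attending over its per-tracklet memory bank, and an InfoNCE-style contrastive loss standing in for the paper's sequence enhancement strategy. Shapes and the exact loss form are assumptions.

```python
import torch
import torch.nn.functional as F

def memory_interaction(cur: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
    """Current-frame tracklet feature attends over its stored history."""
    # cur: (1, C); memory: (T, C) per-frame features kept in the bank.
    scale = cur.shape[-1] ** 0.5
    weights = F.softmax(cur @ memory.t() / scale, dim=-1)  # (1, T)
    return cur + weights @ memory                          # residual update

def sequence_contrastive_loss(anchor, positive, negatives, tau: float = 0.07):
    """InfoNCE-style loss: same-tracklet features attract, others repel."""
    sim_pos = F.cosine_similarity(anchor, positive, dim=-1) / tau   # scalar
    sim_neg = F.cosine_similarity(anchor, negatives, dim=-1) / tau  # (K,)
    logits = torch.cat([sim_pos.view(1), sim_neg])
    return F.cross_entropy(logits.unsqueeze(0), torch.zeros(1, dtype=torch.long))

cur, memory = torch.randn(1, 128), torch.randn(8, 128)
updated = memory_interaction(cur, memory)
loss = sequence_contrastive_loss(updated.squeeze(0),
                                 torch.randn(128), torch.randn(5, 128))
print(updated.shape, loss.item())
```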
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention [50.11672196146829]
3D object detection with surround-view images is an essential task for autonomous driving.
We propose DETR4D, a Transformer-based framework that explores sparse attention and direct feature query for 3D object detection in multi-view images.
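A minimal query-based sketch of direct feature querying: learnable object queries cross-attend to flattened multi-view image tokens and regress box parameters. Dense attention here stands in for the paper's sparse attention; the query count, dimensions, and 7-parameter box head are assumptions.

```python
import torch
import torch.nn as nn

class QueryCrossView(nn.Module):
    """Learnable 3D object queries attend directly to multi-view image tokens."""
    def __init__(self, num_queries: int = 900, dim: int = 256, heads: int = 8):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.box_head = nn.Linear(dim, 7)  # (x, y, z, l, w, h, yaw)

    def forward(self, view_tokens: torch.Tensor) -> torch.Tensor:
        # view_tokens: (B, V*H*W, C) flattened features from all cameras.
        B = view_tokens.shape[0]
        q = self.queries.weight.unsqueeze(0).repeat(B, 1, 1)
        q = q + self.cross(q, view_tokens, view_tokens)[0]
        return self.box_head(q)            # (B, num_queries, 7) box params

model = QueryCrossView()
boxes = model(torch.randn(2, 6 * 15 * 25, 256))  # 6 cameras, 15x25 feature maps
print(boxes.shape)  # torch.Size([2, 900, 7])
```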
arXiv Detail & Related papers (2022-12-15T14:18:47Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- SRCN3D: Sparse R-CNN 3D for Compact Convolutional Multi-View 3D Object Detection and Tracking [12.285423418301683]
This paper proposes Sparse R-CNN 3D (SRCN3D), a novel two-stage fully-sparse detector that incorporates sparse queries, sparse attention with box-wise sampling, and sparse prediction.
Experiments on nuScenes dataset demonstrate that SRCN3D achieves competitive performance in both 3D object detection and multi-object tracking tasks.
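A small sketch of box-wise feature sampling, one ingredient named above: each query box samples a feature vector at its projected image location via grid_sample. Sampling a single point per box and the normalized-coordinate convention are simplifying assumptions, not SRCN3D's exact scheme.

```python
import torch
import torch.nn.functional as F

def boxwise_sample(feat: torch.Tensor, centers_uv: torch.Tensor) -> torch.Tensor:
    """Sample one feature vector per box at its projected image location."""
    # feat: (1, C, H, W) camera feature map; centers_uv: (N, 2) in [-1, 1].
    grid = centers_uv.view(1, -1, 1, 2)            # (1, N, 1, 2) sampling grid
    sampled = F.grid_sample(feat, grid, align_corners=False)  # (1, C, N, 1)
    return sampled.squeeze(-1).squeeze(0).t()      # (N, C) per-box features

feat = torch.randn(1, 256, 32, 88)
centers = torch.rand(10, 2) * 2 - 1   # 10 box centers in normalized coords
print(boxwise_sample(feat, centers).shape)  # torch.Size([10, 256])
```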
arXiv Detail & Related papers (2022-06-29T07:58:39Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation [8.854112907350624]
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
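The summary names RelationConv without detailing it, so the sketch below shows only a generic pairwise-affinity head for cross-frame association (concatenate the features of every detection pair, score them with an MLP); it is not the actual RelationConv operation.

```python
import torch
import torch.nn as nn

class PairwiseAffinity(nn.Module):
    """Affinity matrix between detections in two adjacent frames."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1))

    def forward(self, feats_t: torch.Tensor, feats_t1: torch.Tensor):
        # feats_t: (N, C) frame-t detections; feats_t1: (M, C) frame t+1.
        N, M = feats_t.shape[0], feats_t1.shape[0]
        pairs = torch.cat([feats_t.unsqueeze(1).expand(N, M, -1),
                           feats_t1.unsqueeze(0).expand(N, M, -1)], dim=-1)
        return self.mlp(pairs).squeeze(-1)  # (N, M) association scores

aff = PairwiseAffinity()(torch.randn(5, 128), torch.randn(7, 128))
print(aff.shape)  # torch.Size([5, 7])
```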
arXiv Detail & Related papers (2020-11-25T16:14:40Z)
- An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds [16.658604637005535]
We propose a sparse LSTM-based multi-frame 3D object detection algorithm.
We use a U-Net style 3D sparse convolution network to extract features for each frame's LiDAR point-cloud.
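A compact sketch of the temporal design described above: per-frame features (a linear layer stands in here for the U-Net style sparse-conv backbone) are pooled and fused across frames by an LSTM. The dimensions, max-pooling, and box head are assumptions for illustration.

```python
import torch
import torch.nn as nn

class TemporalLSTMDetector(nn.Module):
    """Per-frame backbone features fused over time with an LSTM."""
    def __init__(self, feat_dim: int = 128, hidden: int = 128):
        super().__init__()
        # Stand-in for the U-Net style 3D sparse convolution backbone.
        self.backbone = nn.Linear(4, feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 7)  # box parameters per sequence

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, P, 4) T point clouds of P points (x, y, z, intensity).
        per_frame = self.backbone(frames).max(dim=2).values  # (B, T, C) pooled
        temporal, _ = self.lstm(per_frame)     # carry state across frames
        return self.head(temporal[:, -1])      # predict from the last step

model = TemporalLSTMDetector()
out = model(torch.randn(2, 5, 1024, 4))  # batch of two 5-frame sequences
print(out.shape)  # torch.Size([2, 7])
```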
arXiv Detail & Related papers (2020-07-24T07:34:15Z)