Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion
Forecasting with a Single Convolutional Net
- URL: http://arxiv.org/abs/2012.12395v1
- Date: Tue, 22 Dec 2020 22:43:35 GMT
- Title: Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion
Forecasting with a Single Convolutional Net
- Authors: Wenjie Luo, Bin Yang, Raquel Urtasun
- Abstract summary: We propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor.
Our approach performs 3D convolutions across space and time over a bird's eye view representation of the 3D world.
- Score: 93.51773847125014
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper we propose a novel deep neural network that is able to jointly
reason about 3D detection, tracking and motion forecasting given data captured
by a 3D sensor. By jointly reasoning about these tasks, our holistic approach
is more robust to occlusion as well as sparse data at range. Our approach
performs 3D convolutions across space and time over a bird's eye view
representation of the 3D world, which is very efficient in terms of both memory
and computation. Our experiments on a new, very large-scale dataset captured in
several North American cities show that we can outperform the state of the art
by a large margin. Importantly, by sharing computation we can perform all tasks
in as little as 30 ms.
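To make the core idea concrete, here is a minimal sketch of a spatio-temporal convolution over a bird's eye view grid, with one shared trunk feeding separate detection and motion-forecasting heads. This is not the authors' code: the layer sizes, channel counts, and head layout are invented for illustration, and it assumes the LiDAR sweeps have already been voxelized into a (batch, channels, time, height, width) tensor.

```python
import torch
import torch.nn as nn

class SpatioTemporalBEVNet(nn.Module):
    """Minimal sketch of 3D convolution across space and time over a
    bird's eye view (BEV) grid. Shapes and channel counts are
    illustrative, not the paper's actual configuration."""

    def __init__(self, in_channels: int = 1, hidden: int = 32):
        super().__init__()
        # Conv3d treats the temporal axis as the "depth" dimension, so a
        # single kernel mixes information across frames as well as x/y.
        self.trunk = nn.Sequential(
            nn.Conv3d(in_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # One shared trunk feeds separate heads, so detection and motion
        # forecasting reuse the same computation.
        self.det_head = nn.Conv3d(hidden, 1, kernel_size=1)     # objectness per cell
        self.motion_head = nn.Conv3d(hidden, 2, kernel_size=1)  # (dx, dy) per cell

    def forward(self, bev_sequence: torch.Tensor):
        # bev_sequence: (batch, channels, time, height, width)
        feats = self.trunk(bev_sequence)
        return self.det_head(feats), self.motion_head(feats)

# Five past BEV frames of a 128x128 grid, occupancy as one channel.
frames = torch.rand(1, 1, 5, 128, 128)
objectness, motion = SpatioTemporalBEVNet()(frames)
print(objectness.shape, motion.shape)
```

Sharing one trunk across tasks is also what makes the reported latency plausible: the expensive convolutions run once, and each task only adds a cheap 1x1 head.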
Related papers
- DELTA: Dense Efficient Long-range 3D Tracking for any video [82.26753323263009]
We introduce DELTA, a novel method that efficiently tracks every pixel in 3D space, enabling accurate motion estimation across entire videos.
Our approach leverages a joint global-local attention mechanism for reduced-resolution tracking, followed by a transformer-based upsampler to achieve high-resolution predictions.
Our method provides a robust solution for applications requiring fine-grained, long-term motion tracking in 3D space.
arXiv Detail & Related papers (2024-10-31T17:59:01Z)
- Moby: Empowering 2D Models for Efficient Point Cloud Analytics on the Edge [11.588467580653608]
3D object detection plays a pivotal role in many applications, most notably autonomous driving and robotics.
With limited computation power, it is challenging to execute 3D detection on the edge using highly complex neural networks.
Common approaches such as offloading to the cloud induce significant latency overheads due to the large amount of point cloud data during transmission.
We present Moby, a novel system that demonstrates the feasibility and potential of efficient point cloud analytics on the edge.
arXiv Detail & Related papers (2023-02-18T03:42:31Z)
- A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete.
We propose DMT, a Detector-free Motion-prediction-based 3D Tracking network that removes the need for complicated 3D detectors entirely.
arXiv Detail & Related papers (2022-03-08T17:49:07Z)
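The motion-prediction idea in the DMT summary can be shown with a toy example: predict where the target moved, then refine the estimate using only nearby points instead of running a full detector. DMT's prediction network is learned; the constant-velocity extrapolation, refinement radius, and function names below are purely illustrative stand-ins.

```python
import numpy as np

def predict_center(track_centers: np.ndarray) -> np.ndarray:
    """Constant-velocity stand-in for a learned motion predictor:
    extrapolate the next target center from the last two known centers."""
    if len(track_centers) < 2:
        return track_centers[-1]
    velocity = track_centers[-1] - track_centers[-2]
    return track_centers[-1] + velocity

def track_step(track_centers, scan_points, radius=2.0):
    """One detector-free tracking step: predict the target's new
    position, then refine it from the points of the new scan that fall
    within the search radius."""
    guess = predict_center(np.asarray(track_centers))
    dists = np.linalg.norm(scan_points - guess, axis=1)
    nearby = scan_points[dists < radius]
    # Fall back to the raw motion prediction when the scan is too
    # sparse around the predicted location.
    return nearby.mean(axis=0) if len(nearby) else guess

history = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.2, 0.0])]
scan = np.random.rand(500, 3) * 10.0
print(track_step(history, scan))
```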
- From Multi-View to Hollow-3D: Hallucinated Hollow-3D R-CNN for 3D Object Detection [101.20784125067559]
We propose a new architecture, namely Hallucinated Hollow-3D R-CNN, to address the problem of 3D object detection.
In our approach, we first extract multi-view features by sequentially projecting the point clouds into the perspective view and the bird's-eye view.
The 3D objects are detected via a box refinement module with a novel Hierarchical Voxel RoI Pooling operation.
arXiv Detail & Related papers (2021-07-30T02:00:06Z)
- AA3DNet: Attention Augmented Real Time 3D Object Detection [0.0]
We propose a novel neural network architecture along with the training and optimization details for detecting 3D objects using point cloud data.
Our method surpasses the previous state of the art in this domain in terms of both average precision and speed, running at over 30 FPS.
This makes it a feasible option for deployment in real-time applications such as self-driving cars.
arXiv Detail & Related papers (2021-07-26T12:18:23Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
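A core step in any such framework is associating existing tracks with new detections. The sketch below shows a generic tracking-by-association step using the Hungarian algorithm; the paper learns quasi-dense appearance similarity, for which the plain center-distance cost here is only a crude stand-in, and the gating threshold is invented.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_centers: np.ndarray, det_centers: np.ndarray,
              max_dist: float = 3.0):
    """Match tracks to detections by minimizing total matching cost.
    Center distance stands in for a learned appearance affinity."""
    # Pairwise distance matrix: rows are tracks, columns are detections.
    cost = np.linalg.norm(
        track_centers[:, None, :] - det_centers[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    # Discard matches whose cost exceeds the gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

tracks = np.array([[0.0, 0.0, 0.0], [5.0, 5.0, 0.0]])
dets = np.array([[0.4, 0.1, 0.0], [5.2, 4.9, 0.0], [20.0, 0.0, 0.0]])
print(associate(tracks, dets))  # [(0, 0), (1, 1)]
```

The far-away third detection is left unmatched by the gate and would typically spawn a new track.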
- Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution [34.713667358316286]
Self-driving cars need to understand 3D scenes efficiently and accurately in order to drive safely.
Existing 3D perception models are not able to recognize small instances very well due to the low-resolution voxelization and aggressive downsampling.
We propose Sparse Point-Voxel Convolution (SPVConv), a lightweight 3D module that equips the vanilla Sparse Convolution with the high-resolution point-based branch.
arXiv Detail & Related papers (2020-07-31T14:27:27Z)
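To give a rough feel for the point-voxel idea, here is a self-contained toy block: a coarse voxel branch gathers neighborhood context while a per-point branch keeps full resolution, and the two are fused point-wise. A dense Conv3d stands in for the Sparse Convolution the paper actually builds on, and the class name, grid size, and channel counts are invented.

```python
import torch
import torch.nn as nn

class PointVoxelBlock(nn.Module):
    """Conceptual sketch of a point-voxel block: coarse voxel context
    plus a full-resolution per-point branch. Sizes are illustrative."""

    def __init__(self, channels: int = 16, grid: int = 8):
        super().__init__()
        self.grid = grid
        self.voxel_branch = nn.Conv3d(channels, channels, 3, padding=1)
        self.point_branch = nn.Linear(channels, channels)

    def forward(self, coords: torch.Tensor, feats: torch.Tensor):
        # coords: (N, 3) in [0, 1); feats: (N, C)
        n, c = feats.shape
        idx = (coords * self.grid).long().clamp(0, self.grid - 1)
        flat = (idx[:, 0] * self.grid + idx[:, 1]) * self.grid + idx[:, 2]
        # Voxelize: average the features of all points in each cell.
        vox = torch.zeros(self.grid ** 3, c).index_add_(0, flat, feats)
        count = torch.zeros(self.grid ** 3).index_add_(
            0, flat, torch.ones(n)).clamp(min=1).unsqueeze(1)
        vox = (vox / count).t().reshape(1, c, self.grid, self.grid, self.grid)
        vox = self.voxel_branch(vox).reshape(c, -1).t()
        # Fuse: devoxelize coarse context back onto the points and add
        # the fine-grained per-point features.
        return vox[flat] + self.point_branch(feats)

pts = torch.rand(1024, 3)
out = PointVoxelBlock()(pts, torch.rand(1024, 16))
print(out.shape)  # torch.Size([1024, 16])
```

The point branch is what lets small instances survive: they may vanish in the coarse voxel grid, but their per-point features are never downsampled.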
- Generative Sparse Detection Networks for 3D Single-shot Object Detection [43.91336826079574]
3D object detection has been widely studied due to its potential applicability to many promising areas such as robotics and augmented reality.
Yet, the sparse nature of the 3D data poses unique challenges to this task.
We propose Generative Sparse Detection Network (GSDN), a fully-convolutional single-shot sparse detection network.
arXiv Detail & Related papers (2020-06-22T15:54:24Z)
- RUHSNet: 3D Object Detection Using Lidar Data in Real Time [0.0]
We propose a novel neural network architecture for detecting 3D objects in point cloud data.
Our work surpasses the state of the art in this domain in terms of both average precision and speed, running at over 30 FPS.
This makes it a feasible option for deployment in real-time applications, including self-driving cars.
arXiv Detail & Related papers (2020-05-09T09:41:46Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
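The joint detect-and-describe pattern above is simple to sketch: one backbone produces a feature per point, from which both a unit-length descriptor and a keypoint score are read out, and detections are simply the highest-scoring points. The pointwise MLP below is an illustrative stand-in for the paper's fully convolutional backbone, and all names and sizes are invented.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseDetectDescribe(nn.Module):
    """Minimal sketch of joint dense detection and description of 3D
    local features. The MLP backbone is a toy substitute for a 3D
    fully convolutional network."""

    def __init__(self, in_dim: int = 3, feat_dim: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, feat_dim))
        self.score_head = nn.Linear(feat_dim, 1)

    def forward(self, points: torch.Tensor):
        feats = self.backbone(points)            # (N, feat_dim)
        desc = F.normalize(feats, dim=1)         # unit-length descriptors
        score = torch.sigmoid(self.score_head(feats)).squeeze(1)  # (N,)
        return desc, score

cloud = torch.rand(2048, 3)
descriptors, scores = DenseDetectDescribe()(cloud)
# Keep the most distinctive points as detections.
keypoints = cloud[scores.topk(100).indices]
print(descriptors.shape, keypoints.shape)
```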
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of any of the information presented and is not responsible for any consequences arising from its use.