Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
- URL: http://arxiv.org/abs/2210.07424v1
- Date: Thu, 13 Oct 2022 23:57:40 GMT
- Title: Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
- Authors: YuXuan Liu, Nikhil Mishra, Maximilian Sieb, Yide Shentu, Pieter
Abbeel, and Xi Chen
- Abstract summary: 3D bounding boxes are a widespread intermediate representation in many computer vision applications.
We propose methods for leveraging our autoregressive model to make high confidence predictions and meaningful uncertainty measures.
We release a simulated dataset, COB-3D, which highlights new types of ambiguity that arise in real-world robotics applications.
- Score: 63.3021778885906
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: 3D bounding boxes are a widespread intermediate representation in many
computer vision applications. However, predicting them is a challenging task,
largely due to partial observability, which motivates the need for a strong
sense of uncertainty. While many recent methods have explored better
architectures for consuming sparse and unstructured point cloud data, we
hypothesize that there is room for improvement in the modeling of the output
distribution and explore how this can be achieved using an autoregressive
prediction head. Additionally, we release a simulated dataset, COB-3D, which
highlights new types of ambiguity that arise in real-world robotics
applications, where 3D bounding box prediction has largely been underexplored.
We propose methods for leveraging our autoregressive model to make high
confidence predictions and meaningful uncertainty measures, achieving strong
results on SUN-RGBD, ScanNet, KITTI, and our new dataset.
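The abstract's core idea, an autoregressive prediction head whose per-step output distributions yield both point predictions and uncertainty, can be illustrated with a minimal sketch. This is not the paper's actual architecture: the stand-in network, the 7-parameter box encoding, the bin count, and greedy decoding are all illustrative assumptions. Each box parameter is predicted as a categorical distribution over discretized bins, conditioned on the parameters already emitted, so confidence (peak probability) and uncertainty (entropy) fall out per step.

```python
import numpy as np

# Illustrative 3D box parameterization: center, size, heading.
PARAM_NAMES = ["x", "y", "z", "w", "l", "h", "yaw"]
N_BINS = 32  # assumed discretization granularity

rng = np.random.default_rng(0)

def step_logits(point_feature, prev_params):
    """Stand-in for a learned network: logits over bins for the next
    parameter, conditioned on scene features and previously emitted
    parameters (the autoregressive factorization)."""
    h = point_feature + sum(prev_params)  # toy conditioning signal
    return rng.normal(loc=h, scale=1.0, size=N_BINS)

def softmax(z):
    z = z - z.max()  # numerically stable
    e = np.exp(z)
    return e / e.sum()

def predict_box(point_feature):
    """Greedy autoregressive decoding with per-step confidence and entropy."""
    params, confidences, entropies = [], [], []
    for _ in PARAM_NAMES:
        p = softmax(step_logits(point_feature, params))
        k = int(p.argmax())                 # greedy bin choice
        params.append(k / (N_BINS - 1))     # map bin index to [0, 1]
        confidences.append(float(p[k]))     # peak probability at this step
        entropies.append(float(-(p * np.log(p)).sum()))  # step uncertainty
    # The joint confidence factorizes as the product of per-step peaks.
    return params, float(np.prod(confidences)), entropies

params, joint_conf, entropies = predict_box(point_feature=0.1)
```

Because the joint distribution factorizes across steps, a single forward pass yields a calibrated-style confidence for the whole box (product of per-step probabilities) and a per-parameter uncertainty profile, which is the kind of signal the abstract refers to as "meaningful uncertainty measures."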
Related papers
- OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction [5.285847977231642]
3D semantic occupancy prediction is crucial for ensuring safety in autonomous driving.
Existing fusion-based occupancy methods typically involve performing a 2D-to-3D view transformation on image features.
We propose OccLoff, a framework that Learns to optimize Feature Fusion for 3D occupancy prediction.
arXiv Detail & Related papers (2024-11-06T06:34:27Z)
- AdaOcc: Adaptive-Resolution Occupancy Prediction [20.0994984349065]
We introduce AdaOcc, a novel adaptive-resolution, multi-modal prediction approach.
Our method integrates object-centric 3D reconstruction and holistic occupancy prediction within a single framework.
In close-range scenarios, we surpass previous baselines by over 13% in IoU and over 40% in Hausdorff distance.
arXiv Detail & Related papers (2024-08-24T03:46:25Z)
- A generic diffusion-based approach for 3D human pose prediction in the wild [68.00961210467479]
3D human pose forecasting, i.e., predicting a sequence of future 3D human poses from a sequence of past observed ones, is a challenging spatio-temporal task.
We provide a unified formulation in which incomplete elements (no matter in the prediction or observation) are treated as noise and propose a conditional diffusion model that denoises them and forecasts plausible poses.
We investigate our findings on four standard datasets and obtain significant improvements over the state-of-the-art.
arXiv Detail & Related papers (2022-06-28T17:59:45Z)
- Pedestrian 3D Bounding Box Prediction [83.7135926821794]
We focus on 3D bounding boxes, which provide reasonable estimates of pedestrians for autonomous vehicles without modeling complex motion details.
We formulate this new problem and present a simple yet effective model for pedestrians' 3D bounding box prediction.
The method follows an encoder-decoder architecture based on recurrent neural networks.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z) - SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving [63.87272273293804]
We propose a new LiDAR prediction framework based on generative models, namely Variational Recurrent Neural Networks (VRNNs).
Our algorithm addresses the limitations of previous video prediction frameworks on sparse data by spatially inpainting the depth maps in the upcoming frames.
We present a sparse version of VRNNs and an effective self-supervised training method that does not require any labels.
arXiv Detail & Related papers (2021-02-19T11:56:44Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.