Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from
Depth Maps
- URL: http://arxiv.org/abs/2207.02519v1
- Date: Wed, 6 Jul 2022 08:52:12 GMT
- Authors: Alessandro Simoni, Stefano Pini, Guido Borghi, Roberto Vezzani
- Abstract summary: Knowing the exact 3D location of workers and robots in a collaborative environment enables several real applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
- Score: 66.24554680709417
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Knowing the exact 3D location of workers and robots in a collaborative
environment enables several real applications, such as the detection of unsafe
situations or the study of mutual interactions for statistical and social
purposes. In this paper, we propose a non-invasive and light-invariant
framework based on depth devices and deep neural networks to estimate the 3D
pose of robots from an external camera. The method can be applied to any robot
without requiring hardware access to the internal states. We introduce a novel
representation of the predicted pose, namely Semi-Perspective Decoupled
Heatmaps (SPDH), to accurately compute 3D joint locations in world coordinates
by adapting efficient deep networks designed for 2D Human Pose Estimation. The
proposed approach, which takes as input a depth representation based on XYZ
coordinates, can be trained on synthetic depth data and applied to real-world
settings without the need for domain adaptation techniques. To this end, we
present the SimBa dataset, based on both synthetic and real depth images, and
use it for the experimental evaluation. Results show that the proposed
approach, which combines the XYZ depth representation with SPDH, outperforms
the current state of the art.
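As a rough illustration of the two ingredients named in the abstract, the sketch below converts a depth map into a 3-channel XYZ input and decodes one joint from a pair of decoupled heatmaps, one over image coordinates (u, v) and one over (u, z). This is not the authors' code: the camera intrinsics, depth range, and decoding details are illustrative assumptions.

```python
# Illustrative sketch only: the intrinsics, depth range, and the exact
# decoupled-heatmap decoding are assumptions, not the paper's code.
import numpy as np

FX, FY, CX, CY = 525.0, 525.0, 320.0, 240.0  # assumed camera intrinsics
Z_MIN, Z_MAX = 0.5, 5.0                      # assumed depth range in meters

def depth_to_xyz(depth):
    """Turn a metric depth map into a 3-channel XYZ image, the kind of
    coordinate-based input representation the abstract describes."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    return np.stack([x, y, depth], axis=0)  # shape (3, h, w)

def decode_joint(uv_heatmap, uz_heatmap):
    """Fuse a (v, u) image-space heatmap and a (z-bin, u) depth-space
    heatmap for one joint into a 3D point in camera coordinates."""
    v, u = np.unravel_index(np.argmax(uv_heatmap), uv_heatmap.shape)
    z_bin, _ = np.unravel_index(np.argmax(uz_heatmap), uz_heatmap.shape)
    z = Z_MIN + (Z_MAX - Z_MIN) * z_bin / (uz_heatmap.shape[0] - 1)
    # Back-project the pixel (u, v) at depth z through the pinhole model.
    return np.array([(u - CX) * z / FX, (v - CY) * z / FY, z])
```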
Related papers
- Calibrating Panoramic Depth Estimation for Practical Localization and
Mapping [20.621442016969976]
The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation.
We propose that accurate depth estimated from panoramic images can serve as a powerful and light-weight input for a wide range of downstream tasks requiring 3D information.
arXiv Detail & Related papers (2023-08-27T04:50:05Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns memory-efficient dense 3D geometry and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Scene-aware Egocentric 3D Human Pose Estimation [72.57527706631964]
Egocentric 3D human pose estimation with a single head-mounted fisheye camera has recently attracted attention due to its numerous applications in virtual and augmented reality.
Existing methods still struggle in challenging poses where the human body is highly occluded or is closely interacting with the scene.
We propose a scene-aware egocentric pose estimation method that guides the prediction of the egocentric pose with scene constraints.
arXiv Detail & Related papers (2022-12-20T21:35:39Z)
- iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [64.80458128766254]
iSDF is a continual learning system for real-time signed distance field reconstruction.
It produces more accurate reconstructions and better approximations of collision costs and gradients.
arXiv Detail & Related papers (2022-04-05T15:48:39Z)
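As a toy illustration of how a signed distance field supplies the collision costs and gradients mentioned in the entry above, the sketch below queries a dense SDF voxel grid with finite differences; iSDF itself learns a continuous neural field, and the grid, margin, and names here are assumptions.

```python
# Toy sketch: a dense voxel-grid SDF stands in for iSDF's learned neural
# field; grid resolution, margin, and names are illustrative assumptions.
import numpy as np

def collision_cost(sdf, idx, voxel_size, margin=0.2):
    """Hinge cost that penalizes a query point (given as an interior grid
    index) when it is within `margin` meters of the nearest surface."""
    i, j, k = idx
    d = sdf[i, j, k]            # signed distance at the query point
    cost = max(margin - d, 0.0)
    # Central differences approximate the SDF gradient, which points away
    # from the nearest surface and can push a planner toward free space.
    grad = np.array([
        sdf[i + 1, j, k] - sdf[i - 1, j, k],
        sdf[i, j + 1, k] - sdf[i, j - 1, k],
        sdf[i, j, k + 1] - sdf[i, j, k - 1],
    ]) / (2.0 * voxel_size)
    return cost, grad
```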
- Sparse Depth Completion with Semantic Mesh Deformation Optimization [4.03103540543081]
We propose a neural network with post-optimization, which takes an RGB image and sparse depth samples as input and predicts the complete depth map.
Our method consistently outperforms existing work on both indoor and outdoor datasets.
arXiv Detail & Related papers (2021-12-10T13:01:06Z)
- Real-Time Multi-View 3D Human Pose Estimation using Semantic Feedback to Smart Edge Sensors [28.502280038100167]
2D joint detection for each camera view is performed locally on a dedicated embedded inference processor.
3D poses are recovered from 2D joints on a central backend, based on triangulation and a body model.
The whole pipeline is capable of real-time operation.
arXiv Detail & Related papers (2021-06-28T14:00:00Z)
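The triangulation step in the entry above can be sketched with the standard Direct Linear Transform; the body-model fitting is omitted, the calibrated projection matrices are assumed to be given, and all names here are illustrative.

```python
# Minimal DLT triangulation sketch; the paper's backend additionally
# fits a body model, which is not reproduced here.
import numpy as np

def triangulate_joint(projections, pixels):
    """Recover one 3D joint from its 2D detections in calibrated views.

    projections: list of 3x4 camera matrices P = K [R | t]
    pixels:      list of (u, v) detections, one per view
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view adds two linear constraints on the homogeneous point X:
        # u * (P[2] @ X) = P[0] @ X and v * (P[2] @ X) = P[1] @ X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The homogeneous least-squares solution of A X = 0 is the right
    # singular vector associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize to metric coordinates
```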
- On the role of depth predictions for 3D human pose estimation [0.04199844472131921]
We build a system that takes 2D joint locations, along with their estimated depth values, as input and predicts their 3D positions in camera coordinates.
The predictions are produced by a neural network that accepts a low-dimensional input and can be integrated into a real-time system.
Our system can be combined with an off-the-shelf 2D pose detector and a depth map predictor to perform 3D pose estimation in the wild.
arXiv Detail & Related papers (2021-03-03T16:51:38Z)
- Ground-aware Monocular 3D Object Detection for Autonomous Driving [6.5702792909006735]
Estimating the 3D position and orientation of objects in the environment with a single RGB camera is a challenging task for low-cost urban autonomous driving and mobile robots.
Most existing algorithms are based on geometric constraints from 2D-3D correspondences, which stem from generic 6D object pose estimation.
We introduce a novel neural network module to fully utilize such application-specific priors in the framework of deep learning.
arXiv Detail & Related papers (2021-02-01T08:18:24Z)
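One concrete example of the application-specific priors mentioned in the entry above is the ground-plane constraint: with a known camera height, the depth of a ground-contact pixel follows from similar triangles. The intrinsics and camera height below are illustrative assumptions, not values from the paper.

```python
# Toy ground-plane depth prior; intrinsics and camera height are assumed.
FY, CY = 721.5, 172.8   # assumed vertical focal length / principal point
CAMERA_HEIGHT = 1.65    # assumed camera height above the ground, meters

def ground_contact_depth(v_bottom):
    """Depth of a ground-plane pixel from its image row, for a level
    camera: the point projects to v = cy + fy * h / Z, so
    Z = fy * h / (v - cy). Valid only below the horizon (v > cy)."""
    return FY * CAMERA_HEIGHT / (v_bottom - CY)
```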
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.