Real-Time Multi-View 3D Human Pose Estimation using Semantic Feedback to Smart Edge Sensors
- URL: http://arxiv.org/abs/2106.14729v1
- Date: Mon, 28 Jun 2021 14:00:00 GMT
- Title: Real-Time Multi-View 3D Human Pose Estimation using Semantic Feedback to Smart Edge Sensors
- Authors: Simon Bultmann and Sven Behnke
- Abstract summary: 2D joint detection for each camera view is performed locally on a dedicated embedded inference processor.
3D poses are recovered from 2D joints on a central backend, based on triangulation and a body model.
The whole pipeline is capable of real-time operation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel method for estimation of 3D human poses from a
multi-camera setup, employing distributed smart edge sensors coupled with a
backend through a semantic feedback loop. 2D joint detection for each camera
view is performed locally on a dedicated embedded inference processor. Only the
semantic skeleton representation is transmitted over the network and raw images
remain on the sensor board. 3D poses are recovered from 2D joints on a central
backend, based on triangulation and a body model which incorporates prior
knowledge of the human skeleton. A feedback channel from backend to individual
sensors is implemented on a semantic level. The allocentric 3D pose is
backprojected into the sensor views where it is fused with 2D joint detections.
The local semantic model on each sensor can thus be improved by incorporating
global context information. The whole pipeline is capable of real-time
operation. We evaluate our method on three public datasets, where we achieve
state-of-the-art results and show the benefits of our feedback architecture,
as well as in our own multi-person experiments. Using the feedback signal
improves the 2D joint detections and, in turn, the estimated 3D poses.
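As a reading aid, here is a minimal numerical sketch of the two geometric steps the abstract describes: recovering a 3D joint from per-view 2D detections by linear triangulation, and backprojecting the allocentric estimate into a sensor view so it can be fused with the local detection. This is not the authors' code: it assumes calibrated pinhole cameras with known 3x4 projection matrices; the names `triangulate_joint`, `backproject`, and `fuse` and the weight `alpha` are made up for illustration; and the paper's body-model prior and heatmap-level semantic fusion are omitted.

```python
import numpy as np

def triangulate_joint(projections, points_2d):
    """Linear (DLT) triangulation of one joint from two or more views.

    projections -- list of 3x4 camera projection matrices, one per view
    points_2d   -- list of (u, v) pixel detections of the same joint
    Returns the joint position in the allocentric (world) frame.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view adds two linear constraints on the homogeneous 3D point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    # Least-squares solution: the right singular vector belonging to the
    # smallest singular value of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]

def backproject(P, X):
    """Project a 3D joint into a sensor view -- the feedback signal."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def fuse(local_uv, feedback_uv, alpha=0.7):
    """Toy stand-in for the per-sensor semantic fusion: blend the local 2D
    detection with the backprojected global estimate."""
    return alpha * np.asarray(local_uv) + (1 - alpha) * np.asarray(feedback_uv)

# Two synthetic views of a joint at (0.5, 0.2, 3.0); the second camera sits
# 1 m along the x-axis (identity intrinsics for brevity).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 3.0])
uv1, uv2 = backproject(P1, X_true), backproject(P2, X_true)
X_hat = triangulate_joint([P1, P2], [uv1, uv2])   # recovers X_true
refined_uv1 = fuse(uv1, backproject(P1, X_hat))   # feedback-refined detection
```

In the paper itself the fusion happens on the semantic level (skeleton/heatmap representations) on each sensor board, so the weighted average above should be read only as a placeholder for that step.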
Related papers
- SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning (arXiv, 2024-07-07)
  We introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning.
  SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for the camera pose estimation task on the KITTI Odometry dataset.
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues (arXiv, 2024-04-23)
  UPose3D is a novel approach for multi-view 3D human pose estimation.
  It improves robustness and flexibility without requiring direct 3D annotations.
- Geometry-Biased Transformer for Robust Multi-View 3D Human Pose Reconstruction (arXiv, 2023-12-28)
  We propose a novel encoder-decoder Transformer architecture to estimate 3D poses from multi-view 2D pose sequences.
  We conduct experiments on three public benchmark datasets: Human3.6M, CMU Panoptic, and Occlusion-Persons.
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning (arXiv, 2022-11-25)
  3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
  An adversarial generator takes the garment warped by the 3D-aware flow and the image of the target person as inputs to synthesize the photo-realistic try-on result.
- 3D Human Pose Estimation in Multi-View Operating Room Videos Using Differentiable Camera Projections (arXiv, 2022-10-21)
  We propose to directly optimise for localisation in 3D by training 2D CNNs end-to-end based on a 3D loss.
  Using videos from the MVOR dataset, we show that this end-to-end approach outperforms optimisation in 2D space.
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps (arXiv, 2022-07-06)
  Knowing the exact 3D location of workers and robots in a collaborative environment enables several real-world applications.
  We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
- 3D Semantic Scene Perception using Distributed Smart Edge Sensors (arXiv, 2022-05-03)
  We present a system for 3D semantic scene perception consisting of a network of distributed smart edge sensors.
  The sensor nodes are based on an embedded CNN inference accelerator and RGB-D and thermal cameras.
  The proposed perception system provides a complete scene view containing semantically annotated 3D geometry and estimates the 3D poses of multiple persons in real time.
- VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild (arXiv, 2021-08-05)
  We present VoxelTrack for multi-person 3D pose estimation and tracking from a few cameras separated by wide baselines.
  It employs a multi-branch network to jointly estimate 3D poses and re-identification (Re-ID) features for all people in the environment.
  It outperforms state-of-the-art methods by a large margin on three public datasets: Shelf, Campus, and CMU Panoptic.
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo (arXiv, 2021-04-06)
  Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
  We present a multi-view 3D pose estimation approach based on plane sweep stereo that jointly addresses cross-view fusion and 3D pose reconstruction in a single shot.
- Learning 3D Human Shape and Pose from Dense Body Parts (arXiv, 2019-12-31)
  We propose a Decompose-and-aggregate Network (DaNet) to learn 3D human shape and pose from dense correspondences of body parts.
  Messages from local streams are aggregated to enhance robust prediction of the rotation-based poses.
  Our method is validated on both indoor and real-world datasets, including Human3.6M, UP3D, COCO, and 3DPW.