E$^3$Pose: Energy-Efficient Edge-assisted Multi-camera System for
Multi-human 3D Pose Estimation
- URL: http://arxiv.org/abs/2301.09015v1
- Date: Sat, 21 Jan 2023 21:53:33 GMT
- Title: E$^3$Pose: Energy-Efficient Edge-assisted Multi-camera System for
Multi-human 3D Pose Estimation
- Authors: Letian Zhang, Jie Xu
- Abstract summary: Multi-human 3D pose estimation plays a key role in establishing a seamless connection between the real world and the virtual world.
We propose an energy-efficient edge-assisted multiple-camera system, dubbed E$^3$Pose, for real-time multi-human 3D pose estimation.
Our results show that a significant energy saving (up to 31.21%) can be achieved while maintaining a high 3D pose estimation accuracy comparable to state-of-the-art methods.
- Score: 5.50767672740241
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-human 3D pose estimation plays a key role in establishing a seamless
connection between the real world and the virtual world. Recent efforts have adopted
a two-stage framework that first performs 2D pose estimation in multiple camera
views from different perspectives and then synthesizes the results into 3D poses.
However, the focus has largely been on developing new computer vision
algorithms on offline video datasets, with little consideration of the
energy constraints in real-world systems with flexibly deployed and
battery-powered cameras. In this paper, we propose an energy-efficient
edge-assisted multiple-camera system, dubbed E$^3$Pose, for real-time
multi-human 3D pose estimation, based on the key idea of adaptive camera
selection. Instead of always employing all available cameras to perform 2D pose
estimation, as in existing works, E$^3$Pose adaptively selects only a subset of
cameras depending on their view quality in terms of occlusion and on their
energy states, thereby reducing the energy consumption
(which translates to extended battery lifetime) and improving the estimation
accuracy. To achieve this goal, E$^3$Pose incorporates an attention-based LSTM
that predicts the occlusion in each camera view to guide camera selection
before the images of a scene are processed, and runs a camera selection
algorithm based on the Lyapunov optimization framework to make long-term
adaptive selection decisions. We build a prototype of
E$^3$Pose on a 5-camera testbed, demonstrate its feasibility and evaluate its
performance. Our results show that a significant energy saving (up to 31.21%)
can be achieved while maintaining a high 3D pose estimation accuracy comparable
to state-of-the-art methods.
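To make the adaptive selection idea above concrete, the Python snippet below is a minimal sketch of a drift-plus-penalty selection rule in the spirit of the Lyapunov optimization framework the abstract mentions. It assumes a per-camera view-quality score (e.g., one minus an occlusion score from a predictor such as the attention-based LSTM), a per-frame energy cost, and a virtual energy-deficit queue per camera; the objective, queue update, and parameters (`v`, `budget`, fixed subset size `k`) are illustrative assumptions, not the paper's actual implementation.

```python
# A minimal, hypothetical sketch of drift-plus-penalty camera selection.
# The objective, queue update, and parameter names are illustrative
# assumptions, not the authors' exact formulation.
from itertools import combinations

def select_cameras(view_quality, energy_cost, deficit_q, v=5.0, k=3):
    """Pick k cameras for the current frame.

    view_quality[i] : predicted quality of camera i's view (e.g., 1 minus an
                      occlusion score from an occlusion predictor)
    energy_cost[i]  : estimated energy to run 2D pose estimation on camera i
    deficit_q[i]    : virtual queue tracking how far camera i has exceeded
                      its long-term energy budget
    v               : trade-off weight (larger favors accuracy over energy)
    """
    n = len(view_quality)
    best_subset, best_score = None, float("-inf")
    # Exhaustive search is cheap for a handful of cameras (5 in the testbed).
    for subset in combinations(range(n), k):
        score = sum(v * view_quality[i] - deficit_q[i] * energy_cost[i]
                    for i in subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return list(best_subset)

def update_deficit_queues(deficit_q, energy_cost, selected, budget):
    """Grow a camera's queue when it spends more than its per-frame energy
    budget, shrink it otherwise (never below zero)."""
    selected = set(selected)
    return [max(q + (energy_cost[i] if i in selected else 0.0) - budget[i], 0.0)
            for i, q in enumerate(deficit_q)]

# Example frame: 5 cameras, select 3 (all numbers are made up).
quality = [0.9, 0.4, 0.7, 0.8, 0.3]   # from the occlusion predictor
energy = [1.0, 1.0, 1.2, 0.9, 1.1]    # energy per processed frame
queues = [0.0] * 5
chosen = select_cameras(quality, energy, queues, v=5.0, k=3)
queues = update_deficit_queues(queues, energy, chosen, budget=[0.6] * 5)
```

Under such a rule, the deficit queue of a camera that overspends its energy budget grows and penalizes its future selection, steering the system toward better-charged cameras, while the weight `v` controls how aggressively estimation quality is traded against battery lifetime.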
Related papers
- Self-learning Canonical Space for Multi-view 3D Human Pose Estimation [57.969696744428475]
Multi-view 3D human pose estimation is naturally superior to its single-view counterpart.
However, the accurate annotation of such information is hard to obtain.
We propose a fully self-supervised framework, named cascaded multi-view aggregating network (CMANet).
CMANet is superior to state-of-the-art methods in extensive quantitative and qualitative analysis.
arXiv Detail & Related papers (2024-03-19T04:54:59Z)
- Improving Real-Time Omnidirectional 3D Multi-Person Human Pose Estimation with People Matching and Unsupervised 2D-3D Lifting [3.231937990387248]
Current human pose estimation systems focus on retrieving an accurate 3D global estimate of a single person.
This paper presents one of the first 3D multi-person human pose estimation systems that is able to work in real time.
arXiv Detail & Related papers (2024-03-14T14:30:31Z)
- Scene-Aware 3D Multi-Human Motion Capture from a Single Camera [83.06768487435818]
We consider the problem of estimating the 3D position of multiple humans in a scene as well as their body shape and articulation from a single RGB video recorded with a static camera.
We leverage recent advances in computer vision using large-scale pre-trained models for a variety of modalities, including 2D body joints, joint angles, normalized disparity maps, and human segmentation masks.
In particular, we estimate the scene depth and unique person scale from normalized disparity predictions using the 2D body joints and joint angles.
arXiv Detail & Related papers (2023-01-12T18:01:28Z)
- 3D Human Pose Estimation in Multi-View Operating Room Videos Using Differentiable Camera Projections [2.486571221735935]
We propose to directly optimise for localisation in 3D by training 2D CNNs end-to-end based on a 3D loss.
Using videos from the MVOR dataset, we show that this end-to-end approach outperforms optimisation in 2D space.
arXiv Detail & Related papers (2022-10-21T09:00:02Z)
- Multi-modal 3D Human Pose Estimation with 2D Weak Supervision in Autonomous Driving [74.74519047735916]
3D human pose estimation (HPE) in autonomous vehicles (AV) differs from other use cases in many factors.
Data collected for other use cases (such as virtual reality, gaming, and animation) may not be usable for AV applications.
We propose one of the first approaches to alleviate this problem in the AV setting.
arXiv Detail & Related papers (2021-12-22T18:57:16Z)
- MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision [72.5863451123577]
We show how to train a neural model that can perform accurate 3D pose and camera estimation.
Our method outperforms both classical bundle adjustment and weakly-supervised monocular 3D baselines.
arXiv Detail & Related papers (2021-08-10T18:39:56Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
- Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution [34.301501457959056]
We propose a temporal regression network with a gated convolution module to transform 2D joints to 3D.
A simple yet effective localization approach is also conducted to transform the normalized pose to the global trajectory.
Our proposed method outperforms most state-of-the-art 2D-to-3D pose estimation methods.
arXiv Detail & Related papers (2020-10-31T04:35:24Z)
- VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment [80.77351380961264]
We present an approach to estimate 3D poses of multiple people from multiple camera views.
We present an end-to-end solution which operates in the 3D space and therefore avoids making incorrect decisions in the 2D space.
We propose Pose Regression Network (PRN) to estimate a detailed 3D pose for each proposal.
arXiv Detail & Related papers (2020-04-13T23:50:01Z)
- Synergetic Reconstruction from 2D Pose and 3D Motion for Wide-Space Multi-Person Video Motion Capture in the Wild [3.0015034534260665]
We propose a markerless motion capture method that achieves accuracy and smoothness from multiple cameras.
The proposed method predicts each person's 3D pose and determines the bounding boxes in the multi-camera images.
We evaluated the proposed method using various datasets and a real sports field.
arXiv Detail & Related papers (2020-01-16T02:14:59Z)