EventHPE: Event-based 3D Human Pose and Shape Estimation
- URL: http://arxiv.org/abs/2108.06819v1
- Date: Sun, 15 Aug 2021 21:40:19 GMT
- Title: EventHPE: Event-based 3D Human Pose and Shape Estimation
- Authors: Shihao Zou, Chuan Guo, Xinxin Zuo, Sen Wang, Pengyu Wang, Xiaoqin Hu,
Shoushun Chen, Minglun Gong, Li Cheng
- Abstract summary: The event camera is an emerging imaging sensor that captures the dynamics of moving objects as events.
We propose a two-stage deep learning approach, called EventHPE.
The first stage, FlowNet, is trained by unsupervised learning to infer optical flow from events.
Both events and the inferred optical flow are fed as input to the second-stage ShapeNet to estimate 3D human shapes.
- Score: 33.197194879047956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The event camera is an emerging imaging sensor that captures the
dynamics of moving objects as events, which motivates our work on estimating
3D human pose and shape from event signals. Events, however, pose unique
challenges: rather than capturing static body postures, event signals are
best at capturing local motions. This leads us to propose a two-stage deep
learning approach, called EventHPE. The first stage, FlowNet, is trained by
unsupervised learning to infer optical flow from events. Both events and
optical flow are closely related to human body dynamics, and both are fed as
input to ShapeNet in the second stage to estimate 3D human shapes. To mitigate
the discrepancy between image-based flow (optical flow) and shape-based flow
(vertex movement of the human body shape), a novel flow coherence loss is
introduced by exploiting the fact that both flows originate from the same
human motion. We also curate an in-house event-based 3D human dataset with 3D
pose and shape annotations, which is to our knowledge by far the largest of
its kind. Empirical evaluations on the DHP19 dataset and our in-house dataset
demonstrate the effectiveness of our approach.
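The flow coherence loss is the abstract's central technical idea: since
image-based optical flow and shape-based vertex flow originate from the same
human motion, the dense flow sampled at the projected vertex locations should
agree in direction with the per-vertex displacement. Below is a minimal
PyTorch sketch of one plausible formulation; the function name, tensor
layouts, and the cosine-similarity form are illustrative assumptions, not
necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def flow_coherence_loss(optical_flow, verts_2d_t0, verts_2d_t1, eps=1e-6):
    """Sketch of a flow coherence loss (assumed formulation).

    optical_flow:  (B, 2, H, W) dense flow field, assumed expressed in
                   the same normalized coordinates as the vertices.
    verts_2d_t0/1: (B, V, 2) projected body-mesh vertices at consecutive
                   time steps, in grid_sample's [-1, 1] convention.
    """
    # Shape-based flow: 2D displacement of each projected vertex.
    shape_flow = verts_2d_t1 - verts_2d_t0              # (B, V, 2)

    # Image-based flow sampled at the vertex locations of the first frame.
    grid = verts_2d_t0.unsqueeze(2)                     # (B, V, 1, 2)
    sampled = F.grid_sample(optical_flow, grid, align_corners=False)
    image_flow = sampled.squeeze(-1).permute(0, 2, 1)   # (B, V, 2)

    # Penalize directional disagreement between the two flows.
    cos = F.cosine_similarity(image_flow, shape_flow, dim=-1, eps=eps)
    return (1.0 - cos).mean()
```

Cosine similarity is scale-invariant, which sidesteps the magnitude mismatch
between a pixel-space flow field and mesh-vertex motion; the paper may handle
this mismatch differently.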
Related papers
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [59.77837807004765]
This paper introduces a new problem: 3D human motion capture from an egocentric monocular event camera with a fisheye lens.
Event streams have high temporal resolution and provide reliable cues for 3D human motion capture under high-speed human motions and rapidly changing illumination.
Our EE3D demonstrates robustness and superior 3D accuracy compared to existing solutions while supporting real-time 3D pose update rates of 140Hz.
arXiv Detail & Related papers (2024-04-12T17:59:47Z)
- 3D Human Scan With A Moving Event Camera [7.734104968315144]
Event cameras have the advantages of high temporal resolution and high dynamic range.
This paper proposes a novel event-based method for 3D pose estimation and human mesh recovery.
arXiv Detail & Related papers (2024-04-12T14:34:24Z)
- Exploring Event-based Human Pose Estimation with 3D Event Representations [26.34100847541989]
We introduce two 3D event representations: the Rasterized Event Point Cloud (Ras EPC) and the Decoupled Event Voxel (DEV).
The Ras EPC aggregates events within concise temporal slices at identical positions, preserving their 3D attributes along with statistical information, thereby significantly reducing memory and computational demands (a hedged sketch follows this entry).
Our methods are tested on the DHP19 public dataset, MMHPSD dataset, and our EV-3DPW dataset, with further qualitative validation via a derived driving scene dataset EV-JAAD and an outdoor collection vehicle.
arXiv Detail & Related papers (2023-11-08T10:45:09Z)
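The Ras EPC described in the entry above is a concrete data structure: events
are binned into a few temporal slices, and events sharing a pixel position
within a slice are merged into one point that keeps 3D attributes plus
statistics. A minimal NumPy sketch under those stated assumptions follows;
the function name and the exact attribute set (mean timestamp, count, summed
polarity) are illustrative, not the paper's design.

```python
import numpy as np

def rasterize_event_point_cloud(events, num_slices=4):
    """Sketch of a Rasterized Event Point Cloud (assumed attributes).

    events: (N, 4) array of (x, y, t, p) with polarity p in {-1, +1}.
    Returns one (M_k, 5) array per slice: (x, y, t_mean, count, p_sum).
    """
    t = events[:, 2]
    # Assign each event to a temporal slice over [t_min, t_max].
    frac = (t - t.min()) / (t.max() - t.min() + 1e-9)
    slice_idx = np.clip((frac * num_slices).astype(int), 0, num_slices - 1)

    slices = []
    for k in range(num_slices):
        ev = events[slice_idx == k]
        if len(ev) == 0:
            slices.append(np.empty((0, 5)))
            continue
        # Merge events that share the same (x, y) within this slice.
        keys, inv = np.unique(ev[:, :2], axis=0, return_inverse=True)
        count = np.bincount(inv).astype(float)
        t_mean = np.bincount(inv, weights=ev[:, 2]) / count
        p_sum = np.bincount(inv, weights=ev[:, 3])
        slices.append(np.column_stack([keys, t_mean, count, p_sum]))
    return slices
```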
- Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer [20.188995900488717]
We present a dedicated end-to-end sparse deep-learning approach for event-based pose tracking.
This is the first time that 3D human pose tracking is obtained from events only.
Our approach also achieves a significant 80% reduction in FLOPs.
arXiv Detail & Related papers (2023-03-16T22:56:12Z)
- Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning [70.75369367311897]
3D-aware global correspondences are reliable flows that jointly encode global semantic correlations, local deformations, and geometric priors of 3D human bodies.
An adversarial generator takes the garment warped by the 3D-aware flow and the image of the target person as inputs to synthesize the photo-realistic try-on result.
arXiv Detail & Related papers (2022-11-25T12:16:21Z)
- NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z)
- LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [78.17425779503047]
We propose a novel neural implicit representation for the human body.
It is fully differentiable and optimizable with disentangled shape and pose latent spaces.
Our model can be trained and fine-tuned directly on non-watertight raw data with well-designed losses.
arXiv Detail & Related papers (2021-11-30T04:10:57Z)
- 3D Human Pose Estimation for Free-form Activity Using WiFi Signals [5.2245900672091]
Winect is a 3D human pose tracking system for free-form activity using commodity WiFi devices.
Our system tracks free-form activity by estimating a 3D skeleton pose that consists of a set of joints of the human body.
arXiv Detail & Related papers (2021-10-15T18:47:16Z)
- Differentiable Event Stream Simulator for Non-Rigid 3D Tracking [82.56690776283428]
Our differentiable simulator enables non-rigid 3D tracking of deformable objects from event streams.
We show the effectiveness of our approach for various types of non-rigid objects and compare it to existing methods for non-rigid 3D tracking.
arXiv Detail & Related papers (2021-04-30T17:58:07Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
arXiv Detail & Related papers (2020-07-22T21:09:11Z)
- Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation [38.57899581285387]
We describe a method for 3D human pose estimation from transient images.
Our method can perceive 3D human pose by 'looking around corners' through the use of light indirectly reflected by the environment.
arXiv Detail & Related papers (2020-03-31T17:57:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all generated summaries) and is not responsible for any consequences of its use.