3D Human Scan With A Moving Event Camera
- URL: http://arxiv.org/abs/2404.08504v2
- Date: Tue, 16 Apr 2024 10:18:56 GMT
- Title: 3D Human Scan With A Moving Event Camera
- Authors: Kai Kohyama, Shintaro Shiba, Yoshimitsu Aoki
- Abstract summary: Event cameras have the advantages of high temporal resolution and high dynamic range.
This paper proposes a novel event-based method for 3D pose estimation and human mesh recovery.
- Score: 7.734104968315144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Capturing a 3D human body is one of the important tasks in computer vision, with a wide range of applications such as virtual reality and sports analysis. However, conventional frame cameras are limited by their temporal resolution and dynamic range, which imposes constraints in real-world application setups. Event cameras have the advantages of high temporal resolution and high dynamic range (HDR), but event-based methods must be developed to handle data with different characteristics. This paper proposes a novel event-based method for 3D pose estimation and human mesh recovery. Prior work on event-based human mesh recovery requires frames (images) as well as event data. The proposed method relies solely on events: it carves 3D voxels by moving the event camera around a stationary body, reconstructs the human pose and mesh via attenuated rays, and fits statistical body models, preserving high-frequency details. The experimental results show that the proposed method outperforms conventional frame-based methods in the estimation accuracy of both pose and body mesh. We also demonstrate results in challenging situations where a conventional camera suffers from motion blur. This is the first work to demonstrate event-only human mesh recovery, and we hope it is a first step toward robust and accurate 3D human body scanning from vision sensors. https://florpeng.github.io/event-based-human-scan/
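To make the carving step concrete, here is a minimal sketch of depositing attenuated evidence along event rays in a voxel grid, assuming known camera intrinsics and a per-event camera pose. The grid size, attenuation rule, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

GRID = 64                                  # voxels per side (assumed resolution)
EXTENT = 1.0                               # grid spans a 2 m cube around the body (assumed)
VOXEL = 2 * EXTENT / GRID
evidence = np.zeros((GRID, GRID, GRID))    # per-voxel evidence accumulated from event rays

def splat_event_ray(x, y, K_inv, R, t, decay=0.98, max_depth=2.0):
    """Deposit attenuated evidence along the back-projected ray of one event.

    (x, y): event pixel; K_inv: inverse camera intrinsics;
    (R, t): world-from-camera rotation/translation at the event's timestamp.
    """
    d_cam = K_inv @ np.array([x, y, 1.0])              # ray direction in camera frame
    d_world = R @ (d_cam / np.linalg.norm(d_cam))      # rotate into world frame
    weight = 1.0
    for depth in np.arange(0.0, max_depth, VOXEL):     # march along the ray
        p = t + depth * d_world                        # sample point in world coords
        idx = np.floor((p + EXTENT) / VOXEL).astype(int)
        if np.all((idx >= 0) & (idx < GRID)):
            evidence[tuple(idx)] += weight
        weight *= decay                                # attenuate with distance

# Example: one event at pixel (120, 80), camera 1.5 m in front of the grid.
K_inv = np.linalg.inv(np.array([[200.0, 0.0, 160.0],
                                [0.0, 200.0, 120.0],
                                [0.0, 0.0, 1.0]]))
splat_event_ray(120, 80, K_inv, np.eye(3), np.array([0.0, 0.0, -1.5]))
shape = evidence > 0.5 * evidence.max()                # crude shape estimate
```

Because events fire mainly at intensity edges, the accumulated evidence concentrates near the body's silhouette as the camera circles it; thresholding the grid gives a rough shape to which a statistical body model can then be fitted.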
Related papers
- EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams [59.77837807004765]
This paper introduces a new problem, i.e., 3D human motion capture from an egocentric monocular event camera with a fisheye lens.
Event streams have high temporal resolution and provide reliable cues for 3D human motion capture under high-speed human motions and rapidly changing illumination.
Our EE3D demonstrates robustness and superior 3D accuracy compared to existing solutions while supporting real-time 3D pose update rates of 140Hz.
arXiv Detail & Related papers (2024-04-12T17:59:47Z)
- Event-based tracking of human hands [0.6875312133832077]
An event camera detects changes in brightness, effectively measuring motion, with low latency, no motion blur, low power consumption, and high dynamic range.
The captured frames are analysed using lightweight algorithms that report 3D hand position data.
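As an illustration of this accumulate-then-analyse pattern (a generic sketch, not the paper's pipeline), the following bins an event stream into a count frame and extracts an event-weighted activity centroid as a crude 2D position proxy; the (t, x, y) layout, the DAVIS-style 346x260 resolution, and the window length are assumptions.

```python
import numpy as np

def events_to_frame(t, x, y, width=346, height=260, window_us=10_000):
    """Bin events from the most recent time window into a per-pixel count frame."""
    recent = t >= (t.max() - window_us)                # keep only the latest window
    frame = np.zeros((height, width), dtype=np.int32)
    np.add.at(frame, (y[recent], x[recent]), 1)        # count events per pixel
    return frame

def activity_centroid(frame):
    """Event-weighted centroid: a crude 2D proxy for the tracked hand position."""
    ys, xs = np.nonzero(frame)
    if xs.size == 0:
        return None                                    # no activity in this window
    w = frame[ys, xs]
    return float(np.average(xs, weights=w)), float(np.average(ys, weights=w))

# Example with a synthetic burst of events around pixel (100, 50).
rng = np.random.default_rng(0)
t = np.sort(rng.integers(0, 20_000, 500))
x = np.clip(rng.normal(100, 3, 500), 0, 345).astype(int)
y = np.clip(rng.normal(50, 3, 500), 0, 259).astype(int)
print(activity_centroid(events_to_frame(t, x, y)))
```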
arXiv Detail & Related papers (2023-04-13T13:43:45Z)
- Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer [20.188995900488717]
We present a dedicated end-to-end sparse deep approach for event-based pose tracking.
This is the first time that 3D human pose tracking is obtained from events only.
Our approach also achieves a significant reduction of 80% in FLOPS.
arXiv Detail & Related papers (2023-03-16T22:56:12Z)
- Human Performance Capture from Monocular Video in the Wild [50.34917313325813]
We propose a method capable of capturing the dynamic 3D human shape from a monocular video featuring challenging body poses.
Our method outperforms state-of-the-art methods on 3DPW, an in-the-wild human video dataset.
arXiv Detail & Related papers (2021-11-29T16:32:41Z)
- EventHPE: Event-based 3D Human Pose and Shape Estimation [33.197194879047956]
The event camera is an emerging imaging sensor that captures the dynamics of moving objects as events.
We propose a two-stage deep learning approach, called EventHPE.
The first stage, FlowNet, is trained by unsupervised learning to infer optical flow from events.
The inferred flow is then fed as input to the second-stage network, ShapeNet, to estimate 3D human shapes.
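A hedged sketch of this two-stage wiring is below; the toy networks, channel counts, and the 82-dimensional SMPL-like parameter vector are placeholder assumptions standing in for the paper's FlowNet and ShapeNet.

```python
import torch
import torch.nn as nn

class TinyFlowNet(nn.Module):
    """Stage 1: predict a dense 2-channel (u, v) flow map from an event frame."""
    def __init__(self, in_ch=2):                       # 2 channels: +/- event polarity
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 3, padding=1),            # per-pixel optical flow
        )
    def forward(self, ev):
        return self.net(ev)

class TinyShapeNet(nn.Module):
    """Stage 2: regress body-model parameters from events plus predicted flow."""
    def __init__(self, in_ch=4, n_params=82):          # SMPL-like pose+shape vector (assumed)
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, n_params),
        )
    def forward(self, ev, flow):
        return self.encoder(torch.cat([ev, flow], dim=1))

flow_net, shape_net = TinyFlowNet(), TinyShapeNet()
ev = torch.randn(1, 2, 64, 64)                         # toy event frame
params = shape_net(ev, flow_net(ev))                   # stage-1 flow feeds stage 2
```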
arXiv Detail & Related papers (2021-08-15T21:40:19Z)
- Lifting Monocular Events to 3D Human Poses [22.699272716854967]
This paper presents a novel 3D human pose estimation approach using a single stream of asynchronous events as input.
We propose the first learning-based method for 3D human pose from a single stream of events.
Experiments demonstrate that our method achieves solid accuracy, narrowing the performance gap between standard RGB and event-based vision.
arXiv Detail & Related papers (2021-04-21T16:07:12Z)
- Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [71.29186299435423]
We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment.
We show that our optimization-based integration exploits the benefits of both modalities, body-mounted sensing and camera-based self-localization, resulting in pose accuracy free of drift.
HPS could be used for VR/AR applications where humans interact with the scene without requiring direct line of sight with an external camera.
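The flavour of such drift-free, optimization-based fusion can be shown with a toy 1D least-squares example that combines drifting relative steps (IMU-like) with sparse absolute localizations (camera-like); this illustrates drift correction in general, not the HPS formulation, and all numbers are invented.

```python
import numpy as np

N = 100
true_pos = np.linspace(0.0, 10.0, N)                   # ground-truth 1D trajectory
imu_steps = np.diff(true_pos) + 0.02                   # biased relative steps -> drift
cam_fixes = {0: true_pos[0], 50: true_pos[50], 99: true_pos[99]}  # sparse absolute fixes

rows, rhs = [], []
for i, step in enumerate(imu_steps):                   # constraint: x[i+1] - x[i] = step
    r = np.zeros(N); r[i + 1], r[i] = 1.0, -1.0
    rows.append(r); rhs.append(step)
for j, c in cam_fixes.items():                         # constraint: x[j] = c (strong weight)
    r = np.zeros(N); r[j] = 10.0
    rows.append(r); rhs.append(10.0 * c)

x = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
print("dead-reckoning drift:", abs(np.cumsum(np.insert(imu_steps, 0, 0.0))[-1] - true_pos[-1]))
print("fused max error:", np.abs(x - true_pos).max())
```

Dead reckoning alone accumulates the step bias into metres of drift, while the handful of absolute fixes anchors the whole trajectory.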
arXiv Detail & Related papers (2021-03-31T17:58:31Z)
- EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes.
Our approach has characteristics previously not demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z)
- PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time [89.68248627276955]
Marker-less 3D motion capture from a single colour camera has seen significant progress.
However, it is a very challenging and severely ill-posed problem.
We present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture.
arXiv Detail & Related papers (2020-08-20T10:46:32Z)
- Contact and Human Dynamics from Monocular Video [73.47466545178396]
Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors.
We present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input.
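As a minimal illustration of the kind of constraint such a physics-based stage enforces (a generic contact clean-up sketch, not the paper's optimizer; the floor height, contact threshold, and y-up convention are assumptions):

```python
import numpy as np

FLOOR_Y = 0.0                                          # floor height, y-up (assumed)
CONTACT_EPS = 0.02                                     # contact height threshold (assumed)

def enforce_contacts(foot_xyz):
    """foot_xyz: (T, 3) estimated foot positions; returns a corrected copy."""
    out = foot_xyz.copy()
    out[:, 1] = np.maximum(out[:, 1], FLOOR_Y)         # remove floor penetration
    in_contact = out[:, 1] < FLOOR_Y + CONTACT_EPS     # frames likely in contact
    for t in range(1, len(out)):
        if in_contact[t] and in_contact[t - 1]:
            out[t, [0, 2]] = out[t - 1, [0, 2]]        # pin x/z: suppress foot sliding
    return out

# Example: a noisy random-walk foot trajectory.
traj = np.cumsum(np.random.default_rng(1).normal(0.0, 0.01, (50, 3)), axis=0)
fixed = enforce_contacts(traj)
```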
arXiv Detail & Related papers (2020-07-22T21:09:11Z)