Rethinking Event-based Human Pose Estimation with 3D Event
Representations
- URL: http://arxiv.org/abs/2311.04591v3
- Date: Fri, 1 Dec 2023 07:26:35 GMT
- Title: Rethinking Event-based Human Pose Estimation with 3D Event
Representations
- Authors: Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni,
Kailun Yang, Kaiwei Wang
- Abstract summary: Event cameras offer a robust solution for navigating challenging contexts.
We introduce two 3D event representations: the Rasterized Event Point Cloud and the Decoupled Event Voxel.
Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques.
- Score: 26.592295349210787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose estimation is a fundamental and appealing task in computer vision.
Traditional frame-based cameras and videos are commonly applied, yet they
become less reliable in scenarios with high dynamic range or heavy motion
blur. In contrast, event cameras offer a robust solution for navigating these
challenging contexts. Predominant methodologies incorporate event cameras into
learning frameworks by accumulating events into event frames. However, such
methods tend to marginalize the intrinsic asynchronous and high temporal
resolution characteristics of events. This disregard leads to a loss in
essential temporal dimension data, crucial for discerning distinct actions. To
address this issue and to unlock the 3D potential of event information, we
introduce two 3D event representations: the Rasterized Event Point Cloud
(RasEPC) and the Decoupled Event Voxel (DEV). The RasEPC collates events within
concise temporal slices at identical positions, preserving 3D attributes with
statistical cues and markedly mitigating memory and computational demands.
Meanwhile, the DEV representation discretizes events into voxels and projects
them across three orthogonal planes, utilizing decoupled event attention to
retrieve 3D cues from the 2D planes. Furthermore, we develop and release
EV-3DPW, a synthetic event-based dataset crafted to facilitate training and
quantitative analysis in outdoor scenes. On the public real-world DHP19
dataset, our event point cloud technique excels in real-time mobile
predictions, while the decoupled event voxel method achieves the highest
accuracy. Experiments on EV-3DPW demonstrate the robustness of our proposed
3D representation methods compared to traditional RGB images and event frame
techniques under the same backbones. Our code and dataset have been made
publicly available at https://github.com/MasterHow/EventPointPose.
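The two 3D representations described in the abstract can be sketched as follows. This is a hypothetical minimal re-implementation based only on the description above: the function names, slice/voxel parameters, and the particular statistical cues (event count, mean timestamp, net polarity) are assumptions, and the released code at the repository linked above may differ.

```python
import numpy as np

def rasterized_event_point_cloud(events, num_slices, H, W):
    """RasEPC-style sketch: merge events that share a pixel location within
    each temporal slice into one point carrying statistical cues."""
    x, y, t, p = events.T  # columns: x, y, timestamp, polarity in {-1, +1}
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    slice_idx = np.minimum((t_norm * num_slices).astype(int), num_slices - 1)
    # Encode (slice, y, x) as a single integer key for grouping.
    key = slice_idx * H * W + y.astype(int) * W + x.astype(int)
    uniq, inv = np.unique(key, return_inverse=True)
    count = np.bincount(inv)                          # events per point
    mean_t = np.bincount(inv, weights=t_norm) / count # mean timestamp
    net_p = np.bincount(inv, weights=p)               # net polarity
    xs = uniq % W
    ys = (uniq // W) % H
    return np.stack([xs, ys, mean_t, count, net_p], axis=1)

def decoupled_event_voxel(events, T, H, W):
    """DEV-style sketch: accumulate events into a (T, H, W) voxel grid,
    then project it onto the three orthogonal planes."""
    x, y, t, p = events.T
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    ti = np.minimum((t_norm * T).astype(int), T - 1)
    vox = np.zeros((T, H, W))
    np.add.at(vox, (ti, y.astype(int), x.astype(int)), p)
    xy = vox.sum(axis=0)  # spatial (x-y) plane
    tx = vox.sum(axis=1)  # temporal-x plane
    ty = vox.sum(axis=2)  # temporal-y plane
    return xy, tx, ty
```

In this sketch the memory saving of RasEPC comes from the grouping step: many raw events collapse into a single point per (slice, pixel) cell, while the decoupled voxel keeps only three 2D projections instead of the full 3D grid.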
Related papers
- EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous
Visual Hulls [46.94040300725127]
3D reconstruction from multiple views is a well-established computer vision field with numerous practical applications.
We study the problem of 3D reconstruction from event-cameras, motivated by the advantages of event-based cameras in terms of low power and latency.
We propose Apparent Contour Events (ACE), a novel event-based representation that defines the geometry of the apparent contour of an object.
arXiv Detail & Related papers (2023-04-11T15:46:16Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness changes at every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
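The pillar construction summarized above can be sketched as follows; this is a hypothetical illustration of an x-y-t grid with separate polarity channels, not the paper's actual tensor layout:

```python
import numpy as np

def event_pillars(events, T, H, W):
    """Sketch: accumulate positive and negative events into separate
    channels of a (2, T, H, W) spatio-temporal grid of pillars."""
    x, y, t, p = events.T  # columns: x, y, timestamp, polarity in {-1, +1}
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    ti = np.minimum((t_norm * T).astype(int), T - 1)
    grid = np.zeros((2, T, H, W))
    ch = (p > 0).astype(int)  # channel 0: negative, channel 1: positive
    np.add.at(grid, (ch, ti, y.astype(int), x.astype(int)), 1)
    return grid
```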
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer [20.188995900488717]
We present a dedicated end-to-end sparse deep approach for event-based pose tracking.
This is the first time that 3D human pose tracking is obtained from events only.
Our approach also achieves a significant reduction of 80% in FLOPS.
arXiv Detail & Related papers (2023-03-16T22:56:12Z)
- EventNeRF: Neural Radiance Fields from a Single Colour Event Camera [81.19234142730326]
This paper proposes the first approach for 3D-consistent, dense and novel view synthesis using just a single colour event stream as input.
At its core is a neural radiance field trained entirely in a self-supervised manner from events while preserving the original resolution of the colour event channels.
We evaluate our method qualitatively and numerically on several challenging synthetic and real scenes and show that it produces significantly denser and more visually appealing renderings.
arXiv Detail & Related papers (2022-06-23T17:59:53Z)
- 3D-FlowNet: Event-based optical flow estimation with 3D representation [2.062593640149623]
Event-based cameras can overcome the limitations of frame-based cameras for important tasks such as high-speed motion detection.
Deep Neural Networks are not well adapted to event data, which are asynchronous and discrete.
We propose 3D-FlowNet, a novel network architecture that can process the 3D input representation and output optical flow estimations.
arXiv Detail & Related papers (2022-01-28T17:28:15Z)
- Bridging the Gap between Events and Frames through Unsupervised Domain Adaptation [57.22705137545853]
We propose a task transfer method that allows models to be trained directly with labeled images and unlabeled event data.
We leverage the generative event model to split event features into content and motion features.
Our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks.
arXiv Detail & Related papers (2021-09-06T17:31:37Z)
- Differentiable Event Stream Simulator for Non-Rigid 3D Tracking [82.56690776283428]
Our differentiable simulator enables non-rigid 3D tracking of deformable objects from event streams.
We show the effectiveness of our approach for various types of non-rigid objects and compare to existing methods for non-rigid 3D tracking.
arXiv Detail & Related papers (2021-04-30T17:58:07Z)
- Lifting Monocular Events to 3D Human Poses [22.699272716854967]
This paper presents a novel 3D human pose estimation approach using a single stream of asynchronous events as input.
We propose the first learning-based method for 3D human pose from a single stream of events.
Experiments demonstrate that our method achieves solid accuracy, narrowing the performance gap between standard RGB and event-based vision.
arXiv Detail & Related papers (2021-04-21T16:07:12Z)
- EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes.
Our approach has characteristics previously not demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.