Rethinking Event-based Human Pose Estimation with 3D Event
Representations
- URL: http://arxiv.org/abs/2311.04591v3
- Date: Fri, 1 Dec 2023 07:26:35 GMT
- Title: Rethinking Event-based Human Pose Estimation with 3D Event
Representations
- Authors: Xiaoting Yin, Hao Shi, Jiaan Chen, Ze Wang, Yaozu Ye, Huajian Ni,
Kailun Yang, Kaiwei Wang
- Abstract summary: Event cameras offer a robust solution for navigating challenging contexts.
We introduce two 3D event representations: the Rasterized Event Point Cloud and the Decoupled Event Voxel.
Experiments on EV-3DPW demonstrate the robustness of our proposed 3D representation methods compared to traditional RGB images and event frame techniques.
- Score: 26.592295349210787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose estimation is a fundamental and appealing task in computer vision.
Traditional frame-based cameras and videos are commonly applied, yet they
become less reliable in scenarios with high dynamic range or heavy motion
blur. In contrast, event cameras offer a robust solution for navigating these
challenging contexts. Predominant methodologies incorporate event cameras into
learning frameworks by accumulating events into event frames. However, such
methods tend to marginalize the intrinsic asynchronous and high temporal
resolution characteristics of events. This disregard leads to a loss in
essential temporal dimension data, crucial for discerning distinct actions. To
address this issue and to unlock the 3D potential of event information, we
introduce two 3D event representations: the Rasterized Event Point Cloud
(RasEPC) and the Decoupled Event Voxel (DEV). The RasEPC collates events within
concise temporal slices at identical positions, preserving 3D attributes with
statistical cues and markedly mitigating memory and computational demands.
Meanwhile, the DEV representation discretizes events into voxels and projects
them across three orthogonal planes, utilizing decoupled event attention to
retrieve 3D cues from the 2D planes. Furthermore, we develop and release
EV-3DPW, a synthetic event-based dataset crafted to facilitate training and
quantitative analysis in outdoor scenes. On the public real-world DHP19
dataset, our event point cloud technique excels in real-time mobile
predictions, while the decoupled event voxel method achieves the highest
accuracy. Experiments on EV-3DPW demonstrate the robustness of our proposed
3D representation methods compared to traditional RGB images and event frame
techniques under the same backbones. Our code and dataset have been made
publicly available at https://github.com/MasterHow/EventPointPose.
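The two 3D representations described in the abstract can be sketched as follows. This is a hypothetical minimal re-implementation based only on the description above: the function names, slice/voxel parameters, and the particular statistical cues (event count, mean timestamp, net polarity) are assumptions, and the released code at the repository linked above may differ.

```python
import numpy as np

def rasterized_event_point_cloud(events, num_slices, H, W):
    """RasEPC-style sketch: merge events that share a pixel location within
    each temporal slice into one point carrying statistical cues."""
    x, y, t, p = events.T  # columns: x, y, timestamp, polarity in {-1, +1}
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    slice_idx = np.minimum((t_norm * num_slices).astype(int), num_slices - 1)
    # Encode (slice, y, x) as a single integer key for grouping.
    key = slice_idx * H * W + y.astype(int) * W + x.astype(int)
    uniq, inv = np.unique(key, return_inverse=True)
    count = np.bincount(inv)                          # events per point
    mean_t = np.bincount(inv, weights=t_norm) / count # mean timestamp
    net_p = np.bincount(inv, weights=p)               # net polarity
    xs = uniq % W
    ys = (uniq // W) % H
    return np.stack([xs, ys, mean_t, count, net_p], axis=1)

def decoupled_event_voxel(events, T, H, W):
    """DEV-style sketch: accumulate events into a (T, H, W) voxel grid,
    then project it onto the three orthogonal planes."""
    x, y, t, p = events.T
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    ti = np.minimum((t_norm * T).astype(int), T - 1)
    vox = np.zeros((T, H, W))
    np.add.at(vox, (ti, y.astype(int), x.astype(int)), p)
    xy = vox.sum(axis=0)  # spatial (x-y) plane
    tx = vox.sum(axis=1)  # temporal-x plane
    ty = vox.sum(axis=2)  # temporal-y plane
    return xy, tx, ty
```

In this sketch the memory saving of RasEPC comes from the grouping step: many raw events collapse into a single point per (slice, pixel) cell, while the decoupled voxel keeps only three 2D projections instead of the full 3D grid.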
Related papers
- EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous
Visual Hulls [46.94040300725127]
3D reconstruction from multiple views is a well-established computer vision field with numerous practical applications.
We study the problem of 3D reconstruction from event-cameras, motivated by the advantages of event-based cameras in terms of low power and latency.
We propose Apparent Contour Events (ACE), a novel event-based representation that defines the geometry of the apparent contour of an object.
arXiv Detail & Related papers (2023-04-11T15:46:16Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness changes at every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
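The pillar construction summarized above can be sketched as follows; this is a hypothetical illustration of an x-y-t grid with separate polarity channels, not the paper's actual tensor layout:

```python
import numpy as np

def event_pillars(events, T, H, W):
    """Sketch: accumulate positive and negative events into separate
    channels of a (2, T, H, W) spatio-temporal grid of pillars."""
    x, y, t, p = events.T  # columns: x, y, timestamp, polarity in {-1, +1}
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)
    ti = np.minimum((t_norm * T).astype(int), T - 1)
    grid = np.zeros((2, T, H, W))
    ch = (p > 0).astype(int)  # channel 0: negative, channel 1: positive
    np.add.at(grid, (ch, ti, y.astype(int), x.astype(int)), 1)
    return grid
```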
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer [20.188995900488717]
We present a dedicated end-to-end sparse deep approach for event-based pose tracking.
This is the first time that 3D human pose tracking is obtained from events only.
Our approach also achieves a significant reduction of 80% in FLOPS.
arXiv Detail & Related papers (2023-03-16T22:56:12Z)
- EventNeRF: Neural Radiance Fields from a Single Colour Event Camera [81.19234142730326]
This paper proposes the first approach for 3D-consistent, dense and novel view synthesis using just a single colour event stream as input.
At its core is a neural radiance field trained entirely in a self-supervised manner from events while preserving the original resolution of the colour event channels.
We evaluate our method qualitatively and numerically on several challenging synthetic and real scenes and show that it produces significantly denser and more visually appealing renderings.
arXiv Detail & Related papers (2022-06-23T17:59:53Z)
- 3D-FlowNet: Event-based optical flow estimation with 3D representation [2.062593640149623]
Event-based cameras can overcome the limitations of frame-based cameras for important tasks such as high-speed motion detection.
Deep Neural Networks are not well adapted to event data, which are asynchronous and discrete.
We propose 3D-FlowNet, a novel network architecture that can process the 3D input representation and output optical flow estimations.
arXiv Detail & Related papers (2022-01-28T17:28:15Z)
- Bridging the Gap between Events and Frames through Unsupervised Domain Adaptation [57.22705137545853]
We propose a task transfer method that allows models to be trained directly with labeled images and unlabeled event data.
We leverage the generative event model to split event features into content and motion features.
Our approach unlocks the vast amount of existing image datasets for the training of event-based neural networks.
arXiv Detail & Related papers (2021-09-06T17:31:37Z)
- Differentiable Event Stream Simulator for Non-Rigid 3D Tracking [82.56690776283428]
Our differentiable simulator enables non-rigid 3D tracking of deformable objects from event streams.
We show the effectiveness of our approach for various types of non-rigid objects and compare to existing methods for non-rigid 3D tracking.
arXiv Detail & Related papers (2021-04-30T17:58:07Z)
- Lifting Monocular Events to 3D Human Poses [22.699272716854967]
This paper presents a novel 3D human pose estimation approach using a single stream of asynchronous events as input.
We propose the first learning-based method for 3D human pose from a single stream of events.
Experiments demonstrate that our method achieves solid accuracy, narrowing the performance gap between standard RGB and event-based vision.
arXiv Detail & Related papers (2021-04-21T16:07:12Z)
- EventHands: Real-Time Neural 3D Hand Reconstruction from an Event Stream [80.15360180192175]
3D hand pose estimation from monocular videos is a long-standing and challenging problem.
We address it for the first time using a single event camera, i.e., an asynchronous vision sensor reacting to brightness changes.
Our approach has characteristics previously not demonstrated with a single RGB or depth camera.
arXiv Detail & Related papers (2020-12-11T16:45:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.