Related papers: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation

URL: http://arxiv.org/abs/2512.06306v1
Date: Sat, 06 Dec 2025 05:32:13 GMT
Title: Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
Authors: Haoxian Zhou, Chuanzhi Xu, Langyi Chen, Haodong Chen, Yuk Ying Chung, Qiang Qu, Xaoming Chen, Weidong Cai
Abstract summary: Event cameras provide high temporal resolution and low latency, enabling robust estimation under challenging conditions. Most existing methods convert event streams into dense event frames, which sacrifices the high temporal resolution of the event signal. In this work, we aim to exploit event streams with a point-based framework designed to enhance human pose estimation performance.
Score: 15.899725453972787
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Human pose estimation focuses on predicting body keypoints to analyze human motion. Event cameras provide high temporal resolution and low latency, enabling robust estimation under challenging conditions. However, most existing methods convert event streams into dense event frames, which adds extra computation and sacrifices the high temporal resolution of the event signal. In this work, we aim to exploit the spatiotemporal properties of event streams with a point cloud-based framework designed to enhance human pose estimation performance. We design an Event Temporal Slicing Convolution module to capture short-term dependencies across event slices, and combine it with an Event Slice Sequencing module for structured temporal modeling. We also apply edge enhancement to the point cloud-based event representation to strengthen spatial edge information under sparse event conditions and further improve performance. Experiments on the DHP19 dataset show that our proposed method consistently improves performance across three representative point cloud backbones: PointNet, DGCNN, and Point Transformer.
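To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes events are stored as an (N, 4) NumPy array of [x, y, t, p] rows; the function names `events_to_frame` and `slice_events` and the equal-duration slicing rule are illustrative assumptions. The first function accumulates events into a dense count frame, which discards timestamps; the second keeps every event as a point and only groups the points into temporal slices.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate events into a dense count frame (timestamps are lost)."""
    frame = np.zeros((height, width), dtype=np.int64)
    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    # np.add.at handles repeated pixel coordinates correctly.
    np.add.at(frame, (y, x), 1)
    return frame

def slice_events(events, num_slices):
    """Group events into equal-duration temporal slices (hypothetical rule).

    Each slice is an (M_k, 4) point set that still carries per-event
    timestamps, unlike the dense frame above.
    """
    t = events[:, 2]
    t0 = t.min()
    # Guard against a zero time span when all events share one timestamp.
    span = max(t.max() - t0, 1e-12)
    idx = ((t - t0) / span * num_slices).astype(np.int64)
    # The latest event would land at index num_slices; fold it into the last slice.
    idx = np.minimum(idx, num_slices - 1)
    return [events[idx == k] for k in range(num_slices)]
```

A per-slice point set like this could then be fed to a point cloud backbone, with a temporal module operating across the slice sequence; how the paper's modules do this is not specified in the abstract.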

Related papers

Event-based Visual Deformation Measurement [76.25283405575108]
Visual Deformation Measurement aims to recover dense deformation fields by tracking surface motion from camera observations. Traditional image-based methods rely on minimal inter-frame motion to constrain the correspondence search space. We propose an event-frame fusion framework that exploits events for temporally dense motion cues and frames for spatially dense precise estimation.
arXiv Detail & Related papers (2026-02-16T01:04:48Z)
EventSTU: Event-Guided Efficient Spatio-Temporal Understanding for Video Large Language Models [56.16721798968254]
We propose an event-guided, training-free framework for efficient understanding, named EventSTU. In the temporal domain, we design a coarse-to-fine sampling algorithm that exploits the change-triggered property of event cameras to eliminate redundant large frames. In the spatial domain, we design an adaptive token pruning algorithm that leverages the saliency of events as a zero-cost prior to guide spatial reduction.
arXiv Detail & Related papers (2025-11-24T09:30:02Z)
Hybrid Spiking Vision Transformer for Object Detection with Event Cameras [19.967565219584056]
Spiking Neural Networks (SNNs) have emerged as a promising approach, offering low energy consumption and rich dynamics. This study proposes a novel hybrid Transformer (HsVT) model to enhance the performance of event-based object detection. Experimental results demonstrate that HsVT achieves significant performance improvements in event detection with fewer parameters.
arXiv Detail & Related papers (2025-05-12T16:19:20Z)
EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation [59.33052312107478]
Event cameras offer possibilities for 3D motion estimation through continuous adaptive pixel-level responses to scene changes. This paper presents EMoTive, a novel event-based framework that models non-uniform trajectories via event-guided parametric curves. For motion representation, we introduce a density-aware adaptation mechanism to fuse spatial and temporal features under event guidance. The final 3D motion estimation is achieved through multi-temporal sampling of parametric trajectories, flows and depth motion fields.
arXiv Detail & Related papers (2025-03-14T13:15:54Z)
Frequency-aware Event Cloud Network [22.41905416371072]
We propose a frequency-aware network named FECNet that leverages Event Cloud representations. FECNet fully utilizes 2S-1T-1P Event Cloud by innovating the event-based Group and Sampling module. We conducted extensive experiments on event-based object classification, action recognition, and human pose estimation tasks.
arXiv Detail & Related papers (2024-12-30T08:53:57Z)
Labits: Layered Bidirectional Time Surfaces Representation for Event Camera-based Continuous Dense Trajectory Estimation [1.3416369506987165]
Event cameras capture dynamic scenes with high temporal resolution and low latency. We introduce Labits: Layered Bidirectional Time Surfaces, a simple yet elegant representation designed to retain all these features. Our approach achieves an impressive 49% reduction in trajectory end-point error (TEPE) compared to the previous state-of-the-art on the MultiFlow dataset.
arXiv Detail & Related papers (2024-12-12T01:11:50Z)
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba [11.400397931501338]
Event cameras draw inspiration from biological systems, boasting low latency and high dynamic range while consuming minimal power. Most current approaches to processing Event Cloud involve converting it into frame-based representations. We propose EventMamba, an efficient and effective framework based on Point Cloud representation.
arXiv Detail & Related papers (2024-05-09T21:47:46Z)
Scalable Event-by-event Processing of Neuromorphic Sensory Signals With Deep State-Space Models [2.551844666707809]
Event-based sensors are well suited for real-time processing. Current methods either collapse events into frames or cannot scale up when processing the event data directly event-by-event.
arXiv Detail & Related papers (2024-04-29T08:50:27Z)
MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking [50.26836546224782]
Event-based eye tracking has shown great promise with its high temporal resolution and low redundancy. The diversity and abruptness of eye movement patterns, including blinking, fixating, saccades, and smooth pursuit, pose significant challenges for eye localization. This paper proposes a bidirectional long-term sequence modeling and time-varying state selection mechanism to fully utilize contextual temporal information.
arXiv Detail & Related papers (2024-04-18T11:09:25Z)
Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner. Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation. Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
Asynchronous Optimisation for Event-based Visual Odometry [53.59879499700895]
Event cameras open up new possibilities for robotic perception due to their low latency and high dynamic range. We focus on event-based visual odometry (VO). We propose an asynchronous structure-from-motion optimisation back-end.
arXiv Detail & Related papers (2022-03-02T11:28:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.