Fast Real-Time Pipeline for Robust Arm Gesture Recognition
- URL: http://arxiv.org/abs/2509.25042v1
- Date: Mon, 29 Sep 2025 16:57:56 GMT
- Title: Fast Real-Time Pipeline for Robust Arm Gesture Recognition
- Authors: Milán Zsolt Bagladi, László Gulyás, Gergő Szalay
- Abstract summary: This paper presents a real-time pipeline for dynamic arm gesture recognition based on OpenPose keypoint estimation. Experiments on a custom traffic-control gesture dataset demonstrate high accuracy across varying viewing angles and speeds.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a real-time pipeline for dynamic arm gesture recognition based on OpenPose keypoint estimation, keypoint normalization, and a recurrent neural network classifier. The 1 x 1 normalization scheme and two feature representations (coordinate- and angle-based) are presented for the pipeline. In addition, an efficient method to improve robustness against camera angle variations is also introduced by using artificially rotated training data. Experiments on a custom traffic-control gesture dataset demonstrate high accuracy across varying viewing angles and speeds. Finally, an approach to calculate the speed of the arm signal (if necessary) is also presented.
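As a rough illustration of the steps named in the abstract (keypoint normalization, angle-based features, rotated training data), here is a minimal sketch in plain Python. The abstract does not give the exact formulas, so everything here is an assumption: the function names `normalize_unit_box`, `joint_angle`, and `simulate_yaw` are hypothetical, the 1 x 1 normalization is assumed to be a bounding-box rescale, and the view-angle augmentation is assumed to be a simple horizontal squeeze by the cosine of the yaw angle. The keypoints are made-up sample values, not data from the paper.

```python
import math

def normalize_unit_box(points):
    """Shift and scale 2D keypoints so they fit inside a 1 x 1 box (assumed scheme)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    # One scale for both axes keeps the aspect ratio;
    # fall back to 1.0 if all points coincide.
    scale = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]

def joint_angle(a, b, c):
    """Unsigned angle in radians at joint b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))

def simulate_yaw(points, degrees):
    """Crude view-angle augmentation (assumption): squeeze x about the mean x by cos(yaw)."""
    cx = sum(p[0] for p in points) / len(points)
    k = math.cos(math.radians(degrees))
    return [(cx + (x - cx) * k, y) for x, y in points]

# Hypothetical shoulder, elbow, wrist keypoints in pixel coordinates.
arm = [(100.0, 200.0), (200.0, 200.0), (200.0, 100.0)]
normalized = normalize_unit_box(arm)
elbow_angle = joint_angle(*normalized)
augmented = simulate_yaw(normalized, 60)
```

In this sketch the angle feature is unchanged by the normalization, since a uniform scale and shift preserve angles. The squeeze does change it, which is one reason a classifier might benefit from seeing rotated examples during training.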
Related papers
- Towards Arbitrary Motion Completing via Hierarchical Continuous Representation [64.6525112550758]
We propose a novel parametric activation-induced hierarchical implicit representation framework, called NAME, based on Implicit Representations (INRs). Our method introduces a hierarchical temporal encoding mechanism that extracts features from motion sequences at multiple temporal scales, enabling effective capture of intricate temporal patterns.
arXiv Detail & Related papers (2025-12-24T14:07:04Z) - Temporal and Rotational Calibration for Event-Centric Multi-Sensor Systems [24.110040599070796]
Event cameras generate asynchronous signals in response to pixel-level brightness changes. We propose a motion-based temporal and rotational calibration framework tailored for event-centric multi-sensor systems.
arXiv Detail & Related papers (2025-08-18T01:53:27Z) - A Linear N-Point Solver for Structure and Motion from Asynchronous Tracks [31.081278354577893]
Structure and continuous motion estimation from point correspondences is a fundamental problem in computer vision. We present a unified approach for structure and linear motion estimation from 2D point correspondences with arbitrary timestamps.
arXiv Detail & Related papers (2025-07-30T14:53:46Z) - Robust and Real-time Surface Normal Estimation from Stereo Disparities using Affine Transformations [6.322193856514675]
This work introduces a novel method for surface normal estimation from rectified stereo image pairs. We develop a custom algorithm inspired by convolutional operations, tailored to process disparity data efficiently. Our method is validated using both simulated environments and real-world stereo images from the Middlebury and Cityscapes datasets.
arXiv Detail & Related papers (2025-04-21T14:19:00Z) - MATE: Motion-Augmented Temporal Consistency for Event-based Point Tracking [58.719310295870024]
This paper presents an event-based framework for tracking any point. To resolve ambiguities caused by event sparsity, a motion-guidance module incorporates kinematic vectors into the local matching process. The method improves the $Survival_{50}$ metric by 17.9% over event-only tracking of any point baseline.
arXiv Detail & Related papers (2024-12-02T09:13:29Z) - Event-Aided Time-to-Collision Estimation for Autonomous Driving [28.13397992839372]
We present a novel method that estimates the time to collision using a neuromorphic event-based camera.
The proposed algorithm consists of a two-step approach for efficient and accurate geometric model fitting on event data.
Experiments on both synthetic and real data demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-07-10T02:37:36Z) - TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both low-rankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art.
arXiv Detail & Related papers (2022-09-21T02:33:07Z) - ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute-limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z) - Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera [79.41374930171469]
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands.
Our approach combines an extensive list of favorable properties, namely it is marker-less.
We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work.
arXiv Detail & Related papers (2021-06-15T11:39:49Z) - End-to-end Learning for Inter-Vehicle Distance and Relative Velocity Estimation in ADAS with a Monocular Camera [81.66569124029313]
We propose a camera-based inter-vehicle distance and relative velocity estimation method based on end-to-end training of a deep neural network.
The key novelty of our method is the integration of multiple visual clues provided by any two time-consecutive monocular frames.
We also propose a vehicle-centric sampling mechanism to alleviate the effect of perspective distortion in the motion field.
arXiv Detail & Related papers (2020-06-07T08:18:31Z) - Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network [0.0]
In this paper, we suggest a new neural network architecture for vanishing point detection in images.
The key element is the use of the direct and transposed Fast Hough Transforms separated by convolutional layer blocks with standard activation functions.
arXiv Detail & Related papers (2020-02-04T09:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.