Memory-Efficient, Limb Position-Aware Hand Gesture Recognition using
Hyperdimensional Computing
- URL: http://arxiv.org/abs/2103.05267v1
- Date: Tue, 9 Mar 2021 07:31:00 GMT
- Title: Memory-Efficient, Limb Position-Aware Hand Gesture Recognition using
Hyperdimensional Computing
- Authors: Andy Zhou, Rikky Muller, and Jan Rabaey
- Abstract summary: We present sensor fusion of accelerometer and EMG signals using a hyperdimensional computing model.
We obtain a classification accuracy of up to 93.34%, an improvement of 17.79% over using a model trained solely on EMG.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Electromyogram (EMG) pattern recognition can be used to classify hand
gestures and movements for human-machine interface and prosthetics
applications, but it often faces reliability issues resulting from limb
position change. One method to address this is dual-stage classification, in
which the limb position is first determined using additional sensors to select
between multiple position-specific gesture classifiers. While improving
performance, this also increases model complexity and memory footprint, making
a dual-stage classifier difficult to implement in a wearable device with
limited resources. In this paper, we present sensor fusion of accelerometer and
EMG signals using a hyperdimensional computing model to emulate dual-stage
classification in a memory-efficient way. We demonstrate two methods of
encoding accelerometer features to act as keys for retrieval of
position-specific parameters from multiple models stored in superposition.
Through validation on a dataset of 13 gestures in 8 limb positions, we obtain a
classification accuracy of up to 93.34%, an improvement of 17.79% over using a
model trained solely on EMG. We achieve this while only marginally increasing
memory footprint over a single limb position model, requiring $8\times$ less
memory than a traditional dual-stage classification architecture.
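The retrieval idea in the abstract (position-specific parameters stored in superposition and recovered with a key) can be illustrated with a minimal sketch. This is not the authors' encoder: the random bipolar hypervectors, the dimensionality `DIM`, the helper names, and the use of a discrete position index in place of encoded accelerometer features are all assumptions made for illustration. Only the counts of 13 gestures and 8 limb positions come from the abstract.

```python
import numpy as np

DIM = 10_000        # hypervector dimensionality (assumed, not from the abstract)
N_POSITIONS = 8     # limb positions, as stated in the abstract
N_GESTURES = 13     # gestures, as stated in the abstract

rng = np.random.default_rng(0)


def random_bipolar(shape):
    """Draw random hypervectors with entries in {-1, +1}."""
    return rng.integers(0, 2, size=shape, dtype=np.int8) * 2 - 1


# Hypothetical stand-ins: one key per limb position (in the paper these keys
# are derived from accelerometer features) and one EMG prototype per
# (position, gesture) pair, as a dual-stage classifier would store.
position_keys = random_bipolar((N_POSITIONS, DIM))
prototypes = random_bipolar((N_POSITIONS, N_GESTURES, DIM))

# Bind each position's prototypes to its key (elementwise product), then
# bundle over positions (sum). Result: one vector per gesture.
superposed = (position_keys[:, None, :].astype(np.int32) * prototypes).sum(axis=0)


def classify(emg_hv, position):
    """Unbind with the position key, then pick the most similar gesture."""
    unbound = superposed * position_keys[position]   # prototype + crosstalk
    scores = unbound @ emg_hv.astype(np.int64)       # dot-product similarity
    return int(np.argmax(scores))


def noisy_copy(hv, flip_fraction):
    """Flip a random fraction of components to mimic a noisy EMG query."""
    flips = rng.random(hv.shape) < flip_fraction
    return np.where(flips, -hv, hv)


if __name__ == "__main__":
    query = noisy_copy(prototypes[3, 7], 0.2)
    print("predicted gesture:", classify(query, position=3))
    print("vectors stored:", superposed.shape[0], "instead of", N_POSITIONS * N_GESTURES)
```

In this sketch the superposed model holds 13 vectors rather than 8 x 13. Multiplying by a bipolar key is its own inverse, so unbinding with the right key recovers that position's prototype plus crosstalk from the other seven positions, which stays small relative to the signal when `DIM` is large. The sums are kept as integers here; whether and how the paper quantizes the stored vectors is not stated in the abstract.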
Related papers
- Leveraging Convolutional Sparse Autoencoders for Robust Movement Classification from Low-Density sEMG [0.46976113832881716]
This study proposes a deep learning framework for accurate gesture recognition using only two surface electromyography (sEMG) channels.
We present a few-shot transfer learning protocol that improved performance on unseen subjects from a baseline of 35.1% $\pm$ 3.1% to 92.3% $\pm$ 0.9% with minimal calibration data.
arXiv Detail & Related papers (2026-01-30T14:21:46Z) - Lameness detection in dairy cows using pose estimation and bidirectional LSTMs [0.0]
This study presents a lameness detection approach that combines pose estimation and Bidirectional Long-Short-Term Memory (BLSTM) neural networks.
Our method significantly outperformed an established method that relied on manually-designed features.
arXiv Detail & Related papers (2025-08-14T13:38:48Z) - AHDMIL: Asymmetric Hierarchical Distillation Multi-Instance Learning for Fast and Accurate Whole-Slide Image Classification [51.525891360380285]
AHDMIL is an Asymmetric Hierarchical Distillation Multi-Instance Learning framework.
It eliminates irrelevant patches through a two-step training process.
It consistently outperforms previous state-of-the-art methods in both classification performance and inference speed.
arXiv Detail & Related papers (2025-08-07T07:47:16Z) - Multi-Modal Graph Convolutional Network with Sinusoidal Encoding for Robust Human Action Segmentation [10.122882293302787]
Temporal segmentation of human actions is critical for intelligent robots in collaborative settings.
We propose a Multi-Modal Graph Convolutional Network (MMGCN) that integrates low-frame-rate (e.g., 1 fps) visual data with high-frame-rate (e.g., 30 fps) motion data.
Our approach outperforms state-of-the-art methods, especially in action segmentation accuracy.
arXiv Detail & Related papers (2025-07-01T13:55:57Z) - WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition [18.978031999678507]
WaveFormer is a lightweight transformer-based architecture tailored for sEMG gesture recognition.
Our model integrates time-domain and frequency-domain features through a novel learnable wavelet transform, enhancing feature extraction.
With just 3.1 million parameters, WaveFormer achieves 95% classification accuracy on the EPN612 dataset, outperforming larger models.
arXiv Detail & Related papers (2025-06-12T04:07:11Z) - SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective [0.6749750044497732]
We develop a fine-tuning strategy called the Semantic Change Network (SCN) to address the data scarcity issue.
We observe that the locations of changes between the two images are spatially identical, a concept we refer to as spatial consistency.
This enhances the modeling of multi-scale changes and helps capture underlying relationships in change detection semantics.
arXiv Detail & Related papers (2025-03-26T17:15:43Z) - Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation [27.206656215734295]
We propose a novel Decomposed Vector-Quantized Variational Autoencoder (DVQ-VAE) to generate realistic human grasps.
Part-aware decomposed architecture facilitates more precise management of the interaction between each component of hand and object.
Our model achieved about 14.1% relative improvement in the quality index compared to the state-of-the-art methods in four widely-adopted benchmarks.
arXiv Detail & Related papers (2024-07-19T06:41:16Z) - Coordinate Transformer: Achieving Single-stage Multi-person Mesh
Recovery from Videos [91.44553585470688]
Multi-person 3D mesh recovery from videos is a critical first step towards automatic perception of group behavior in virtual reality, physical therapy and beyond.
We propose the Coordinate transFormer (CoordFormer) that directly models multi-person spatial-temporal relations and simultaneously performs multi-mesh recovery in an end-to-end manner.
Experiments on the 3DPW dataset demonstrate that CoordFormer significantly improves the state-of-the-art, outperforming the previously best results by 4.2%, 8.8% and 4.7% according to the MPJPE, PAMPJPE, and PVE metrics, respectively.
arXiv Detail & Related papers (2023-08-20T18:23:07Z) - Self-Attentive Pooling for Efficient Deep Learning [6.822466048176652]
We propose a novel non-local self-attentive pooling method that can be used as a drop-in replacement to the standard pooling layers.
We surpass the test accuracy of existing pooling techniques on different variants of MobileNet-V2 on ImageNet by an average of 1.2%.
Our approach achieves 1.43% higher test accuracy compared to SOTA techniques with iso-memory footprints.
arXiv Detail & Related papers (2022-09-16T00:35:14Z) - Voxelmorph++ Going beyond the cranial vault with keypoint supervision
and multi-channel instance optimisation [8.88841928746097]
The recent Learn2Reg benchmark shows that single-scale U-Net architectures fall short of state-of-the-art performance for abdominal or intra-patient lung registration.
Here, we propose two straightforward steps that greatly reduce this gap in accuracy.
First, we employ keypoint self-supervision with a novel network head that predicts a discretised heatmap.
Second, we replace multiple learned fine-tuning steps by a single instance with hand-crafted features and the Adam optimiser.
arXiv Detail & Related papers (2022-02-28T19:23:29Z) - ViT-HGR: Vision Transformer-based Hand Gesture Recognition from High
Density Surface EMG Signals [14.419091034872682]
We investigate and design a Vision Transformer (ViT) based architecture to perform hand gesture recognition from High Density (HD-sEMG) signals.
The proposed ViT-HGR framework can overcome the training time problems and can accurately classify a large number of hand gestures from scratch.
Our experiments with a 64-sample (31.25 ms) window size yield an average test accuracy of 84.62 +/- 3.07%, using only 78,210 parameters.
arXiv Detail & Related papers (2022-01-25T02:42:50Z) - Unsupervised Motion Representation Learning with Capsule Autoencoders [54.81628825371412]
Motion Capsule Autoencoder (MCAE) models motion in a two-level hierarchy.
MCAE is evaluated on a novel Trajectory20 motion dataset and various real-world skeleton-based human action datasets.
arXiv Detail & Related papers (2021-10-01T16:52:03Z) - Automatic size and pose homogenization with spatial transformer network
to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose and scale invariant thanks to the use of a Spatial Transformer Network (STN).
Our architecture is composed of three sequential modules that are estimated together during training.
We test the proposed method in kidney and renal tumor segmentation on abdominal pediatric CT scanners.
arXiv Detail & Related papers (2021-07-06T14:50:03Z) - When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
In order to achieve a better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the queries of decoder from the inputs, enabling the model to achieve as good accuracy as the ones with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised
Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations across modalities for gesture recognition.
Results show that our approach recovers performance with large gains, up to 12.91% in ACC and 20.16% in F1 score, without using any annotations from the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z) - Deep Soft Procrustes for Markerless Volumetric Sensor Alignment [81.13055566952221]
In this work, we improve markerless data-driven correspondence estimation to achieve more robust multi-sensor spatial alignment.
We incorporate geometric constraints in an end-to-end manner into a typical segmentation based model and bridge the intermediate dense classification task with the targeted pose estimation one.
Our model is experimentally shown to achieve similar results with marker-based methods and outperform the markerless ones, while also being robust to the pose variations of the calibration structure.
arXiv Detail & Related papers (2020-03-23T10:51:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.