Related papers: Transformer-based Hand Gesture Recognition via High-Density EMG Signals: From Instantaneous Recognition to Fusion of Motor Unit Spike Trains

Transformer-based Hand Gesture Recognition via High-Density EMG Signals: From Instantaneous Recognition to Fusion of Motor Unit Spike Trains

URL: http://arxiv.org/abs/2212.00743v1
Date: Tue, 29 Nov 2022 23:32:08 GMT
Title: Transformer-based Hand Gesture Recognition via High-Density EMG Signals: From Instantaneous Recognition to Fusion of Motor Unit Spike Trains
Authors: Mansooreh Montazerin, Elahe Rahimian, Farnoosh Naderkhani, S. Farokh Atashzar, Svetlana Yanushkevich, Arash Mohammadi
Abstract summary: The paper proposes a compact deep learning framework referred to as the CT-HGR, which employs a vision transformer network to conduct hand gesture recognition. CT-HGR can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. The framework achieves accuracy of 89.13% for instantaneous recognition based on a single frame of HD-sEMG image.
Score: 11.443553761853856
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Designing efficient and labor-saving prosthetic hands requires powerful hand gesture recognition algorithms that can achieve high accuracy with limited complexity and latency. In this context, the paper proposes a compact deep learning framework referred to as the CT-HGR, which employs a vision transformer network to conduct hand gesture recognition using highdensity sEMG (HD-sEMG) signals. The attention mechanism in the proposed model identifies similarities among different data segments with a greater capacity for parallel computations and addresses the memory limitation problems while dealing with inputs of large sequence lengths. CT-HGR can be trained from scratch without any need for transfer learning and can simultaneously extract both temporal and spatial features of HD-sEMG data. Additionally, the CT-HGR framework can perform instantaneous recognition using sEMG image spatially composed from HD-sEMG signals. A variant of the CT-HGR is also designed to incorporate microscopic neural drive information in the form of Motor Unit Spike Trains (MUSTs) extracted from HD-sEMG signals using Blind Source Separation (BSS). This variant is combined with its baseline version via a hybrid architecture to evaluate potentials of fusing macroscopic and microscopic neural drive information. The utilized HD-sEMG dataset involves 128 electrodes that collect the signals related to 65 isometric hand gestures of 20 subjects. The proposed CT-HGR framework is applied to 31.25, 62.5, 125, 250 ms window sizes of the above-mentioned dataset utilizing 32, 64, 128 electrode channels. The average accuracy over all the participants using 32 electrodes and a window size of 31.25 ms is 86.23%, which gradually increases till reaching 91.98% for 128 electrodes and a window size of 250 ms. The CT-HGR achieves accuracy of 89.13% for instantaneous recognition based on a single frame of HD-sEMG image.

Related papers

BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals [50.76802709706976]
This paper proposes Brain Omni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings.<n>To unify diverse data sources, we introduce BrainTokenizer, the first tokenizer that quantises neural brain activity into discrete representations.<n>A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining.
arXiv Detail & Related papers (2025-05-18T14:07:14Z)
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [43.65650710265957]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations. We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z)
CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality. The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography [47.160223334501126]
emg2qwerty is a large-scale dataset of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard. With 1,135 sessions spanning 108 users and 346 hours of recording, this is the largest such public dataset to date. We show strong baseline performance on predicting key-presses using sEMG signals alone.
arXiv Detail & Related papers (2024-10-26T05:18:48Z)
Learning Brain Tumor Representation in 3D High-Resolution MR Images via Interpretable State Space Models [42.55786269051626]
We propose a novel state-space-model (SSM)-based masked autoencoder which scales ViT-like models to handle high-resolution data effectively. We propose a latent-to-spatial mapping technique that enables direct visualization of how latent features correspond to specific regions in the input volumes. Our results highlight the potential of SSM-based self-supervised learning to transform radiomics analysis by combining efficiency and interpretability.
arXiv Detail & Related papers (2024-09-12T04:36:50Z)
Multi-view Hybrid Graph Convolutional Network for Volume-to-mesh Reconstruction in Cardiovascular MRI [43.47826598981827]
We introduce HybridVNet, a novel architecture for direct image-to-mesh extraction. We show it can efficiently handle surface and volumetric meshes by encoding them as graph structures. Our model combines traditional convolutional networks with variational graph generative models, deep supervision and mesh-specific regularisation.
arXiv Detail & Related papers (2023-11-22T21:51:29Z)
From Unimodal to Multimodal: improving sEMG-Based Pattern Recognition via deep generative models [1.1477981286485912]
Multimodal hand gesture recognition (HGR) systems can achieve higher recognition accuracy compared to unimodal HGR systems. This paper proposes a novel generative approach to improve Surface Electromyography (sEMG)-based HGR accuracy via virtual Inertial Measurement Unit (IMU) signals.
arXiv Detail & Related papers (2023-08-08T07:15:23Z)
Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification. Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations. In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
Brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis. The hierarchical transformers in the generator are designed to estimate the noise at multiple scales. Evaluations of the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z)
HYDRA-HGR: A Hybrid Transformer-based Architecture for Fusion of Macroscopic and Microscopic Neural Drive Information [11.443553761853856]
We propose a hybrid model that simultaneously extracts a set of temporal and spatial features at microscopic level. The proposed HYDRA-HGR framework achieves average accuracy of 94.86% for the 250 ms window size, which is 5.52% and 8.22% higher than that of the Macro and Micro paths, respectively.
arXiv Detail & Related papers (2022-10-27T02:23:27Z)
Light-weighted CNN-Attention based architecture for Hand Gesture Recognition via ElectroMyography [19.51045409936039]
We propose a light-weighted hybrid architecture (HDCAM) based on Convolutional Neural Network (CNN) and attention mechanism. The proposed HDCAM model with 58,441 parameters reached a new state-of-the-art (SOTA) performance with 82.91% and 81.28% accuracy on window sizes of 300 ms and 200 ms for classifying 17 hand gestures.
arXiv Detail & Related papers (2022-10-27T02:12:07Z)
ViT-HGR: Vision Transformer-based Hand Gesture Recognition from High Density Surface EMG Signals [14.419091034872682]
We investigate and design a Vision Transformer (ViT) based architecture to perform hand gesture recognition from High Density (HD-sEMG) signals. The proposed ViT-HGR framework can overcome the training time problems and can accurately classify a large number of hand gestures from scratch. Our experiments with 64-sample (31.25 ms) window size yield average test accuracy of 84.62 +/- 3.07%, where only 78, 210 number of parameters is utilized.
arXiv Detail & Related papers (2022-01-25T02:42:50Z)
Hand Gesture Recognition Using Temporal Convolutions and Attention Mechanism [16.399230849853915]
We propose the novel Temporal Convolutions-based Hand Gesture Recognition architecture (TC-HGR) to reduce this computational burden. We classified 17 hand gestures via surface Electromyogram (sEMG) signals by the adoption of attention mechanisms and temporal convolutions. The proposed method led to 81.65% and 80.72% classification accuracy for window sizes of 300ms and 200ms, respectively.
arXiv Detail & Related papers (2021-10-17T04:23:59Z)
High speed microcircuit and synthetic biosignal widefield imaging using nitrogen vacancies in diamond [44.62475518267084]
We show how to image signals from a microscopic lithographically patterned circuit at the micrometer scale. Using a new type of lock-in amplifier camera, we demonstrate sub-millisecond spatially resolved recovery of AC and pulsed electrical current signals. Finally, we demonstrate as a proof of principle the recovery of synthetic signals replicating the exact form of signals in a biological neural network.
arXiv Detail & Related papers (2021-07-29T16:27:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.