Fast and Expressive Gesture Recognition using a Combination-Homomorphic
Electromyogram Encoder
- URL: http://arxiv.org/abs/2311.14675v2
- Date: Wed, 29 Nov 2023 16:19:16 GMT
- Title: Fast and Expressive Gesture Recognition using a Combination-Homomorphic
Electromyogram Encoder
- Authors: Niklas Smedemark-Margulies, Yunus Bicer, Elifnur Sunger, Tales
Imbiriba, Eugene Tunik, Deniz Erdogmus, Mathew Yarossi, Robin Walters
- Abstract summary: We study the task of gesture recognition from electromyography (EMG).
We define combination gestures consisting of a direction component and a modifier component.
New subjects only demonstrate the single component gestures.
We extrapolate to unseen combination gestures by combining the feature vectors of real single gestures to produce synthetic training data.
- Score: 21.25126610043744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the task of gesture recognition from electromyography (EMG), with
the goal of enabling expressive human-computer interaction at high accuracy,
while minimizing the time required for new subjects to provide calibration
data. To fulfill these goals, we define combination gestures consisting of a
direction component and a modifier component. New subjects demonstrate only the
single component gestures, and we seek to extrapolate from these to all possible
single or combination gestures. We extrapolate to unseen combination gestures
by combining the feature vectors of real single gestures to produce synthetic
training data. This strategy allows us to provide a large and flexible gesture
vocabulary, while not requiring new subjects to demonstrate combinatorially
many example gestures. We pre-train an encoder and a combination operator using
self-supervision, so that we can produce useful synthetic training data for
unseen test subjects. To evaluate the proposed method, we collect a real-world
EMG dataset, and measure the effect of augmented supervision against two
baselines: a partially-supervised model trained with only single gesture data
from the unseen subject, and a fully-supervised model trained with real single
and real combination gesture data from the unseen subject. We find that the
proposed method provides a dramatic improvement over the partially-supervised
model, and achieves a useful classification accuracy that in some cases
approaches the performance of the fully-supervised model.
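As a rough illustration of the augmentation strategy described in the abstract (a minimal sketch under assumptions, not the authors' released code: the encoder, the combination operator, and all dimensions below are placeholders), the following Python snippet synthesizes combination-gesture training features from single-gesture recordings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 EMG channels x 200 samples per window, 64-d features.
# These numbers are illustrative, not taken from the paper.
WINDOW, FEAT_DIM = 8 * 200, 64
W_ENC = rng.standard_normal((FEAT_DIM, WINDOW)) / np.sqrt(WINDOW)

def encode(emg_window: np.ndarray) -> np.ndarray:
    """Stand-in for the pre-trained encoder f (self-supervised in the paper)."""
    return W_ENC @ emg_window.ravel()

def combine(z_direction: np.ndarray, z_modifier: np.ndarray) -> np.ndarray:
    """Stand-in for the learned combination operator g. The paper's
    combination-homomorphic property, informally:
        f(combined gesture) ~= g(f(direction gesture), f(modifier gesture))
    Averaging is only a placeholder; the paper pre-trains g jointly with f."""
    return 0.5 * (z_direction + z_modifier)

# Calibration: a new subject demonstrates ONLY the single component gestures.
directions, modifiers = ["up", "down", "left", "right"], ["pinch", "fist"]
single = {g: encode(rng.standard_normal(WINDOW))  # fake EMG stands in for real data
          for g in directions + modifiers}

# Synthesize training features for every unseen combination gesture.
synthetic = {(d, m): combine(single[d], single[m])
             for d in directions for m in modifiers}

# A classifier trained on real single + synthetic combination features gives
# the "augmented supervision" condition evaluated in the paper.
print(f"{len(single)} real single features, {len(synthetic)} synthetic combos")
```

Under this reading, a new subject's calibration cost grows with the number of component gestures (4 + 2 here) rather than the number of combinations (4 x 2), which is the paper's stated motivation.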
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation [27.206656215734295]
We propose a novel Decomposed Vector-Quantized Variational Autoencoder (DVQ-VAE) to generate realistic human grasps.
The part-aware decomposed architecture enables more precise modeling of the interaction between each hand component and the object.
Our model achieves about a 14.1% relative improvement in the quality index over state-of-the-art methods on four widely adopted benchmarks (a generic vector-quantization sketch follows this entry).
arXiv Detail & Related papers (2024-07-19T06:41:16Z)
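Since this entry hinges on vector quantization, here is a generic nearest-codebook lookup for context (a minimal sketch only; DVQ-VAE's part-aware decomposed codebooks and every name below are assumptions, not that paper's code):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes; DVQ-VAE would keep one such codebook per hand part.
CODEBOOK_SIZE, LATENT_DIM = 128, 16
codebook = rng.standard_normal((CODEBOOK_SIZE, LATENT_DIM))

def vector_quantize(z: np.ndarray) -> tuple[np.ndarray, int]:
    """Snap a continuous latent to its nearest codebook entry (the 'VQ' step)."""
    idx = int(np.argmin(((codebook - z) ** 2).sum(axis=1)))
    return codebook[idx], idx

quantized, idx = vector_quantize(rng.standard_normal(LATENT_DIM))
print(f"latent mapped to codebook entry {idx}")
```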
- Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model [32.094290282897894]
Federated learning aims to train a collective model from physically isolated clients while safeguarding the privacy of users' data.
We propose a novel lightweight unsupervised federated learning approach that leverages unlabeled data on each client to perform lightweight model training and communication.
Our proposed method greatly enhances model performance in comparison to CLIP's zero-shot predictions and even outperforms supervised federated learning benchmark methods.
arXiv Detail & Related papers (2024-04-17T03:42:48Z)
- Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection [116.21529970404653]
We introduce SG2HOI+, a unified one-step model based on the Transformer architecture.
Our approach employs two interactive hierarchical Transformers to seamlessly unify the tasks of SGG and HOI detection.
Our approach achieves competitive performance when compared to state-of-the-art HOI methods.
arXiv Detail & Related papers (2023-11-03T07:25:57Z)
- A Multi-label Classification Approach to Increase Expressivity of EMG-based Gesture Recognition [4.701158597171363]
The aim of this study is to efficiently increase the expressivity of surface electromyography-based (sEMG) gesture recognition systems.
We use a problem-transformation approach in which actions are decomposed into two biomechanically independent components (sketched after this entry).
arXiv Detail & Related papers (2023-09-13T20:21:41Z)
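The problem-transformation idea above parallels the main paper's direction/modifier split. As a loose sketch (all names and sizes are assumptions, not that paper's code), one shared sEMG feature vector can feed two independent classification heads, shrinking the label space from 4 x 3 combined actions to 4 + 3 component labels:

```python
import numpy as np

rng = np.random.default_rng(2)

FEAT_DIM = 64
N_DIR, N_MOD = 4, 3  # the two biomechanically independent components

# Two independent linear heads over one shared feature vector (illustrative).
W_dir = rng.standard_normal((N_DIR, FEAT_DIM))
W_mod = rng.standard_normal((N_MOD, FEAT_DIM))

def predict(features: np.ndarray) -> tuple[int, int]:
    """Multi-label prediction: one label per component rather than one label
    per combined action (4 + 3 outputs instead of 4 * 3)."""
    return int(np.argmax(W_dir @ features)), int(np.argmax(W_mod @ features))

direction, modifier = predict(rng.standard_normal(FEAT_DIM))
print(f"direction class {direction}, modifier class {modifier}")
```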
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM).
ADA-CM has two operating modes; the first can be tuned without learning new parameters, in a training-free paradigm.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information to address the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
- Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis [6.517935794312337]
We present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences.
We tackle the peculiarities of the eye-in-hand setting by means of a model of human arm trajectories.
arXiv Detail & Related papers (2022-03-18T09:16:48Z)
- Partner-Assisted Learning for Few-Shot Image Classification [54.66864961784989]
Few-shot Learning has been studied to mimic human visual capabilities and learn effective models without the need for exhaustive human annotation.
In this paper, we focus on the design of a training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.
We propose a two-stage training scheme: first, a partner encoder is trained to model pair-wise similarities and extract features that serve as soft-anchors; then a main encoder is trained by aligning its outputs with these soft-anchors while maximizing classification performance (sketched after this entry).
arXiv Detail & Related papers (2021-09-15T22:46:19Z)
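A hedged sketch of the second stage summarized above (the loss form and the 0.1 weight are guesses, not that paper's recipe): train the main encoder to classify well while pulling its features toward the frozen partner's soft-anchors:

```python
import numpy as np

rng = np.random.default_rng(3)
FEAT_DIM, N_CLASSES, BATCH = 32, 5, 8

# Stand-ins for one training batch (real code would compute these per step).
soft_anchors = rng.standard_normal((BATCH, FEAT_DIM))  # frozen partner features
main_feats = rng.standard_normal((BATCH, FEAT_DIM))    # main encoder outputs
logits = main_feats @ rng.standard_normal((FEAT_DIM, N_CLASSES))
labels = rng.integers(0, N_CLASSES, size=BATCH)

def cross_entropy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Numerically stable mean cross-entropy over the batch."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Classification term + alignment term pulling features toward soft-anchors.
align = float(((main_feats - soft_anchors) ** 2).mean())
loss = cross_entropy(logits, labels) + 0.1 * align  # weight 0.1 is assumed
print(f"stage-2 loss: {loss:.3f}")
```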
- Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera [79.41374930171469]
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands.
Our approach combines an extensive list of favorable properties; notably, it is marker-less.
We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work.
arXiv Detail & Related papers (2021-06-15T11:39:49Z)
- Function Contrastive Learning of Transferable Meta-Representations [38.31692245188669]
We study the implications of joint training on the transferability of the meta-representations.
We propose a decoupled encoder-decoder approach to supervised meta-learning.
arXiv Detail & Related papers (2020-10-14T13:50:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.