Multimodal Fusion of EMG and Vision for Human Grasp Intent Inference in
Prosthetic Hand Control
- URL: http://arxiv.org/abs/2104.03893v5
- Date: Tue, 27 Feb 2024 22:49:26 GMT
- Title: Multimodal Fusion of EMG and Vision for Human Grasp Intent Inference in
Prosthetic Hand Control
- Authors: Mehrshad Zandigohar, Mo Han, Mohammadreza Sharif, Sezen Yagmur Gunay,
Mariusz P. Furmanek, Mathew Yarossi, Paolo Bonato, Cagdas Onal, Taskin Padir,
Deniz Erdogmus, Gunar Schirner
- Abstract summary: We present a Bayesian evidence fusion framework for grasp intent inference using eye-view video, eye-gaze, and EMG from the forearm.
We analyze individual and fused performance as a function of time as the hand approaches the object to grasp it.
- Score: 11.400385533782204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Objective: For transradial amputees, robotic prosthetic hands promise to
regain the capability to perform daily living activities. Current control
methods based on physiological signals such as electromyography (EMG) are prone
to poor inference outcomes due to motion artifacts, muscle fatigue, and other
confounds. Vision sensors are a major source of information about the
environment state and can play a vital role in inferring feasible and intended
gestures. However, visual evidence is also susceptible to its own artifacts,
most often due to object occlusion and lighting changes. Multimodal evidence
fusion using physiological and vision sensor measurements is a natural approach
due to the complementary strengths of these modalities. Methods: In this paper,
we present a Bayesian evidence fusion framework for grasp intent inference
using eye-view video, eye-gaze, and EMG from the forearm processed by neural
network models. We analyze individual and fused performance as a function of
time as the hand approaches the object to grasp it. For this purpose, we have
also developed novel data processing and augmentation techniques to train
neural network components. Results: Our results indicate that, on average,
fusion improves the instantaneous upcoming grasp type classification accuracy
during the reaching phase by 13.66% and 14.8%, relative to EMG (81.64%
non-fused) and visual evidence (80.5% non-fused) individually, resulting in an
overall fusion accuracy of 95.3%. Conclusion: Our experimental data analyses
demonstrate that EMG and visual evidence show complementary strengths, and as a
consequence, fusion of multimodal evidence can outperform each individual
evidence modality at any given time.
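The fusion rule itself is compact: under a conditional-independence assumption, per-modality classifier posteriors over grasp types can be combined multiplicatively and renormalized at each time step. The sketch below is a minimal illustration of this style of Bayesian evidence fusion, not the authors' implementation; the grasp vocabulary, uniform prior, and log-domain arithmetic are assumptions added for the example.

```python
import numpy as np

# Hypothetical grasp vocabulary; the paper's actual class set may differ.
GRASPS = ["power", "precision", "tripod", "lateral", "open"]

def fuse_posteriors(p_emg, p_vision, prior=None, eps=1e-12):
    """Bayesian fusion of two per-modality posteriors over grasp types.

    Assuming EMG and vision evidence are conditionally independent
    given the grasp class g:
        P(g | emg, vis) ∝ P(g | emg) * P(g | vis) / P(g)
    Computed in the log domain for numerical stability.
    """
    p_emg = np.asarray(p_emg, dtype=float)
    p_vision = np.asarray(p_vision, dtype=float)
    if prior is None:
        prior = np.full_like(p_emg, 1.0 / len(p_emg))  # uniform prior
    log_fused = np.log(p_emg + eps) + np.log(p_vision + eps) - np.log(prior + eps)
    log_fused -= log_fused.max()        # subtract max before exp to avoid overflow
    fused = np.exp(log_fused)
    return fused / fused.sum()          # renormalize to a proper distribution

# Example: EMG is ambiguous between two grasps; vision favors one of them.
p_emg = [0.40, 0.35, 0.10, 0.10, 0.05]
p_vis = [0.15, 0.55, 0.10, 0.10, 0.10]
fused = fuse_posteriors(p_emg, p_vis)
print(dict(zip(GRASPS, fused.round(3))))  # fusion sharpens toward "precision"
```

Multiplicative fusion rewards agreement between the modalities, which is one way complementary evidence can outperform either source alone when one modality is temporarily degraded.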
Related papers
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as important emotional cues and build a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z) - Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and across sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z) - Active inference and deep generative modeling for cognitive ultrasound [20.383444113659476]
We show that US imaging systems can be recast as information-seeking agents that engage in reciprocal interactions with their anatomical environment.
Such agents autonomously adapt their transmit-receive sequences to fully personalize imaging and actively maximize information gain in-situ.
We then equip systems with a mechanism to actively reduce uncertainty and maximize diagnostic value across a sequence of experiments.
arXiv Detail & Related papers (2024-10-17T08:09:14Z) - Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation
of rPPG [2.82697733014759]
rPPG (remote photoplethysmography) is a technology that measures and analyzes BVP (Blood Volume Pulse) using the light absorption characteristics of hemoglobin captured through a camera.
This study provides a framework to evaluate various rPPG techniques across a wide range of datasets for fair evaluation and comparison.
arXiv Detail & Related papers (2023-07-24T09:35:47Z) - A Deep Learning Approach for the Segmentation of Electroencephalography
Data in Eye Tracking Applications [56.458448869572294]
We introduce DETRtime, a novel framework for time-series segmentation of EEG data.
Our end-to-end deep learning-based framework brings advances in Computer Vision to the forefront.
Our model generalizes well in the task of EEG sleep stage segmentation.
arXiv Detail & Related papers (2022-06-17T10:17:24Z) - Cross-Modality Neuroimage Synthesis: A Survey [71.27193056354741]
Multi-modality imaging improves disease diagnosis and reveals distinct deviations in tissues with anatomical properties.
The existence of completely aligned and paired multi-modality neuroimaging data has proved its effectiveness in brain research.
Because collecting fully aligned and paired data is expensive or even impractical, an alternative solution is to explore unsupervised or weakly supervised learning methods to synthesize the absent neuroimaging data.
arXiv Detail & Related papers (2022-02-14T19:29:08Z) - Continuous Decoding of Daily-Life Hand Movements from Forearm Muscle
Activity for Enhanced Myoelectric Control of Hand Prostheses [78.120734120667]
We introduce a novel method, based on a long short-term memory (LSTM) network, to continuously map forearm EMG activity onto hand kinematics.
Ours is the first reported work on the prediction of hand kinematics that uses this challenging dataset.
Our results suggest that the presented method is suitable for the generation of control signals for the independent and proportional actuation of the multiple DOFs of state-of-the-art hand prostheses.
arXiv Detail & Related papers (2021-04-29T00:11:32Z) - From Hand-Perspective Visual Information to Grasp Type Probabilities:
Deep Learning via Ranking Labels [6.772076545800592]
We build a novel probabilistic classifier according to the Plackett-Luce model to predict the probability distribution over grasps (a minimal sketch of this likelihood appears after this list).
We show that the proposed model is applicable to the most popular and productive convolutional neural network frameworks.
arXiv Detail & Related papers (2021-03-08T16:12:38Z) - HANDS: A Multimodal Dataset for Modeling Towards Human Grasp Intent
Inference in Prosthetic Hands [3.7886097009023376]
Advanced prosthetic hands of the future are anticipated to benefit from improved shared control between a robotic hand and its human user.
Multimodal sensor data may include environment sensors such as vision, as well as human physiology and behavior sensors.
A fusion methodology for environmental state and human intent estimation can combine these sources of evidence in order to help prosthetic hand motion planning and control.
arXiv Detail & Related papers (2021-03-08T15:51:03Z) - Towards Creating a Deployable Grasp Type Probability Estimator for a
Prosthetic Hand [11.008123712007402]
InceptionV3 achieves the highest accuracy with 0.95 angular similarity, followed by MobileNetV2 (1.4) with 0.93 at 20% of the operations.
Our work enables augmenting EMG intent inference with physical state probability through machine learning and computer vision methods.
arXiv Detail & Related papers (2021-01-13T21:39:41Z) - Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.