From Hand-Perspective Visual Information to Grasp Type Probabilities:
Deep Learning via Ranking Labels
- URL: http://arxiv.org/abs/2103.04863v1
- Date: Mon, 8 Mar 2021 16:12:38 GMT
- Title: From Hand-Perspective Visual Information to Grasp Type Probabilities:
Deep Learning via Ranking Labels
- Authors: Mo Han, Sezen Yağmur Günay, İlkay Yıldız, Paolo
Bonato, Cagdas D. Onal, Taşkın Padır, Gunar Schirner, Deniz
Erdoğmuş
- Abstract summary: We build a novel probabilistic classifier based on the Plackett-Luce model to predict the probability distribution over grasps.
We show that the proposed model is applicable to the most popular and productive convolutional neural network frameworks.
- Score: 6.772076545800592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Limb deficiency severely affects the daily lives of amputees and drives
efforts to provide functional robotic prosthetic hands to compensate for this
deprivation. Convolutional neural network-based computer vision control of the
prosthetic hand has received increased attention as a method to replace or
complement physiological signals, because a network trained on visual
information can reliably predict the hand gesture. Mounting a camera in the palm of a
prosthetic hand has proven to be a promising approach to collecting visual data.
However, the grasp types labelled from the eye and hand perspectives may differ
because object shapes are not always symmetric. Thus, to represent this difference
in a realistic way, we employed a dataset containing synchronous images from
the eye and hand views, where the hand-perspective images are used for training
while the eye-view images are used only for manual labelling. Electromyogram (EMG)
activity and movement kinematics data from the upper arm are also collected for
multi-modal information fusion in future work. Moreover, in order to include
human-in-the-loop control and combine computer vision with physiological
signal inputs, instead of making absolute positive or negative predictions, we
build a novel probabilistic classifier based on the Plackett-Luce model. To
predict the probability distribution over grasps, we exploit this statistical
model over label rankings to solve the permutation-domain problem via
maximum likelihood estimation, using manually ranked lists of grasps as
a new form of label. We show that the proposed model is applicable to the
most popular and productive convolutional neural network frameworks.
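As a rough illustration of the ranking-label idea in the abstract, the sketch below computes the standard Plackett-Luce negative log-likelihood of a ranked list of grasps given one score per grasp, and turns the same scores into a probability distribution over grasps. It is a minimal sketch and not the authors' implementation: the abstract does not specify the network, the loss details, or the grasp set, so the function names, the grasp names, and the assumption that the scores are unnormalized CNN outputs are all hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of `ranking` under the Plackett-Luce model.

    scores  -- one real-valued score per grasp (assumed here to be the
               unnormalized outputs of a CNN; this is an assumption)
    ranking -- grasp indices ordered from most to least preferred
    """
    nll = 0.0
    for k in range(len(ranking)):
        # At stage k, the k-th ranked grasp is chosen among those not yet ranked.
        remaining = [scores[i] for i in ranking[k:]]
        nll -= scores[ranking[k]] - log_sum_exp(remaining)
    return nll


def grasp_probabilities(scores):
    """Probability of each grasp being ranked first (a softmax over scores)."""
    lse = log_sum_exp(scores)
    return [math.exp(s - lse) for s in scores]


# Hypothetical example: three grasp types and one manually ranked label.
grasps = ["power", "pinch", "tripod"]   # illustrative names only
scores = [2.0, 1.0, 0.0]                # stand-in for network outputs
ranked_label = [0, 1, 2]                # annotator prefers power, then pinch, then tripod

loss = plackett_luce_nll(scores, ranked_label)
probs = dict(zip(grasps, grasp_probabilities(scores)))
```

Minimizing this loss over the network parameters is one way to carry out maximum likelihood estimation from ranked labels. The resulting probabilities, rather than a single hard decision, are what could later be fused with physiological signals such as EMG.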
Related papers
- Predicting Stroke through Retinal Graphs and Multimodal Self-supervised Learning [0.46835339362676565]
Early identification of stroke is crucial for intervention, requiring reliable models.
We proposed an efficient retinal image representation together with clinical information to capture a comprehensive overview of cardiovascular health.
arXiv Detail & Related papers (2024-11-08T14:40:56Z) - Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z) - Transductive Linear Probing: A Novel Framework for Few-Shot Node
Classification [56.17097897754628]
We show that transductive linear probing with self-supervised graph contrastive pretraining can outperform the state-of-the-art fully supervised meta-learning based methods under the same protocol.
We hope this work can shed new light on few-shot node classification problems and foster future research on learning from scarcely labeled instances on graphs.
arXiv Detail & Related papers (2022-12-11T21:10:34Z) - Graph Neural Networks with Trainable Adjacency Matrices for Fault
Diagnosis on Multivariate Sensor Data [69.25738064847175]
It is necessary to consider the behavior of the signals in each sensor separately and to take into account their correlation and hidden relationships with each other.
The graph nodes can be represented as data from the different sensors, and the edges can display the influence of these data on each other.
It is proposed to construct the graph during the training of the graph neural network. This allows models to be trained on data where the dependencies between the sensors are not known in advance.
arXiv Detail & Related papers (2022-10-20T11:03:21Z) - Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared
Control on the Hannes Prosthesis [6.517935794312337]
We present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences.
We tackle the peculiarity of the eye-in-hand setting by means of a model for the human arm trajectories.
arXiv Detail & Related papers (2022-03-18T09:16:48Z) - Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation.
We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z) - HANDS: A Multimodal Dataset for Modeling Towards Human Grasp Intent
Inference in Prosthetic Hands [3.7886097009023376]
Advanced prosthetic hands of the future are anticipated to benefit from improved shared control between a robotic hand and its human user.
Multimodal sensor data may include various environment sensors including vision, as well as human physiology and behavior sensors.
A fusion methodology for environmental state and human intent estimation can combine these sources of evidence in order to help prosthetic hand motion planning and control.
arXiv Detail & Related papers (2021-03-08T15:51:03Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Towards Creating a Deployable Grasp Type Probability Estimator for a
Prosthetic Hand [11.008123712007402]
InceptionV3 achieves highest accuracy with 0.95 angular similarity followed by 1.4 MobileNetV2 with 0.93 at 20% the amount of operations.
Our work enables augmenting EMG intent inference with physical state probability through machine learning and computer vision methods.
arXiv Detail & Related papers (2021-01-13T21:39:41Z) - Proactive Pseudo-Intervention: Causally Informed Contrastive Learning
For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.