From Hand-Perspective Visual Information to Grasp Type Probabilities:
Deep Learning via Ranking Labels
- URL: http://arxiv.org/abs/2103.04863v1
- Date: Mon, 8 Mar 2021 16:12:38 GMT
- Title: From Hand-Perspective Visual Information to Grasp Type Probabilities:
Deep Learning via Ranking Labels
- Authors: Mo Han, Sezen Yağmur Günay, İlkay Yıldız, Paolo Bonato, Cagdas D.
Onal, Taşkın Padır, Gunar Schirner, Deniz Erdoğmuş
- Abstract summary: We build a novel probabilistic classifier according to the Plackett-Luce model to predict the probability distribution over grasps.
We show that the proposed model is applicable to the most popular and effective convolutional neural network frameworks.
- Score: 6.772076545800592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Limb deficiency severely affects the daily lives of amputees and
drives efforts to provide functional robotic prosthetic hands that compensate
for this loss. Convolutional neural network-based computer vision control of
prosthetic hands has received increasing attention as a reliable method to
replace or complement physiological signals, since models trained on visual
information can predict the intended hand gesture. Mounting a camera in the
palm of a prosthetic hand has proved to be a promising approach to collecting
visual data. However, the grasp type labelled from the eye perspective may
differ from that labelled from the hand perspective, as object shapes are not
always symmetric. To represent this difference realistically, we employ a
dataset containing synchronous images from the eye and hand views, where the
hand-perspective images are used for training while the eye-view images are
used only for manual labelling. Electromyogram (EMG) activity and movement
kinematics from the upper arm are also collected for multi-modal information
fusion in future work. Moreover, to enable human-in-the-loop control and to
combine computer vision with physiological signal inputs, we build a novel
probabilistic classifier based on the Plackett-Luce model instead of making
absolute positive or negative predictions. To predict the probability
distribution over grasps, we exploit this statistical model over label
rankings, solving the permutation-domain problem via maximum likelihood
estimation and using manually ranked lists of grasps as a new form of label.
We show that the proposed model is applicable to the most popular and
effective convolutional neural network frameworks.
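As a reading aid, here is a minimal sketch of the Plackett-Luce negative
log-likelihood described above, assuming a PyTorch CNN head that outputs one
real-valued score per grasp type; the tensor shapes and names are illustrative
assumptions, not the authors' released code.

    import torch

    def plackett_luce_nll(scores: torch.Tensor, ranking: torch.Tensor) -> torch.Tensor:
        # scores  : (batch, K) real-valued grasp scores from a CNN head (assumed shape).
        # ranking : (batch, K) long tensor of grasp indices ordered from most to
        #           least suitable, i.e. the manually ranked label lists.
        ordered = torch.gather(scores, 1, ranking)  # score of the i-th ranked grasp
        # Plackett-Luce: log P(ranking) = sum_i [s_(i) - logsumexp(s_(i), ..., s_(K))].
        # Flip so a cumulative logsumexp runs over each "remaining candidates" suffix.
        rev = torch.flip(ordered, dims=[1])
        log_probs = rev - torch.logcumsumexp(rev, dim=1)
        return -log_probs.sum(dim=1).mean()

Under this model, a softmax over the same scores gives the probability that
each grasp is ranked first, i.e. the predicted probability distribution over
grasps that can later be fused with EMG-based intent estimates.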
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Transductive Linear Probing: A Novel Framework for Few-Shot Node Classification [56.17097897754628]
We show that transductive linear probing with self-supervised graph contrastive pretraining can outperform the state-of-the-art fully supervised meta-learning based methods under the same protocol.
We hope this work can shed new light on few-shot node classification problems and foster future research on learning from scarcely labeled instances on graphs.
arXiv Detail & Related papers (2022-12-11T21:10:34Z)
- Graph Neural Networks with Trainable Adjacency Matrices for Fault Diagnosis on Multivariate Sensor Data [69.25738064847175]
It is necessary to consider the behavior of the signals from each sensor separately, while also taking into account their correlations and hidden relationships with each other.
The graph nodes can be represented as data from the different sensors, and the edges can display the influence of these data on each other.
The authors propose constructing the graph during training of the graph neural network, which allows models to be trained on data where the dependencies between sensors are not known in advance.
arXiv Detail & Related papers (2022-10-20T11:03:21Z)
- Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis [6.517935794312337]
We present an eye-in-hand learning-based approach for hand pre-shape classification from RGB sequences.
We tackle the peculiarity of the eye-in-hand setting by means of a model of human arm trajectories.
arXiv Detail & Related papers (2022-03-18T09:16:48Z)
- Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation.
We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z)
- HANDS: A Multimodal Dataset for Modeling Towards Human Grasp Intent Inference in Prosthetic Hands [3.7886097009023376]
Advanced prosthetic hands of the future are anticipated to benefit from improved shared control between a robotic hand and its human user.
Multimodal sensor data may include various environment sensors, including vision, as well as human physiology and behavior sensors.
A fusion methodology for environmental state and human intent estimation can combine these sources of evidence in order to help prosthetic hand motion planning and control.
arXiv Detail & Related papers (2021-03-08T15:51:03Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Towards Creating a Deployable Grasp Type Probability Estimator for a Prosthetic Hand [11.008123712007402]
InceptionV3 achieves the highest accuracy, with 0.95 angular similarity, followed by 1.4 MobileNetV2 with 0.93 at 20% of the operations.
Our work enables augmenting EMG intent inference with physical state probabilities through machine learning and computer vision methods.
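For context, a minimal sketch of the angular similarity metric quoted above,
under the common definition 1 - arccos(cosine similarity)/pi; the exact
scaling used in that paper is an assumption here.

    import numpy as np

    def angular_similarity(p: np.ndarray, q: np.ndarray) -> float:
        # p, q: predicted and ground-truth grasp probability vectors.
        # Assumed definition: 1 - arccos(cosine similarity) / pi, so
        # identical distributions score 1.0.
        cos = np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))
        return 1.0 - np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi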
arXiv Detail & Related papers (2021-01-13T21:39:41Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene on, and show that it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
- Graph Neural Networks for Unsupervised Domain Adaptation of Histopathological Image Analytics [22.04114134677181]
We present a novel method for unsupervised domain adaptation for histological image analysis.
It is based on a backbone for embedding images into a feature space and a graph neural layer for propagating the supervision signals of images with labels.
In experiments, our method achieves state-of-the-art performance on four public datasets.
arXiv Detail & Related papers (2020-08-21T04:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.