Long-Distance Gesture Recognition using Dynamic Neural Networks
- URL: http://arxiv.org/abs/2308.04643v1
- Date: Wed, 9 Aug 2023 00:56:38 GMT
- Title: Long-Distance Gesture Recognition using Dynamic Neural Networks
- Authors: Shubhang Bhatnagar, Sharath Gopal, Narendra Ahuja, Liu Ren
- Abstract summary: We propose a novel, accurate and efficient method for the recognition of gestures from longer distances.
It uses a dynamic neural network to select features from gesture-containing spatial regions of the input sensor data for further processing.
We demonstrate the performance of our method on the LD-ConGR long-distance dataset.
- Score: 14.106548659369716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gestures form an important medium of communication between humans and
machines. An overwhelming majority of existing gesture recognition methods are
tailored to a scenario where humans and machines are located very close to each
other. This short-distance assumption does not hold true for several types of
interactions, for example gesture-based interactions with a floor cleaning
robot or with a drone. Methods made for short-distance recognition are unable
to perform well on long-distance recognition due to gestures occupying only a
small portion of the input data. Their performance is especially poor in
resource-constrained settings, where they are unable to effectively focus
their limited compute on the gesturing subject. We propose a novel, accurate
and efficient method for the recognition of gestures from longer distances. It
uses a dynamic neural network to select features from gesture-containing
spatial regions of the input sensor data for further processing. This helps the
network focus on features important for gesture recognition while discarding
background features early on, thus making it more compute efficient compared to
other techniques. We demonstrate the performance of our method on the LD-ConGR
long-distance dataset where it outperforms previous state-of-the-art methods on
recognition accuracy and compute efficiency.
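The abstract describes, but does not include code for, the core mechanism: a lightweight localizer scores spatial regions of the input, and only features from the gesture-containing region are passed to the heavier recognition head. The PyTorch sketch below is a minimal illustration of that idea; all module names, shapes, and the hard top-1 crop are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of dynamic spatial feature selection (illustrative only;
# module names, shapes, and the gating scheme are assumptions, not the
# authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGestureNet(nn.Module):
    def __init__(self, num_classes: int = 10, grid: int = 7):
        super().__init__()
        self.grid = grid
        # Cheap stem run on the full frame.
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Lightweight localizer: scores how likely each grid cell is to
        # contain the gesturing subject.
        self.localizer = nn.Conv2d(64, 1, 1)
        # Heavier head applied only to the selected region's features.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, num_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.stem(x)                        # (B, 64, H, W)
        scores = self.localizer(feats)              # (B, 1, H, W)
        pooled = F.adaptive_max_pool2d(scores, self.grid)
        best = pooled.flatten(1).argmax(dim=1)      # best grid cell per sample
        # Crop the feature map around the selected cell; background
        # features are discarded before the expensive head runs.
        h, w = feats.shape[-2:]
        ch, cw = h // self.grid, w // self.grid
        crops = []
        for i in range(x.size(0)):
            r, c = divmod(best[i].item(), self.grid)
            crops.append(feats[i:i+1, :, r*ch:(r+1)*ch, c*cw:(c+1)*cw])
        return self.head(torch.cat(crops, dim=0))

logits = DynamicGestureNet()(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 10])
```

A real system would need a soft or differentiable selection (e.g., Gumbel-softmax) to train end to end; the hard argmax here is for simplicity only.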
Related papers
- Deep Homography Estimation for Visual Place Recognition [49.235432979736395]
We propose a transformer-based deep homography estimation (DHE) network.
It takes the dense feature map extracted by a backbone network as input and fits a homography for fast and learnable geometric verification.
Experiments on benchmark datasets show that our method can outperform several state-of-the-art methods.
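For context, the learnable verification that DHE performs replaces the classical RANSAC-based scheme sketched below (standard OpenCV usage, shown for contrast; this is not the DHE method, and the inlier threshold is an illustrative assumption).

```python
# Conventional RANSAC-based geometric verification for place recognition;
# the DHE network is proposed as a faster, learnable alternative to this.
import numpy as np
import cv2

def verify_match(pts_query: np.ndarray, pts_ref: np.ndarray,
                 min_inliers: int = 15) -> bool:
    """Accept a candidate place match if enough correspondences are
    consistent with a single homography. Points are (N, 2) float arrays."""
    H, mask = cv2.findHomography(pts_query, pts_ref, cv2.RANSAC,
                                 ransacReprojThreshold=3.0)
    return H is not None and int(mask.sum()) >= min_inliers
```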
arXiv Detail & Related papers (2024-02-25T13:22:17Z)
- EventTransAct: A video transformer-based framework for Event-camera based action recognition [52.537021302246664]
Event cameras offer new opportunities for action recognition compared to standard RGB videos.
In this study, we employ a computationally efficient model, namely the video transformer network (VTN), which initially acquires spatial embeddings per event-frame.
In order to better adapt the VTN to the sparse and fine-grained nature of event data, we design an Event-Contrastive Loss ($\mathcal{L}_{EC}$) and event-specific augmentations.
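The summary does not give the form of $\mathcal{L}_{EC}$; contrastive objectives of this kind commonly take an InfoNCE-style form over per-frame embeddings, shown below as a plausible shape rather than the paper's exact definition.

```latex
% Generic InfoNCE-style contrastive loss over N embeddings z_i with
% positives z_i^+ and temperature tau; the paper's exact L_EC may differ.
\mathcal{L}_{EC} = -\frac{1}{N} \sum_{i=1}^{N}
  \log \frac{\exp\left(\mathrm{sim}(z_i, z_i^{+}) / \tau\right)}
            {\sum_{j=1}^{N} \exp\left(\mathrm{sim}(z_i, z_j) / \tau\right)}
```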
arXiv Detail & Related papers (2023-08-25T23:51:07Z)
- HODN: Disentangling Human-Object Feature for HOI Detection [51.48164941412871]
We propose a Human and Object Disentangling Network (HODN) to model the Human-Object Interaction (HOI) relationships explicitly.
Considering that human features contribute more to interaction, we propose a Human-Guide Linking method to make sure the interaction decoder focuses on the human-centric regions.
Our proposed method achieves competitive performance on both the V-COCO and HICO-Det datasets.
arXiv Detail & Related papers (2023-08-20T04:12:50Z)
- Agile gesture recognition for capacitive sensing devices: adapting on-the-job [55.40855017016652]
We demonstrate a hand gesture recognition system that uses signals from capacitive sensors embedded into the etee hand controller.
The controller generates real-time signals from each of the wearer's five fingers.
We use machine learning techniques to analyse the time-series signals and identify three features that can represent five fingers within 500 ms.
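Which three features the system uses is not stated in this summary; a generic sketch of 500 ms windowed feature extraction over five-channel capacitive signals might look like the following, with the sampling rate and feature choices as placeholder assumptions.

```python
# Illustrative windowed feature extraction for five-channel capacitive
# signals; the three features here are placeholders, not the paper's.
import numpy as np

def window_features(signal: np.ndarray, fs: int = 100,
                    window_ms: int = 500) -> np.ndarray:
    """signal: (n_samples, 5) array, one column per finger.
    Returns (n_windows, 5, 3): mean, std, peak-to-peak per channel."""
    step = fs * window_ms // 1000                      # samples per window
    windows = [signal[i:i + step]
               for i in range(0, len(signal) - step + 1, step)]
    feats = [np.stack([w.mean(axis=0), w.std(axis=0),
                       np.ptp(w, axis=0)], axis=-1) for w in windows]
    return np.stack(feats)

x = np.random.randn(1000, 5)     # 10 s of synthetic data at 100 Hz
print(window_features(x).shape)  # (20, 5, 3)
```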
arXiv Detail & Related papers (2023-05-12T17:24:02Z)
- Online Recognition of Incomplete Gesture Data to Interface Collaborative Robots [0.0]
This paper introduces an HRI framework to classify large vocabularies of interwoven static gestures (SGs) and dynamic gestures (DGs) captured with wearable sensors.
The recognized gestures are used to teleoperate a robot in a collaborative process that consists of preparing a breakfast meal.
arXiv Detail & Related papers (2023-04-13T18:49:08Z)
- Towards Domain-Independent and Real-Time Gesture Recognition Using mmWave Signal [11.76969975145963]
DI-Gesture is a domain-independent and real-time mmWave gesture recognition system.
In real-time scenarios, the accuracy of DI-Gesture exceeds 97% with an average inference time of 2.87 ms.
arXiv Detail & Related papers (2021-11-11T13:28:28Z)
- Skeleton-Based Mutually Assisted Interacted Object Localization and Human Action Recognition [111.87412719773889]
We propose a joint learning framework for "interacted object localization" and "human action recognition" based on skeleton data.
Our method achieves the best or competitive performance with the state-of-the-art methods for human action recognition.
arXiv Detail & Related papers (2021-10-28T10:09:34Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations across modalities for gesture recognition.
Results show that our approach recovers performance with large gains, up to 12.91% in accuracy and 20.16% in F1 score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Gesture Recognition from Skeleton Data for Intuitive Human-Machine Interaction [0.6875312133832077]
We propose an approach for segmentation and classification of dynamic gestures based on a set of handcrafted features.
The method for gesture recognition applies a sliding window, which extracts information from both the spatial and temporal dimensions.
At the end, the recognized gestures are used to interact with a collaborative robot.
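As a rough illustration of the sliding-window idea (window length, stride, feature choices, and the classifier are all assumptions, not the paper's handcrafted features):

```python
# Generic sliding-window gesture classification over a skeleton stream;
# window length, stride, and the classifier are illustrative assumptions.
import numpy as np

def classify_stream(frames: np.ndarray, clf, win: int = 30, stride: int = 5):
    """frames: (T, n_joints * 3) skeleton stream; clf: any fitted classifier
    with a predict() method. Yields (start_frame, label) pairs."""
    for t in range(0, len(frames) - win + 1, stride):
        w = frames[t:t + win]
        spatial = w.mean(axis=0)                    # average pose (spatial cue)
        temporal = np.diff(w, axis=0).std(axis=0)   # motion stats (temporal cue)
        yield t, clf.predict(np.concatenate([spatial, temporal])[None, :])[0]
```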
arXiv Detail & Related papers (2020-08-26T11:28:50Z)
- Attention-Oriented Action Recognition for Real-Time Human-Robot Interaction [11.285529781751984]
We propose an attention-oriented multi-level network framework to meet the need for real-time interaction.
Specifically, a Pre-Attention network is employed to roughly focus on the interactor in the scene at low resolution.
A compact CNN then receives the extracted skeleton sequence as input for action recognition.
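A rough sketch of such a two-stage pipeline, with all module names and sizes assumed for illustration: a cheap pre-attention pass on a downsampled frame coarsely locates the interactor, after which a compact classifier would consume only the extracted skeleton sequence.

```python
# Rough two-stage sketch: cheap pre-attention on a low-resolution frame,
# then a compact classifier on the attended region. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreAttention(nn.Module):
    """Coarsely locates the interactor on a downsampled frame."""
    def __init__(self):
        super().__init__()
        self.score = nn.Conv2d(3, 1, kernel_size=7, padding=3)

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        small = F.interpolate(frame, size=(64, 64), mode="bilinear",
                              align_corners=False)   # cheap low-res pass
        heat = self.score(small).flatten(1)
        return heat.argmax(dim=1)                    # coarse location index

# The attended region would then be cropped at full resolution, a skeleton
# extracted from it, and a compact CNN run on the skeleton sequence.
loc = PreAttention()(torch.randn(1, 3, 480, 640))
print(loc)  # index into the 64x64 attention map
```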
arXiv Detail & Related papers (2020-07-02T12:41:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.