Comparison of Data Representations and Machine Learning Architectures
for User Identification on Arbitrary Motion Sequences
- URL: http://arxiv.org/abs/2210.00527v1
- Date: Sun, 2 Oct 2022 14:12:10 GMT
- Title: Comparison of Data Representations and Machine Learning Architectures
for User Identification on Arbitrary Motion Sequences
- Authors: Christian Schell, Andreas Hotho, Marc Erich Latoschik
- Abstract summary: This paper compares different machine learning approaches to identify users based on arbitrary sequences of head and hand movements.
We publish all our code to allow reproducibility and to provide baselines for future work.
The model correctly identifies any of the 34 subjects with an accuracy of 100% within 150 seconds.
- Score: 8.967985264567217
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Reliable and robust user identification and authentication are important and
often necessary requirements for many digital services. It becomes paramount in
social virtual reality (VR) to ensure trust, specifically in digital encounters
with lifelike realistic-looking avatars as faithful replications of real
persons. Recent research has shown great interest in providing new solutions
that verify the identity of extended reality (XR) systems. This paper compares
different machine learning approaches to identify users based on arbitrary
sequences of head and hand movements, a data stream provided by the majority of
today's XR systems. We compare three potential representations of the motion
data from heads and hands (scene-relative, body-relative, and body-relative
velocities) and the performances of five different machine learning
architectures (random forest, multilayer perceptron, fully recurrent neural
network, long short-term memory, gated recurrent unit). We use
the publicly available dataset "Talking with Hands" and publish all our code to
allow reproducibility and to provide baselines for future work. After
hyperparameter optimization, the combination of a long short-term memory
architecture and body-relative data outperformed competing combinations: the
model correctly identifies any of the 34 subjects with an accuracy of 100%
within 150 seconds. The code for models, training and evaluation is made
publicly available. Altogether, our approach provides an effective foundation
for behaviometric-based identification and authentication to guide researchers
and practitioners.
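The three representations compared in the abstract differ only in how the raw head and hand poses are expressed before being fed to a model. A minimal sketch of the position part of that preprocessing is shown below; the function names are hypothetical, and the sketch is simplified: the paper's representations presumably also include orientations, which would require rotating hand poses into the head's local frame rather than merely subtracting positions.

```python
import numpy as np

def body_relative(head_pos: np.ndarray, hand_pos: np.ndarray) -> np.ndarray:
    """Express hand positions relative to the head (used here as a stand-in
    body anchor). Both inputs have shape (T, 3): T frames of x/y/z positions
    in scene coordinates. Output keeps shape (T, 3)."""
    return hand_pos - head_pos

def velocities(rel_pos: np.ndarray, dt: float) -> np.ndarray:
    """Frame-to-frame velocities of body-relative positions.
    Input (T, 3) at a fixed sampling interval dt; output (T-1, 3)."""
    return np.diff(rel_pos, axis=0) / dt

# Example: a hand moving at constant speed while the head stays at the origin.
head = np.zeros((5, 3))
hand = np.arange(15, dtype=float).reshape(5, 3)  # advances by 3 units/frame
rel = body_relative(head, hand)                   # scene- == body-relative here
vel = velocities(rel, dt=0.5)                     # (4, 3), constant 6 units/s
```

Sequences of such per-frame feature vectors (positions or velocities) are what the compared architectures consume, with the recurrent models reading them frame by frame.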
Related papers
- SDFR: Synthetic Data for Face Recognition Competition [51.9134406629509]
Large-scale face recognition datasets are collected by crawling the Internet and without individuals' consent, raising legal, ethical, and privacy concerns.
Recently several works proposed generating synthetic face recognition datasets to mitigate concerns in web-crawled face recognition datasets.
This paper presents the summary of the Synthetic Data for Face Recognition (SDFR) Competition held in conjunction with the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2024)
The SDFR competition was split into two tasks, allowing participants to train face recognition systems using new synthetic datasets and/or existing ones.
arXiv Detail & Related papers (2024-04-06T10:30:31Z)
- Efficient Gesture Recognition on Spiking Convolutional Networks Through Sensor Fusion of Event-Based and Depth Data [1.474723404975345]
This work proposes a Spiking Convolutional Neural Network, processing event- and depth data for gesture recognition.
The network is simulated using the open-source neuromorphic computing framework LAVA for offline training and evaluation on an embedded system.
arXiv Detail & Related papers (2024-01-30T14:42:35Z)
- DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering [126.00165445599764]
We present DNA-Rendering, a large-scale, high-fidelity repository of human performance data for neural actor rendering.
Our dataset contains over 1500 human subjects, 5000 motion sequences, and 67.5M frames' data volume.
We construct a professional multi-view system to capture data, which contains 60 synchronous cameras with max 4096 x 3000 resolution, 15 fps speed, and stern camera calibration steps.
arXiv Detail & Related papers (2023-07-19T17:58:03Z)
- Dialogue-Contextualized Re-ranking for Medical History-Taking [5.039849340960835]
We present a two-stage re-ranking approach that helps close the training-inference gap by re-ranking the first-stage question candidates.
We find that relative to the expert system, the best performance is achieved by our proposed global re-ranker with a transformer backbone.
arXiv Detail & Related papers (2023-04-04T17:31:32Z)
- Versatile User Identification in Extended Reality using Pretrained Similarity-Learning [16.356961801884562]
We develop a similarity-learning model and pretrain it on the "Who Is Alyx?" dataset.
In comparison with a traditional classification-learning baseline, our model shows superior performance.
Our approach paves the way for easy integration of pretrained motion-based identification models in production XR systems.
arXiv Detail & Related papers (2023-02-15T08:26:24Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition [11.928873764689458]
We conduct a comprehensive evaluation of popular deep learning approaches for emotion recognition.
We show that long-range dependencies in the speech signal are critical for emotion recognition.
Speed/rate augmentation offers the most robust performance gain across models.
arXiv Detail & Related papers (2022-11-09T17:27:03Z)
- Facial Emotion Recognition using Deep Residual Networks in Real-World Environments [5.834678345946704]
We propose a facial feature extractor model trained on an in-the-wild and massively collected video dataset.
The dataset consists of a million labelled frames and 2,616 subjects.
As temporal information is important to the emotion recognition domain, we utilise LSTM cells to capture the temporal dynamics in the data.
arXiv Detail & Related papers (2021-11-04T10:08:22Z)
- Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos)
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
- ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies the channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z)
- Deep Learning for Person Re-identification: A Survey and Outlook [233.36948173686602]
Person re-identification (Re-ID) aims at retrieving a person of interest across multiple non-overlapping cameras.
By dissecting the involved components in developing a person Re-ID system, we categorize it into the closed-world and open-world settings.
arXiv Detail & Related papers (2020-01-13T12:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.