One-shot lip-based biometric authentication: extending behavioral features with authentication phrase information
- URL: http://arxiv.org/abs/2308.06944v1
- Date: Mon, 14 Aug 2023 05:34:36 GMT
- Title: One-shot lip-based biometric authentication: extending behavioral features with authentication phrase information
- Authors: Brando Koch, Ratko Grbić
- Abstract summary: Lip-based biometric authentication (LBBA) is an authentication method based on a person's lip movements during speech in the form of video data captured by a camera sensor.
LBBA can utilize both physical and behavioral characteristics of lip movements without requiring any additional sensory equipment apart from an RGB camera.
- Score: 3.038642416291856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lip-based biometric authentication (LBBA) is an authentication method based
on a person's lip movements during speech in the form of video data captured by
a camera sensor. LBBA can utilize both physical and behavioral characteristics
of lip movements without requiring any additional sensory equipment apart from
an RGB camera. State-of-the-art (SOTA) approaches use one-shot learning to
train deep siamese neural networks which produce an embedding vector out of
these features. Embeddings are further used to compute the similarity between
an enrolled user and a user being authenticated. A flaw of these approaches is
that they model behavioral features as style-of-speech without relation to what
is being said. This makes the system vulnerable to replay attacks using video
of the client speaking any phrase. To solve this problem, we propose a one-shot
approach which models behavioral features so that they discriminate based on
what is being said in addition to style-of-speech. We achieve this by
customizing the GRID
dataset to obtain required triplets and training a siamese neural network based
on 3D convolutions and recurrent neural network layers. A custom triplet loss
for batch-wise hard-negative mining is proposed. Under an open-set protocol,
the model achieves 3.2% FAR and 3.8% FRR on the test set of the customized
GRID dataset. Additional analysis was performed to quantify the
influence and discriminatory power of behavioral and physical features for
LBBA.
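The batch-wise hard-negative mining mentioned in the abstract can be sketched as follows. This is an illustrative NumPy reconstruction under common "batch-hard" conventions, not the authors' implementation; the embedding shapes, the margin value, and the Euclidean distance choice are assumptions.

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Batch-hard triplet loss sketch: for each anchor, take the farthest
    same-label embedding and the closest different-label embedding in the
    batch, then apply a hinge with a margin."""
    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1) + 1e-12)

    same = labels[:, None] == labels[None, :]
    # Hardest positive per anchor: farthest same-label embedding.
    hardest_pos = np.where(same, dist, 0.0).max(axis=1)
    # Hardest negative per anchor: closest different-label embedding.
    hardest_neg = np.where(same, np.inf, dist).min(axis=1)

    # Hinge-style triplet objective, averaged over the batch.
    return np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean()
```

At authentication time, the distance between the enrolled embedding and the probe embedding is compared against a threshold; sweeping that threshold trades FAR against FRR, which is how operating points such as the reported 3.2%/3.8% arise.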
Related papers
- LipSim: A Provably Robust Perceptual Similarity Metric [56.03417732498859]
We show the vulnerability of state-of-the-art perceptual similarity metrics based on an ensemble of ViT-based feature extractors to adversarial attacks.
We then propose a framework to train a robust perceptual similarity metric called LipSim with provable guarantees.
LipSim provides guarded areas around each data point and certificates for all perturbations within an $\ell$ ball.
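The guarded-area idea can be illustrated with a minimal sketch, assuming the metric is built on an L-Lipschitz feature map. Here the map is linear, whose l2 Lipschitz constant is exactly its spectral norm; `W`, `f`, and the ball radius are illustrative assumptions, not LipSim's architecture or API.

```python
import numpy as np

rng = np.random.default_rng(0)

# A linear feature map is L-Lipschitz in the l2 norm, with L equal to
# its largest singular value (spectral norm).
W = rng.normal(size=(8, 16))
L = np.linalg.svd(W, compute_uv=False)[0]

def f(x):
    return W @ x

x = rng.normal(size=16)
delta = rng.normal(size=16)
delta *= 0.1 / np.linalg.norm(delta)   # perturbation inside an l2 ball of radius 0.1

# Lipschitz guarantee: the output cannot move by more than L * ||delta||.
assert np.linalg.norm(f(x + delta) - f(x)) <= L * np.linalg.norm(delta) + 1e-9
```

Because the bound holds uniformly over the ball, a sufficiently large decision margin at a data point certifies the decision for every perturbation inside that ball.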
arXiv Detail & Related papers (2023-10-27T16:59:51Z)
- Free-text Keystroke Authentication using Transformers: A Comparative Study of Architectures and Loss Functions [1.0152838128195467]
Keystroke biometrics is a promising approach for user identification and verification, leveraging the unique patterns in individuals' typing behavior.
We propose a Transformer-based network that employs self-attention to extract informative features from keystroke sequences.
Our model surpasses the previous state-of-the-art in free-text keystroke authentication.
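A minimal sketch of the self-attention operation such a Transformer applies to a keystroke feature sequence; the single-head formulation and the names `Wq`, `Wk`, `Wv` are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence x
    of shape (seq_len, d_model). Each output position is a weighted mix
    of all value vectors, with weights from query-key similarity."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ v
```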
arXiv Detail & Related papers (2023-10-18T00:34:26Z)
- Privacy Preserving Machine Learning for Behavioral Authentication Systems [0.0]
A behavioral authentication (BA) system uses the behavioral characteristics of users to verify their identity claims.
Similar to other neural network (NN) architectures, the NN classifier of the BA system is vulnerable to privacy attacks.
We introduce an ML-based privacy attack, and our proposed system is robust against this and other privacy and security attacks.
arXiv Detail & Related papers (2023-08-31T19:15:26Z)
- An Approach for Improving Automatic Mouth Emotion Recognition [1.5293427903448025]
The study proposes and tests a technique for automated emotion recognition through mouth detection via Convolutional Neural Networks (CNNs).
The technique is meant to support people with health disorders that impair communication skills.
arXiv Detail & Related papers (2022-12-12T16:17:21Z)
- Privacy-Preserved Neural Graph Similarity Learning [99.78599103903777]
We propose a novel Privacy-Preserving neural Graph Matching network model, named PPGM, for graph similarity learning.
To prevent reconstruction attacks, the proposed model does not communicate node-level representations between devices.
To alleviate the attacks to graph properties, the obfuscated features that contain information from both vectors are communicated.
arXiv Detail & Related papers (2022-10-21T04:38:25Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open-source for future comparison studies.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
- Exploring Deep Learning for Joint Audio-Visual Lip Biometrics [54.32039064193566]
Audio-visual (AV) lip biometrics is a promising authentication technique that leverages the benefits of both the audio and visual modalities in speech communication.
The lack of a sizeable AV database hinders the exploration of deep-learning-based audio-visual lip biometrics.
We establish the DeepLip AV lip biometrics system realized with a convolutional neural network (CNN) based video module, a time-delay neural network (TDNN) based audio module, and a multimodal fusion module.
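The summary does not specify how DeepLip's fusion module combines the two modalities, so the sketch below assumes a common late-fusion design: concatenate the normalized per-modality embeddings and project them into a joint space. `W_fuse` is a hypothetical stand-in, not DeepLip's actual module.

```python
import numpy as np

def fuse(video_emb, audio_emb, W_fuse):
    """Late-fusion sketch: l2-normalize each modality embedding,
    concatenate, and linearly project into a joint embedding space."""
    v = video_emb / np.linalg.norm(video_emb)
    a = audio_emb / np.linalg.norm(audio_emb)
    return np.concatenate([v, a]) @ W_fuse
```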
arXiv Detail & Related papers (2021-04-17T10:51:55Z)
- Activity Recognition with Moving Cameras and Few Training Examples: Applications for Detection of Autism-Related Headbanging [1.603589863010401]
Activity recognition computer vision algorithms can be used to detect the presence of autism-related behaviors.
We document the advantages and limitations of current feature representation techniques for activity recognition when applied to head banging detection.
We create a computer vision classifier for detecting head banging in home videos using a time-distributed convolutional neural network.
arXiv Detail & Related papers (2021-01-10T05:37:05Z)
- Generalized Iris Presentation Attack Detection Algorithm under Cross-Database Settings [63.90855798947425]
Presentation attacks pose major challenges to most of the biometric modalities.
We propose a generalized deep learning-based presentation attack detection network, MVANet.
It is inspired by the simplicity and success of hybrid algorithm or fusion of multiple detection networks.
arXiv Detail & Related papers (2020-10-25T22:42:27Z)
- 3D Facial Matching by Spiral Convolutional Metric Learning and a Biometric Fusion-Net of Demographic Properties [0.0]
Face recognition is a widely accepted biometric verification tool, as the face contains a lot of information about the identity of a person.
In this study, a 2-step neural-based pipeline is presented for matching 3D facial shape to multiple DNA-related properties.
Results obtained by a 10-fold cross-validation for biometric verification show that combining multiple properties leads to stronger biometric systems.
arXiv Detail & Related papers (2020-09-10T09:31:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.