Sign Stitching: A Novel Approach to Sign Language Production
- URL: http://arxiv.org/abs/2405.07663v1
- Date: Mon, 13 May 2024 11:44:57 GMT
- Title: Sign Stitching: A Novel Approach to Sign Language Production
- Authors: Harry Walsh, Ben Saunders, Richard Bowden
- Abstract summary: We propose using dictionary examples and a learnt codebook of facial expressions to create expressive sign language sequences.
By normalizing each sign into a canonical pose, cropping, and stitching we create a continuous sequence.
We leverage a SignGAN model to map the output to a photo-realistic signer and present a complete Text-to-Sign (T2S) SLP pipeline.
- Score: 35.35777909051466
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sign Language Production (SLP) is a challenging task, given the limited resources available and the inherent diversity within sign data. As a result, previous works have suffered from the problem of regression to the mean, leading to under-articulated and incomprehensible signing. In this paper, we propose using dictionary examples and a learnt codebook of facial expressions to create expressive sign language sequences. However, simply concatenating signs and adding the face creates robotic and unnatural sequences. To address this we present a 7-step approach to effectively stitch sequences together. First, by normalizing each sign into a canonical pose, cropping, and stitching we create a continuous sequence. Then, by applying filtering in the frequency domain and resampling each sign, we create cohesive natural sequences that mimic the prosody found in the original data. We leverage a SignGAN model to map the output to a photo-realistic signer and present a complete Text-to-Sign (T2S) SLP pipeline. Our evaluation demonstrates the effectiveness of the approach, showcasing state-of-the-art performance across all datasets. Finally, a user evaluation shows our approach outperforms the baseline model and is capable of producing realistic sign language sequences.
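As a rough illustration of three of the operations the abstract names (resampling each sign, stitching signs into one continuous sequence, and filtering in the frequency domain), here is a minimal sketch in plain Python. It is not the authors' 7-step pipeline: it works on a single joint coordinate rather than a full skeleton, the bridging by linear interpolation and the hard low-pass cutoff are assumptions chosen for simplicity, and the function names `resample`, `stitch`, and `low_pass` are hypothetical.

```python
import cmath
import math


def resample(seq, n):
    """Linearly resample a 1-D trajectory to n frames (changes sign duration)."""
    if n == 1:
        return [seq[0]]
    out = []
    for i in range(n):
        t = i * (len(seq) - 1) / (n - 1)   # fractional index into seq
        lo = int(math.floor(t))
        hi = min(lo + 1, len(seq) - 1)
        frac = t - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def stitch(signs, gap):
    """Concatenate trajectories, bridging each junction with `gap`
    linearly interpolated frames (an assumed transition model)."""
    out = list(signs[0])
    for nxt in signs[1:]:
        a, b = out[-1], nxt[0]
        for k in range(1, gap + 1):
            w = k / (gap + 1)
            out.append(a * (1 - w) + b * w)
        out.extend(nxt)
    return out


def low_pass(seq, keep):
    """Zero every DFT bin whose frequency index exceeds `keep`, then invert.
    Uses a naive O(n^2) DFT, which is fine for short illustrative sequences."""
    n = len(seq)
    spec = [
        sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if min(k, n - k) > keep:           # symmetric cutoff keeps output real
            spec[k] = 0
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Two made-up dictionary "signs" for one joint coordinate.
sign_a = [0.0, 0.5, 1.0]
sign_b = [3.0, 2.5, 2.0, 1.5]

sequence = stitch([resample(sign_a, 5), sign_b], gap=2)
smoothed = low_pass(sequence, keep=3)
```

In this sketch, `resample` stretches the first sign from 3 to 5 frames, `stitch` inserts two transition frames at the junction, and `low_pass` removes the high-frequency content that the abrupt junction introduces. A real system would apply the same idea per joint and per axis, and would likely use an FFT library rather than the naive transform shown here.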
Related papers
- MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production [93.32354378820648]
We propose a unified framework for continuous sign language production, easing communication between sign and non-sign language users.
A sequence diffusion model, utilizing embeddings extracted from text or speech, is crafted to generate sign predictions step by step.
Experiments on How2Sign and PHOENIX14T datasets demonstrate that our model achieves competitive performance in sign language production.
arXiv Detail & Related papers (2024-07-04T13:53:50Z) - A Data-Driven Representation for Sign Language Production [26.520016084139964]
Sign Language Production aims to automatically translate spoken language sentences into continuous sequences of sign language.
Current state-of-the-art approaches rely on scarce linguistic resources to work.
This paper introduces an innovative solution by transforming the continuous pose generation problem into a discrete sequence generation problem.
arXiv Detail & Related papers (2024-04-17T15:52:38Z) - SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign
Language Understanding [132.78015553111234]
Hand gestures play a crucial role in the expression of sign language.
Current deep-learning-based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources.
We propose the first self-supervised pre-trainable SignBERT+ framework with model-aware hand prior incorporated.
arXiv Detail & Related papers (2023-05-08T17:16:38Z) - Word separation in continuous sign language using isolated signs and
post-processing [47.436298331905775]
We propose a two-stage model for Continuous Sign Language Recognition.
In the first stage, the predictor model, which includes a combination of CNN, SVD, and LSTM, is trained with the isolated signs.
In the second stage, we apply a post-processing algorithm to the Softmax outputs obtained from the first part of the model.
arXiv Detail & Related papers (2022-04-02T18:34:33Z) - Signing at Scale: Learning to Co-Articulate Signs for Large-Scale
Photo-Realistic Sign Language Production [43.45785951443149]
Sign languages are visual languages, with vocabularies as rich as their spoken language counterparts.
Current deep-learning based Sign Language Production (SLP) models produce under-articulated skeleton pose sequences.
We tackle large-scale SLP by learning to co-articulate between dictionary signs.
We also propose SignGAN, a pose-conditioned human synthesis model that produces photo-realistic sign language videos.
arXiv Detail & Related papers (2022-03-29T08:51:38Z) - SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign
Language Recognition [94.30084702921529]
Hand gestures play a critical role in sign language.
Current deep-learning-based sign language recognition methods may suffer from insufficient interpretability.
We introduce the first self-supervised pre-trainable SignBERT with incorporated hand prior for SLR.
arXiv Detail & Related papers (2021-10-11T16:18:09Z) - COCO-LM: Correcting and Contrasting Text Sequences for Language Model
Pretraining [59.169836983883656]
COCO-LM is a new self-supervised learning framework that pretrains Language Models by COrrecting challenging errors and COntrasting text sequences.
COCO-LM employs an auxiliary language model to mask-and-predict tokens in original text sequences.
Our analyses reveal that COCO-LM's advantages come from its challenging training signals, more contextualized token representations, and regularized sequence representations.
arXiv Detail & Related papers (2021-02-16T22:24:29Z) - Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign
Language Video [43.45785951443149]
To be truly understandable by Deaf communities, an automatic Sign Language Production system must generate a photo-realistic signer.
We propose SignGAN, the first SLP model to produce photo-realistic continuous sign language videos directly from spoken language.
A pose-conditioned human synthesis model is then introduced to generate a photo-realistic sign language video from the skeletal pose sequence.
arXiv Detail & Related papers (2020-11-19T14:31:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.