Related papers: Sign Stitching: A Novel Approach to Sign Language Production

URL: http://arxiv.org/abs/2405.07663v1
Date: Mon, 13 May 2024 11:44:57 GMT
Title: Sign Stitching: A Novel Approach to Sign Language Production
Authors: Harry Walsh, Ben Saunders, Richard Bowden
Abstract summary: We propose using dictionary examples and a learnt codebook of facial expressions to create expressive sign language sequences. By normalizing each sign into a canonical pose, cropping, and stitching, we create a continuous sequence. We leverage a SignGAN model to map the output to a photo-realistic signer and present a complete Text-to-Sign (T2S) SLP pipeline.
Score: 35.35777909051466
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Sign Language Production (SLP) is a challenging task, given the limited resources available and the inherent diversity within sign data. As a result, previous works have suffered from the problem of regression to the mean, leading to under-articulated and incomprehensible signing. In this paper, we propose using dictionary examples and a learnt codebook of facial expressions to create expressive sign language sequences. However, simply concatenating signs and adding the face creates robotic and unnatural sequences. To address this, we present a 7-step approach to effectively stitch sequences together. First, by normalizing each sign into a canonical pose, cropping, and stitching, we create a continuous sequence. Then, by applying filtering in the frequency domain and resampling each sign, we create cohesive natural sequences that mimic the prosody found in the original data. We leverage a SignGAN model to map the output to a photo-realistic signer and present a complete Text-to-Sign (T2S) SLP pipeline. Our evaluation demonstrates the effectiveness of the approach, showcasing state-of-the-art performance across all datasets. Finally, a user evaluation shows our approach outperforms the baseline model and is capable of producing realistic sign language sequences.
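
Illustrative sketch (not the authors' implementation): the abstract names stitching, frequency-domain filtering, and resampling as parts of its 7-step approach but does not give the details here. The Python below is a toy version of those three ideas on pose sequences stored as lists of coordinate frames. The function names `resample`, `stitch`, and `low_pass`, the linear-interpolation transitions, and the hard frequency cutoff are all assumptions made for illustration. Canonical-pose normalization, cropping, the facial-expression codebook, and the SignGAN stage are omitted.

```python
import cmath
import math


def resample(seq, n_out):
    """Linearly resample a list of frames (each a list of floats) to n_out frames."""
    if n_out < 2 or len(seq) < 2:
        raise ValueError("need at least two input and two output frames")
    out = []
    for i in range(n_out):
        t = i * (len(seq) - 1) / (n_out - 1)  # fractional source index
        lo = min(int(t), len(seq) - 2)
        w = t - lo
        out.append([(1 - w) * a + w * b for a, b in zip(seq[lo], seq[lo + 1])])
    return out


def stitch(signs, n_transition=2):
    """Concatenate signs, inserting linearly interpolated transition frames
    between the last frame of one sign and the first frame of the next.
    (Assumed transition scheme; the source does not specify one.)"""
    out = [list(frame) for frame in signs[0]]
    for sign in signs[1:]:
        start, end = out[-1], sign[0]
        for k in range(1, n_transition + 1):
            w = k / (n_transition + 1)
            out.append([(1 - w) * a + w * b for a, b in zip(start, end)])
        out.extend(list(frame) for frame in sign)
    return out


def low_pass(seq, keep):
    """Zero every DFT bin whose frequency index exceeds `keep`, per coordinate.
    Uses a naive O(n^2) DFT, which is fine for short toy sequences."""
    n, dims = len(seq), len(seq[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        x = [frame[d] for frame in seq]
        spectrum = [
            sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)
        ]
        for k in range(n):
            if min(k, n - k) > keep:  # symmetric cut keeps the output real
                spectrum[k] = 0
        for t in range(n):
            value = sum(
                spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)
            ) / n
            out[t][d] = value.real
    return out


# Toy usage: two one-dimensional "signs", stitched, smoothed, then retimed.
sign_a = [[0.0], [1.0], [0.0]]
sign_b = [[2.0], [3.0], [2.0]]
sequence = stitch([sign_a, sign_b], n_transition=2)
smoothed = low_pass(sequence, keep=2)
retimed = resample(smoothed, n_out=12)
```

In this sketch the low-pass step suppresses the abrupt motion introduced at the joins, and the resampling step changes how many frames a sequence spans, which is one simple way to adjust timing.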

Related papers

Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator [55.94334001112357]
We introduce a multilingual sign language model, Signs as Tokens (SOKE), which can generate 3D sign avatars autoregressively from text inputs. We propose a retrieval-enhanced SLG approach, which incorporates external sign dictionaries to provide accurate word-level signs.
arXiv Detail & Related papers (2024-11-26T18:28:09Z)
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production [93.32354378820648]
We propose a unified framework for continuous sign language production, easing communication between sign and non-sign language users. A sequence diffusion model, utilizing embeddings extracted from text or speech, is crafted to generate sign predictions step by step. Experiments on How2Sign and PHOENIX14T datasets demonstrate that our model achieves competitive performance in sign language production.
arXiv Detail & Related papers (2024-07-04T13:53:50Z)
A Data-Driven Representation for Sign Language Production [26.520016084139964]
Sign Language Production aims to automatically translate spoken language sentences into continuous sequences of sign language. Current state-of-the-art approaches rely on scarce linguistic resources to work. This paper introduces an innovative solution by transforming the continuous pose generation problem into a discrete sequence generation problem.
arXiv Detail & Related papers (2024-04-17T15:52:38Z)
Improving Continuous Sign Language Recognition with Cross-Lingual Signs [29.077175863743484]
We study the feasibility of utilizing multilingual sign language corpora to facilitate continuous sign language recognition. We first build two sign language dictionaries containing isolated signs that appear in two datasets. Then we identify the sign-to-sign mappings between two sign languages via a well-optimized isolated sign language recognition model.
arXiv Detail & Related papers (2023-08-21T15:58:47Z)
SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign Language Understanding [132.78015553111234]
Hand gestures play a crucial role in the expression of sign language. Current deep-learning-based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resources. We propose the first self-supervised pre-trainable SignBERT+ framework with a model-aware hand prior incorporated.
arXiv Detail & Related papers (2023-05-08T17:16:38Z)
Word separation in continuous sign language using isolated signs and post-processing [47.436298331905775]
We propose a two-stage model for Continuous Sign Language Recognition. In the first stage, the predictor model, which includes a combination of CNN, SVD, and LSTM, is trained with the isolated signs. In the second stage, we apply a post-processing algorithm to the Softmax outputs obtained from the first part of the model.
arXiv Detail & Related papers (2022-04-02T18:34:33Z)
SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition [94.30084702921529]
Hand gestures play a critical role in sign language. Current deep-learning-based sign language recognition methods may suffer from insufficient interpretability. We introduce the first self-supervised pre-trainable SignBERT with an incorporated hand prior for SLR.
arXiv Detail & Related papers (2021-10-11T16:18:09Z)
Watch, read and lookup: learning to spot signs from multiple supervisors [99.50956498009094]
Given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video. We train a model using multiple types of available supervision by: (1) watching existing sparsely labelled footage; (2) reading associated subtitles which provide additional weak-supervision; and (3) looking up words in visual sign language dictionaries. These three tasks are integrated into a unified learning framework using the principles of Noise Contrastive Estimation and Multiple Instance Learning.
arXiv Detail & Related papers (2020-10-08T14:12:56Z)
Adversarial Training for Multi-Channel Sign Language Production [43.45785951443149]
We propose an Adversarial Multi-Channel approach to Sign Language Production. We frame sign production as a minimax game between a transformer-based Generator and a conditional Discriminator. Our adversarial discriminator evaluates the realism of sign production conditioned on the source text, pushing the generator towards a realistic and articulate output.
arXiv Detail & Related papers (2020-08-27T23:05:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.