Improving Sign Recognition with Phonology
- URL: http://arxiv.org/abs/2302.05759v1
- Date: Sat, 11 Feb 2023 18:51:23 GMT
- Title: Improving Sign Recognition with Phonology
- Authors: Lee Kezar, Jesse Thomason, Zed Sevcikova Sehyr
- Abstract summary: We use insights from research on American Sign Language phonology to train models for isolated sign language recognition.
We train ISLR models that take in pose estimations of a signer producing a single sign to predict not only the sign but additionally its phonological characteristics.
These auxiliary predictions lead to a nearly 9% absolute gain in sign recognition accuracy on the WLASL benchmark.
- Score: 8.27285154257448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We use insights from research on American Sign Language (ASL) phonology to
train models for isolated sign language recognition (ISLR), a step towards
automatic sign language understanding. Our key insight is to explicitly
recognize the role of phonology in sign production to achieve more accurate
ISLR than existing work which does not consider sign language phonology. We
train ISLR models that take in pose estimations of a signer producing a single
sign to predict not only the sign but additionally its phonological
characteristics, such as the handshape. These auxiliary predictions lead to a
nearly 9% absolute gain in sign recognition accuracy on the WLASL benchmark,
with consistent improvements in ISLR regardless of the underlying prediction
model architecture. This work has the potential to accelerate linguistic
research in the domain of signed languages and reduce communication barriers
between deaf and hearing people.
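As a rough illustration of the multi-task setup the abstract describes, the sketch below pairs a pose-sequence encoder with a sign-classification head plus auxiliary heads for phonological characteristics such as handshape. The GRU encoder, the feature names, the class counts, and the equal loss weighting are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of phonology-aware ISLR as multi-task learning (PyTorch).
# The encoder choice, phonological feature set, and class counts below are
# illustrative assumptions, not the configuration used in the paper.
import torch
import torch.nn as nn


class PhonologyAwareISLR(nn.Module):
    def __init__(self, num_keypoints=27, num_signs=2000, phoneme_classes=None):
        super().__init__()
        # Hypothetical phonological features and class counts, for illustration only.
        phoneme_classes = phoneme_classes or {"handshape": 49, "major_location": 5, "movement": 4}
        in_dim = num_keypoints * 2  # (x, y) coordinates per keypoint per frame
        self.encoder = nn.GRU(in_dim, 256, num_layers=2, batch_first=True)
        self.sign_head = nn.Linear(256, num_signs)  # main sign-recognition objective
        self.phoneme_heads = nn.ModuleDict(         # auxiliary phonology objectives
            {name: nn.Linear(256, n) for name, n in phoneme_classes.items()}
        )

    def forward(self, poses):
        # poses: (batch, frames, num_keypoints * 2) pose estimations of a single sign
        _, hidden = self.encoder(poses)
        feat = hidden[-1]  # final hidden state of the top GRU layer
        sign_logits = self.sign_head(feat)
        phoneme_logits = {name: head(feat) for name, head in self.phoneme_heads.items()}
        return sign_logits, phoneme_logits


def multitask_loss(sign_logits, phoneme_logits, sign_labels, phoneme_labels, aux_weight=1.0):
    # Sign-recognition loss plus weighted auxiliary losses over phonological features.
    ce = nn.CrossEntropyLoss()
    loss = ce(sign_logits, sign_labels)
    for name, logits in phoneme_logits.items():
        loss = loss + aux_weight * ce(logits, phoneme_labels[name])
    return loss


# Example usage with random inputs: a batch of 8 clips, 60 frames each.
model = PhonologyAwareISLR()
sign_logits, phoneme_logits = model(torch.randn(8, 60, 54))
```

Because the auxiliary heads share the encoder with the sign classifier, gradients from the phonological losses shape the representation used for sign recognition, which is the mechanism the abstract credits for the accuracy gain; such heads could in principle be attached to other pose-based architectures, consistent with the reported architecture-independent improvements.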
Related papers
- Scaling up Multimodal Pre-training for Sign Language Understanding [96.17753464544604]
Sign language serves as the primary means of communication for the deaf-mute community.
To facilitate communication between the deaf-mute and hearing people, a series of sign language understanding (SLU) tasks have been studied.
These tasks investigate sign language topics from diverse perspectives and raise challenges in learning effective representation of sign language videos.
arXiv Detail & Related papers (2024-08-16T06:04:25Z)
- PhonologyBench: Evaluating Phonological Skills of Large Language Models [57.80997670335227]
Phonology, the study of speech's structure and pronunciation rules, is a critical yet often overlooked component in Large Language Model (LLM) research.
We present PhonologyBench, a novel benchmark consisting of three diagnostic tasks designed to explicitly test the phonological skills of LLMs.
We observe significant gaps of 17% and 45% on Rhyme Word Generation and Syllable Counting, respectively, when compared to humans.
arXiv Detail & Related papers (2024-04-03T04:53:14Z)
- The Sem-Lex Benchmark: Modeling ASL Signs and Their Phonemes [6.0179345110920455]
We introduce a new resource for American Sign Language (ASL) modeling, the Sem-Lex Benchmark.
The Benchmark is currently the largest of its kind, consisting of over 84k videos of isolated sign productions from deaf ASL signers who gave informed consent and received compensation.
We present a suite of experiments which make use of the linguistic information in ASL-LEX, evaluating the practicality and fairness of the Sem-Lex Benchmark for isolated sign recognition (ISR).
arXiv Detail & Related papers (2023-09-30T00:25:43Z)
- Improving Continuous Sign Language Recognition with Cross-Lingual Signs [29.077175863743484]
We study the feasibility of utilizing multilingual sign language corpora to facilitate continuous sign language recognition.
We first build two sign language dictionaries containing isolated signs that appear in two datasets.
Then we identify the sign-to-sign mappings between two sign languages via a well-optimized isolated sign language recognition model.
arXiv Detail & Related papers (2023-08-21T15:58:47Z)
- On the Importance of Signer Overlap for Sign Language Detection [65.26091369630547]
We argue that the current benchmark data sets for sign language detection yield overly optimistic results that do not generalize well.
We quantify this with a detailed analysis of the effect of signer overlap on current sign detection benchmark data sets.
We propose new data set partitions that are free of overlap and allow for more realistic performance assessment.
arXiv Detail & Related papers (2023-03-19T22:15:05Z)
- Classification of Phonological Parameters in Sign Languages [0.0]
Linguistic research often breaks down signs into constituent parts to study sign languages.
We show how a single model can be used to recognise the individual phonological parameters within sign languages.
arXiv Detail & Related papers (2022-05-24T13:40:45Z)
- Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production [43.45785951443149]
Sign languages are visual languages, with vocabularies as rich as their spoken language counterparts.
Current deep-learning based Sign Language Production (SLP) models produce under-articulated skeleton pose sequences.
We tackle large-scale SLP by learning to co-articulate between dictionary signs.
We also propose SignGAN, a pose-conditioned human synthesis model that produces photo-realistic sign language videos.
arXiv Detail & Related papers (2022-03-29T08:51:38Z)
- WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language [2.814213966364155]
We build a large-scale dataset of American Sign Language signs annotated with six different phonological properties.
We investigate whether data-driven end-to-end and feature-based approaches can be optimised to automatically recognise these properties.
arXiv Detail & Related papers (2022-03-11T17:21:24Z)
- All You Need In Sign Language Production [50.3955314892191]
Sign language recognition and production need to cope with some critical challenges.
We present an introduction to Deaf culture, Deaf centers, and the psychological perspective of sign language.
We also briefly introduce the backbone architectures and methods in SLP and present a proposed taxonomy of SLP.
arXiv Detail & Related papers (2022-01-05T13:45:09Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues [106.21067543021887]
We show how to use mouthing cues from signers to obtain high-quality annotations from video data.
The BSL-1K dataset is a collection of British Sign Language (BSL) signs of unprecedented scale.
arXiv Detail & Related papers (2020-07-23T16:59:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.