Modeling Global Body Configurations in American Sign Language
- URL: http://arxiv.org/abs/2009.01468v1
- Date: Thu, 3 Sep 2020 06:20:10 GMT
- Title: Modeling Global Body Configurations in American Sign Language
- Authors: Nicholas Wilkins, Beck Cordes Galbraith, Ifeoma Nwogu
- Abstract summary: American Sign Language (ASL) is the fourth most commonly used language in the United States and the language most commonly used by Deaf people in the United States and the English-speaking regions of Canada.
- Score: 2.8575516056239576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: American Sign Language (ASL) is the fourth most commonly used language in the
United States and is the language most commonly used by Deaf people in the
United States and the English-speaking regions of Canada. Unfortunately, until
recently, ASL received little research. This is due, in part, to its delayed
recognition as a language until William C. Stokoe's publication in 1960.
Limited data has been a long-standing obstacle to ASL research and
computational modeling. The lack of large-scale datasets has prohibited many
modern machine-learning techniques, such as Neural Machine Translation, from
being applied to ASL. In addition, the modality required to capture sign
language (i.e. video) is complex in natural settings (as one must deal with
background noise, motion blur, and the curse of dimensionality). Finally, when
compared with spoken languages, such as English, there has been limited
research conducted into the linguistics of ASL.
We realize a simplified version of Liddell and Johnson's Movement-Hold (MH)
Model using a Probabilistic Graphical Model (PGM). We trained our model on
ASLing, a dataset collected from three fluent ASL signers. We evaluate our PGM
against other models to determine its ability to model ASL. Finally, we
interpret various aspects of the PGM and draw conclusions about ASL phonetics.
The main contributions of this paper are
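The abstract does not spell out the PGM's structure, but one minimal way to realize a simplified Movement-Hold model as a PGM is a two-state hidden Markov model over Hold and Movement segments with Gaussian emissions over pose features. The sketch below is illustrative only: the state inventory, feature dimensionality, and all parameter values are assumptions, not values from the paper.

```python
# Hypothetical sketch: a simplified Movement-Hold (MH) model realized as a
# two-state hidden Markov model (a simple PGM). States are Hold (H) and
# Movement (M); emissions are Gaussian over pose features. All parameters
# below are illustrative placeholders, not values from the paper.
import numpy as np

STATES = ["H", "M"]          # Hold, Movement segments
D = 4                        # toy pose-feature dimensionality (assumption)

# Transition probabilities: the MH structure alternates H/M segments.
trans = np.array([[0.6, 0.4],    # H -> H, H -> M
                  [0.3, 0.7]])   # M -> H, M -> M
start = np.array([0.5, 0.5])

# Per-state Gaussian emission parameters (diagonal covariance).
means = np.array([[0.0] * D,     # Hold: pose near a static target
                  [1.0] * D])    # Movement: pose in transition
var = np.ones((2, D))

def log_gauss(x, mu, v):
    """Log density of a diagonal Gaussian at pose-feature vector x."""
    return -0.5 * np.sum(np.log(2 * np.pi * v) + (x - mu) ** 2 / v)

def log_likelihood(frames):
    """Forward algorithm: log p(frames) under the H/M chain."""
    log_a = np.log(trans)
    alpha = np.log(start) + np.array(
        [log_gauss(frames[0], means[s], var[s]) for s in range(2)])
    for x in frames[1:]:
        emit = np.array([log_gauss(x, means[s], var[s]) for s in range(2)])
        alpha = emit + np.array([
            np.logaddexp(alpha[0] + log_a[0, s], alpha[1] + log_a[1, s])
            for s in range(2)])
    return np.logaddexp(alpha[0], alpha[1])

# Toy usage: score a short sequence of pose-feature frames.
frames = np.vstack([np.zeros(D), np.ones(D), np.ones(D), np.zeros(D)])
print(log_likelihood(frames))
```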
Related papers
- Towards Robust Speech Representation Learning for Thousands of Languages [77.2890285555615]
Self-supervised learning (SSL) has helped extend speech technologies to more languages by reducing the need for labeled data.
We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages.
arXiv Detail & Related papers (2024-06-30T21:40:26Z)
- SignSpeak: Open-Source Time Series Classification for ASL Translation [0.12499537119440243]
We propose a low-cost, real-time ASL-to-speech translation glove and an exhaustive training dataset of sign language patterns.
We benchmarked this dataset with supervised learning models, such as LSTMs, GRUs, and Transformers, where our best model achieved 92% accuracy (a minimal sketch of such a baseline follows this entry).
Our open-source dataset, models and glove designs provide an accurate and efficient ASL translator while maintaining cost-effectiveness.
arXiv Detail & Related papers (2024-06-27T17:58:54Z)
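The SignSpeak abstract names LSTMs, GRUs, and Transformers as baselines but gives no architecture details, so the following is a hedged sketch of a generic LSTM classifier over multi-channel glove sensor time series. The channel count, hidden size, and label set are placeholders, not the paper's configuration.

```python
# Hypothetical sketch of the kind of supervised baseline SignSpeak benchmarks:
# an LSTM classifier over multi-channel glove sensor time series. All sizes
# below are placeholder assumptions.
import torch
import torch.nn as nn

class GloveLSTMClassifier(nn.Module):
    def __init__(self, n_channels=5, hidden=64, n_classes=26):
        super().__init__()
        self.lstm = nn.LSTM(n_channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                 # x: (batch, time, channels)
        _, (h_n, _) = self.lstm(x)        # final hidden state summarizes the sequence
        return self.head(h_n[-1])         # class logits

# Toy usage: batch of 8 sequences, 100 timesteps, 5 flex-sensor channels.
model = GloveLSTMClassifier()
logits = model(torch.randn(8, 100, 5))
print(logits.shape)  # torch.Size([8, 26])
```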
- Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
- Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models [60.09618700199927]
We propose adaptation methods that integrate LoRA into existing SSL models to extend them to new languages (a minimal sketch of the LoRA idea follows this entry).
We also develop preservation strategies, including data combination and re-clustering, to retain performance on existing languages.
arXiv Detail & Related papers (2024-06-20T08:13:30Z)
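The entry above does not describe how LoRA is wired into the SSL models, so the following is a generic sketch of the LoRA technique itself: a frozen pretrained linear layer augmented with a trainable low-rank update. The rank, scaling, and layer sizes are illustrative assumptions.

```python
# Hypothetical sketch of the LoRA idea the paper builds on: a frozen linear
# layer from a pretrained SSL model is augmented with a trainable low-rank
# update, y = W x + (alpha / r) * B A x. Rank and scaling are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pretrained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as identity adaptation
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

# Toy usage: adapt one projection of a pretrained model to a new language.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 768])
```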
- Evaluating Self-Supervised Speech Representations for Indigenous American Languages [6.235388047623929]
We present an ASR corpus for Quechua, an indigenous South American Language.
We benchmark the efficacy of large SSL models on Quechua, along with 6 other indigenous languages such as Guarani and Bribri, on low-resource ASR.
Our results show surprisingly strong performance by state-of-the-art SSL models, showing the potential generalizability of large-scale models to real-world data.
arXiv Detail & Related papers (2023-10-05T16:11:14Z)
- Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning [69.77973092264338]
We show that more powerful techniques can lead to more efficient pre-training, opening SSL to more research groups.
We propose WavLabLM, which extends WavLM's joint prediction and denoising to 40k hours of data across 136 languages.
We show that further efficiency can be achieved with a vanilla HuBERT Base model, which can maintain 94% of XLS-R's performance with only 3% of the data.
arXiv Detail & Related papers (2023-09-26T23:55:57Z)
- SignDiff: Learning Diffusion Models for American Sign Language Production [27.899654531461238]
The field of Sign Language Production lacked a large-scale, pre-trained model based on deep learning for continuous American Sign Language (ASL) production in the past decade.
We propose SignDiff, a dual-condition diffusion pre-training model that can generate videos of human signers from skeleton poses.
Our ASLP (ASL Production) method introduces two improved modules and a new loss function to improve the accuracy and quality of sign language skeletal poses.
arXiv Detail & Related papers (2023-08-30T15:14:56Z)
- SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? [39.62926623310278]
Self-supervised learning (SSL) for speech representation has been successfully applied in various downstream tasks.
In this paper, we aim to clarify if speech SSL techniques can well capture linguistic knowledge.
arXiv Detail & Related papers (2023-06-14T09:04:29Z)
- Learning Cross-lingual Visual Speech Representations [108.68531445641769]
Cross-lingual self-supervised visual representation learning has been a growing research topic in the last few years.
We use the recently proposed Raw Audio-Visual Speech Encoders (RAVEn) framework to pre-train an audio-visual model with unlabelled data.
Our experiments show that multilingual models trained with more data outperform monolingual ones, but that, when the amount of data is kept fixed, monolingual models tend to reach better performance.
arXiv Detail & Related papers (2023-03-14T17:05:08Z)
- SDW-ASL: A Dynamic System to Generate Large Scale Dataset for Continuous American Sign Language [0.0]
We release the first version of our ASL dataset, which contains 30k sentences, 416k words, a vocabulary of 18k words, in a total of 104 hours.
This is the largest continuous sign language dataset published to date in terms of video duration.
arXiv Detail & Related papers (2022-10-13T07:08:00Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM that operates over sub-word linguistic units, including syllables and phonemes (a minimal sketch follows this entry).
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
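As a companion to the last entry, here is a hedged sketch of a generic LSTM language model over sub-word units such as phonemes. The unit inventory and dimensions are placeholders, not the paper's configuration.

```python
# Hypothetical sketch of an LSTM language model over sub-word linguistic
# units (e.g., phonemes), in the spirit of the entry above. The unit
# inventory size and dimensions are placeholder assumptions.
import torch
import torch.nn as nn

class PhonemeLM(nn.Module):
    def __init__(self, n_units=50, embed=32, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_units, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_units)

    def forward(self, tokens):            # tokens: (batch, time) unit IDs
        h, _ = self.lstm(self.embed(tokens))
        return self.out(h)                # next-unit logits at every step

# Toy usage: predict the next phoneme at each position of a batch.
lm = PhonemeLM()
logits = lm(torch.randint(0, 50, (4, 20)))
print(logits.shape)  # torch.Size([4, 20, 50])
```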
This list is automatically generated from the titles and abstracts of the papers on this site.