Quantitative Survey of the State of the Art in Sign Language Recognition
- URL: http://arxiv.org/abs/2008.09918v2
- Date: Sat, 29 Aug 2020 10:49:47 GMT
- Title: Quantitative Survey of the State of the Art in Sign Language Recognition
- Authors: Oscar Koller
- Abstract summary: This work presents a meta study covering around 300 published sign language recognition papers with over 400 experimental results.
It includes most papers between the start of the field in 1983 and 2020.
It also covers a fine-grained analysis on over 25 studies that have compared their recognition approaches on RWTH-PHOENIX-Weather 2014.
- Score: 7.284661356980247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a meta study covering around 300 published sign language
recognition papers with over 400 experimental results. It includes most papers
between the start of the field in 1983 and 2020. Additionally, it covers a
fine-grained analysis on over 25 studies that have compared their recognition
approaches on RWTH-PHOENIX-Weather 2014, the standard benchmark task of the
field. Research in the domain of sign language recognition has progressed
significantly in the last decade, reaching a point where the task attracts much
more attention than ever before. This study compiles the state of the art in a
concise way to help advance the field and reveal open questions. Moreover, all
of this meta study's source data is made public, easing future work with it and
further expansion. The analyzed papers have been manually labeled with a set of
categories. The data reveals many insights, such as, among others, shifts in
the field from intrusive to non-intrusive capturing, from local to global
features and the lack of non-manual parameters included in medium and larger
vocabulary recognition systems. Surprisingly, RWTH-PHOENIX-Weather with a
vocabulary of 1080 signs represents the only resource for large vocabulary
continuous sign language recognition benchmarking world wide.
Related papers
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER, a collection of multi-labeled, emotion-annotated datasets in 28 different languages.
We highlight the challenges related to the data collection and annotation processes.
We show that the BRIGHTER datasets represent a meaningful step towards addressing the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z) - Text Classification using Graph Convolutional Networks: A Comprehensive Survey [11.1080224302799]
Graph convolution network (GCN)-based approaches have gained a lot of traction in this domain over the last decade.
This work aims to summarize and categorize various GCN-based Text Classification approaches with regard to the architecture and mode of supervision.
arXiv Detail & Related papers (2024-10-12T07:03:42Z) - A Network Analysis Approach to Conlang Research Literature [0.0]
This paper aims to have an overall understanding of the literature on conlang research.
Analysing over 2300 academic publications from 1927 to 2022, we have found that Esperanto is by far the most documented conlang.
arXiv Detail & Related papers (2024-07-22T04:40:45Z) - Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection.
Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training.
This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2023-06-28T02:33:06Z) - Self-Supervised Speech Representation Learning: A Review [105.1545308184483]
Self-supervised representation learning methods promise a single universal model that would benefit a wide variety of tasks and domains.
Speech representation learning is experiencing similar progress in three main categories: generative, contrastive, and predictive methods.
This review presents approaches for self-supervised speech representation learning and their connection to other research areas.
arXiv Detail & Related papers (2022-05-21T16:52:57Z) - Machine Translation from Signed to Spoken Languages: State of the Art and Challenges [9.292669129832605]
We give a high-level introduction to sign language linguistics and machine translation.
We find that significant advances have been made on the shoulders of spoken language machine translation research.
We advocate for interdisciplinary research and to base future research on linguistic analysis of sign languages.
arXiv Detail & Related papers (2022-02-07T11:54:07Z) - From Show to Tell: A Survey on Image Captioning [48.98681267347662]
Connecting Vision and Language plays an essential role in Generative Intelligence.
Research in image captioning has not reached a conclusive answer yet.
This work aims at providing a comprehensive overview and categorization of image captioning approaches.
arXiv Detail & Related papers (2021-07-14T18:00:54Z) - A Survey on Text Classification: From Shallow to Deep Learning [83.47804123133719]
The last decade has seen a surge of research in this area due to the unprecedented success of deep learning.
This paper fills the gap by reviewing the state-of-the-art approaches from 1961 to 2021.
We create a taxonomy for text classification according to the text involved and the models used for feature extraction and classification.
arXiv Detail & Related papers (2020-08-02T00:09:03Z) - On Vocabulary Reliance in Scene Text Recognition [79.21737876442253]
Methods perform well on images with words within vocabulary but generalize poorly to images with words outside vocabulary.
We call this phenomenon "vocabulary reliance".
We propose a simple yet effective mutual learning strategy to allow models of two families to learn collaboratively.
arXiv Detail & Related papers (2020-05-08T11:16:58Z) - BosphorusSign22k Sign Language Recognition Dataset [2.064612766965483]
BosphorusSign22k is a large scale sign language dataset aimed at computer vision, video recognition and deep learning research communities.
The primary objective of this dataset is to serve as a new benchmark in Turkish Sign Language Recognition.
We provide state-of-the-art human pose estimates to encourage other tasks such as Sign Language Production.
arXiv Detail & Related papers (2020-04-02T22:15:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.