Topic Detection in Continuous Sign Language Videos
- URL: http://arxiv.org/abs/2209.02402v1
- Date: Thu, 1 Sep 2022 19:17:35 GMT
- Title: Topic Detection in Continuous Sign Language Videos
- Authors: Alvaro Budria, Laia Tarres, Gerard I. Gallego, Francesc Moreno-Noguer,
Jordi Torres, Xavier Giro-i-Nieto
- Abstract summary: We introduce the novel task of sign language topic detection.
We base our experiments on How2Sign, a large-scale video dataset spanning multiple semantic domains.
- Score: 23.43298383445439
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Significant progress has been made recently on challenging tasks in automatic
sign language understanding, such as sign language recognition, translation and
production. However, these works have focused on datasets with relatively few
samples, short recordings and limited vocabulary and signing space. In this
work, we introduce the novel task of sign language topic detection. We base our
experiments on How2Sign, a large-scale video dataset spanning multiple semantic
domains. We provide strong baselines for the task of topic detection and
present a comparison between different visual features commonly used in the
domain of sign language.
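Topic detection here reduces to video-level classification. As a concrete illustration only, not the paper's implementation, the sketch below assumes pre-extracted per-frame visual features (e.g., I3D activations or pose keypoints) and classifies a video into one of K topics via masked mean pooling and a linear head; all names and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class TopicClassifier(nn.Module):
    """Masked mean pooling over per-frame features, then a linear topic head.

    A hypothetical baseline: the paper compares several visual features,
    but this exact classification head is an assumption.
    """
    def __init__(self, feat_dim: int, num_topics: int):
        super().__init__()
        self.head = nn.Linear(feat_dim, num_topics)

    def forward(self, frames: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, feat_dim); mask: (batch, time), 1.0 for real frames.
        lengths = mask.sum(dim=1, keepdim=True).clamp(min=1.0)       # (batch, 1)
        pooled = (frames * mask.unsqueeze(-1)).sum(dim=1) / lengths  # (batch, feat_dim)
        return self.head(pooled)                                     # topic logits

# Toy usage: 2 videos, up to 16 frames of hypothetical 1024-d features, 10 topics.
model = TopicClassifier(feat_dim=1024, num_topics=10)
logits = model(torch.randn(2, 16, 1024), torch.ones(2, 16))
loss = nn.functional.cross_entropy(logits, torch.tensor([3, 7]))
```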
Related papers
- EvSign: Sign Language Recognition and Translation with Streaming Events [59.51655336911345]
Event cameras can naturally perceive dynamic hand movements, providing rich manual cues for sign language tasks.
We propose an efficient transformer-based framework for event-based SLR and SLT tasks.
Our method performs favorably against existing state-of-the-art approaches with only 0.34% of the computational cost.
arXiv Detail & Related papers (2024-07-17T14:16:35Z)
- Reconsidering Sentence-Level Sign Language Translation [2.099922236065961]
We show that for 33% of sentences in our sample, our fluent Deaf signer annotators were only able to understand key parts of the clip in light of discourse-level context.
These results underscore the importance of understanding and sanity checking examples when adapting machine learning to new domains.
arXiv Detail & Related papers (2024-06-16T19:19:54Z)
- Linguistically Motivated Sign Language Segmentation [51.06873383204105]
We consider two kinds of segmentation: segmentation into individual signs and segmentation into phrases.
Our method is motivated by linguistic cues observed in sign language corpora.
We replace the predominant IO tagging scheme with BIO tagging to account for continuous signing.
arXiv Detail & Related papers (2023-10-21T10:09:34Z)
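The BIO scheme in the entry above tags each frame as B(eginning of a sign), I(nside), or O(utside); unlike plain IO tagging, it keeps the boundary between back-to-back signs visible, which matters in continuous signing where signs rarely have pauses between them. A minimal decoding sketch of that idea (the tag ids and this decoder are illustrative, not the paper's code):

```python
# BIO frame tagging: 0 = O(utside), 1 = B(egin sign), 2 = I(nside sign).
# With IO tagging alone, two adjacent signs merge into one span; a B tag
# re-opens a span even when there is no O frame in between.

def bio_to_spans(tags: list[int]) -> list[tuple[int, int]]:
    """Decode per-frame BIO tags into (start, end) frame spans, end exclusive."""
    spans, start = [], None
    for i, t in enumerate(tags):
        if t == 1:                          # B: close any open span, open a new one
            if start is not None:
                spans.append((start, i))
            start = i
        elif t == 0 and start is not None:  # O: close the open span
            spans.append((start, i))
            start = None
        # t == 2 (I): extend the open span
    if start is not None:
        spans.append((start, len(tags)))
    return spans

# Two signs signed back-to-back: IO tagging would see one span, BIO recovers both.
print(bio_to_spans([0, 1, 2, 2, 1, 2, 0]))  # [(1, 4), (4, 6)]
```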
- Improving Continuous Sign Language Recognition with Cross-Lingual Signs [29.077175863743484]
We study the feasibility of utilizing multilingual sign language corpora to facilitate continuous sign language recognition.
We first build two sign language dictionaries containing isolated signs that appear in two datasets.
Then we identify the sign-to-sign mappings between two sign languages via a well-optimized isolated sign language recognition model.
arXiv Detail & Related papers (2023-08-21T15:58:47Z)
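The summary above leaves the sign-to-sign mapping procedure open; one hedged reading is to classify language B's dictionary clips with an ISLR model trained on language A and keep a link only when the clips agree. All names, the voting rule, and the threshold below are assumptions:

```python
import torch
from collections import Counter

@torch.no_grad()
def map_signs(islr_model, dict_b: dict[str, list[torch.Tensor]],
              classes_a: list[str], min_agreement: float = 0.5):
    """Link each sign of language B to a sign of language A (assumed scheme).

    dict_b maps a B gloss to its isolated dictionary clips (feature tensors);
    islr_model is an ISLR classifier over language A's vocabulary.
    A mapping is kept only if a majority of clips agree on the A class.
    """
    mapping = {}
    for gloss_b, clips in dict_b.items():
        votes = Counter()
        for clip in clips:
            logits = islr_model(clip.unsqueeze(0))  # (1, |A vocabulary|)
            votes[classes_a[logits.argmax(dim=-1).item()]] += 1
        best_class, count = votes.most_common(1)[0]
        if count / len(clips) >= min_agreement:
            mapping[gloss_b] = best_class
    return mapping
```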
- CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning [38.83062453145388]
Sign language retrieval consists of two sub-tasks: text-to-sign-video (T2V) retrieval and sign-video-to-text (V2T) retrieval.
We take into account the linguistic properties of both sign languages and natural languages, and simultaneously identify the fine-grained cross-lingual mappings.
Our framework outperforms the pioneering method by large margins on various datasets.
arXiv Detail & Related papers (2023-03-22T17:59:59Z)
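CiCo's exact training objective is not spelled out in the summary above; a generic CLIP-style symmetric contrastive loss over paired sign-video and text embeddings would support both T2V and V2T retrieval, so the sketch below follows that assumed reading (the shared embedding space and temperature are assumptions):

```python
import torch
import torch.nn.functional as F

def contrastive_retrieval_loss(video_emb: torch.Tensor,
                               text_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of paired (sign video, text) embeddings.

    Row i of each matrix is one pair; off-diagonal entries act as negatives.
    The same similarity matrix serves T2V (rows) and V2T (columns) retrieval.
    """
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    sim = v @ t.T / temperature                      # (batch, batch)
    targets = torch.arange(sim.size(0))
    return (F.cross_entropy(sim, targets) +          # video -> text
            F.cross_entropy(sim.T, targets)) / 2     # text -> video

# Toy usage with 8 paired 256-d embeddings.
loss = contrastive_retrieval_loss(torch.randn(8, 256), torch.randn(8, 256))
```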
- Automatic dense annotation of large-vocabulary sign language videos [85.61513254261523]
We propose a simple, scalable framework to vastly increase the density of automatic annotations.
We make these annotations publicly available to support the sign language research community.
arXiv Detail & Related papers (2022-08-04T17:55:09Z)
- Scaling up sign spotting through sign language dictionaries [99.50956498009094]
The focus of this work is sign spotting: given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
We train a model using multiple types of available supervision by: (1) watching existing footage which is sparsely labelled using mouthing cues; (2) reading associated subtitles which provide additional translations of the signed content.
We validate the effectiveness of our approach on low-shot sign spotting benchmarks.
arXiv Detail & Related papers (2022-05-09T10:00:03Z)
- Towards Automatic Speech to Sign Language Generation [35.22004819666906]
We propose a multi-language transformer network trained to generate a signer's poses from speech segments.
Our model learns to generate continuous sign pose sequences in an end-to-end manner.
arXiv Detail & Related papers (2021-06-24T06:44:19Z)
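As a rough illustration of the idea in the entry above, not the paper's architecture: a transformer maps a speech-feature sequence to a continuous sequence of signer pose keypoints and is trained end-to-end with a regression loss. Every dimension and the frame-synchronous setup here are assumptions:

```python
import torch
import torch.nn as nn

class SpeechToPose(nn.Module):
    """Map speech features to continuous pose keypoints with a transformer."""
    def __init__(self, speech_dim=80, pose_dim=2 * 137, d_model=256):
        super().__init__()
        # pose_dim assumes 137 (x, y) keypoints, an OpenPose-style layout.
        self.in_proj = nn.Linear(speech_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=4)
        self.out_proj = nn.Linear(d_model, pose_dim)

    def forward(self, speech: torch.Tensor) -> torch.Tensor:
        # speech: (batch, time, speech_dim) -> poses: (batch, time, pose_dim)
        return self.out_proj(self.encoder(self.in_proj(speech)))

model = SpeechToPose()
mel = torch.randn(2, 100, 80)   # 2 clips of 100 mel-spectrogram frames
pred = model(mel)               # predicted continuous pose sequence
loss = nn.functional.mse_loss(pred, torch.randn_like(pred))
```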
- Watch, read and lookup: learning to spot signs from multiple supervisors [99.50956498009094]
Given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
We train a model using multiple types of available supervision by: (1) watching existing sparsely labelled footage; (2) reading associated subtitles which provide additional weak supervision; and (3) looking up words in visual sign language dictionaries.
These three tasks are integrated into a unified learning framework using the principles of Noise Contrastive Estimation and Multiple Instance Learning.
arXiv Detail & Related papers (2020-10-08T14:12:56Z)
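The entry above names Noise Contrastive Estimation and Multiple Instance Learning without giving their combination; one hedged reading treats candidate windows from the continuous video as a bag, max-pools each bag against the isolated-sign query (MIL), and matches queries to bags contrastively (NCE). The formulation below is that assumed reading, not the paper's loss:

```python
import torch
import torch.nn.functional as F

def mil_nce_spotting_loss(query: torch.Tensor,
                          bags: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """MIL-style InfoNCE for sign spotting (an assumed formulation).

    query: (batch, dim) embeddings of isolated dictionary signs.
    bags:  (batch, windows, dim) candidate windows from continuous videos;
           bag i is the positive bag for query i, all other bags are negatives.
    Each bag is reduced to its best-matching window (multiple instance
    learning), then queries are matched to bags contrastively (NCE).
    """
    q = F.normalize(query, dim=-1)                     # (B, D)
    w = F.normalize(bags, dim=-1)                      # (B, W, D)
    # Similarity of every query to every window of every bag: (B, B, W).
    sim = torch.einsum('qd,bwd->qbw', q, w)
    bag_scores = sim.max(dim=-1).values / temperature  # MIL max-pool: (B, B)
    targets = torch.arange(q.size(0))
    return F.cross_entropy(bag_scores, targets)

loss = mil_nce_spotting_loss(torch.randn(4, 128), torch.randn(4, 20, 128))
```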
- Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation [59.38247587308604]
We introduce a novel transformer-based architecture that jointly learns Continuous Sign Language Recognition and Translation.
We evaluate the recognition and translation performances of our approaches on the challenging RWTH-PHOENIX-Weather-2014T dataset.
Our translation networks outperform both sign video to spoken language and gloss to spoken language translation models.
arXiv Detail & Related papers (2020-03-30T21:35:09Z)
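A common way to realize joint recognition and translation in such models is a CTC loss over encoder gloss predictions plus cross-entropy over decoder word tokens; the sketch below follows that assumed reading, with all shapes, the padding id, and the loss weighting chosen for illustration:

```python
import torch
import torch.nn.functional as F

def joint_slr_slt_loss(gloss_logits, text_logits, glosses, text,
                       frame_lens, gloss_lens, ctc_weight=1.0):
    """Joint objective: CTC recognition loss + translation cross-entropy.

    gloss_logits: (time, batch, |gloss vocab|) from the encoder (CTC head).
    text_logits:  (batch, len, |word vocab|) from the autoregressive decoder.
    """
    ctc = F.ctc_loss(gloss_logits.log_softmax(-1), glosses,
                     frame_lens, gloss_lens, blank=0)
    ce = F.cross_entropy(text_logits.transpose(1, 2), text,
                         ignore_index=0)          # 0 = padding id (assumed)
    return ctc_weight * ctc + ce

# Toy shapes: 50 frames, batch 2, 100 glosses, 500 words, target length 12.
loss = joint_slr_slt_loss(
    torch.randn(50, 2, 100), torch.randn(2, 12, 500),
    torch.randint(1, 100, (2, 8)), torch.randint(1, 500, (2, 12)),
    torch.full((2,), 50), torch.full((2,), 8))
```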
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.