BBC-Oxford British Sign Language Dataset
- URL: http://arxiv.org/abs/2111.03635v1
- Date: Fri, 5 Nov 2021 17:35:58 GMT
- Title: BBC-Oxford British Sign Language Dataset
- Authors: Samuel Albanie, Gül Varol, Liliane Momeni, Hannah Bull,
Triantafyllos Afouras, Himel Chowdhury, Neil Fox, Bencie Woll, Rob Cooper,
Andrew McParland, Andrew Zisserman
- Abstract summary: We introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).
We describe the motivation for the dataset, together with statistics and available annotations.
We conduct experiments to provide baselines for the tasks of sign recognition, sign language alignment, and sign language translation.
- Score: 64.32108826673183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce the BBC-Oxford British Sign Language (BOBSL)
dataset, a large-scale video collection of British Sign Language (BSL). BOBSL
is an extended and publicly released dataset based on the BSL-1K dataset
introduced in previous work. We describe the motivation for the dataset,
together with statistics and available annotations. We conduct experiments to
provide baselines for the tasks of sign recognition, sign language alignment,
and sign language translation. Finally, we describe several strengths and
limitations of the data from the perspectives of machine learning and
linguistics, note sources of bias present in the dataset, and discuss potential
applications of BOBSL in the context of sign language technology. The dataset
is available at https://www.robots.ox.ac.uk/~vgg/data/bobsl/.
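For a concrete picture of the sign recognition task benchmarked above, here is a minimal PyTorch sketch of a classifier trained over pre-extracted clip features. It is not the authors' baseline: the index file, feature layout, and vocabulary size are hypothetical placeholders, not the released BOBSL format.

```python
# Minimal sign-recognition sketch over pre-extracted clip features.
# The CSV index ("train_index.csv" rows: npy_path,label_id), FEATURE_DIM
# and NUM_CLASSES are hypothetical, not the released BOBSL format.
import csv
import numpy as np
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

FEATURE_DIM = 1024   # assumed per-clip feature size (e.g. from a video backbone)
NUM_CLASSES = 2000   # assumed sign vocabulary size

class SignClipDataset(Dataset):
    def __init__(self, index_csv):
        with open(index_csv) as f:
            self.items = [(p, int(y)) for p, y in csv.reader(f)]  # 2 columns
    def __len__(self):
        return len(self.items)
    def __getitem__(self, i):
        path, label = self.items[i]
        return torch.from_numpy(np.load(path)).float(), label  # (FEATURE_DIM,), int

model = nn.Sequential(nn.Linear(FEATURE_DIM, 512), nn.ReLU(),
                      nn.Linear(512, NUM_CLASSES))
loader = DataLoader(SignClipDataset("train_index.csv"), batch_size=64, shuffle=True)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for feats, labels in loader:          # one epoch of training
    opt.zero_grad()
    loss_fn(model(feats), labels).backward()
    opt.step()
```

In practice the features would come from a pretrained video backbone (the earlier BSL-1K work used I3D-style models); the sketch only shows the data flow.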
Related papers
- SCOPE: Sign Language Contextual Processing with Embedding from LLMs [49.5629738637893]
Sign languages, used by around 70 million Deaf individuals globally, are visual languages that convey visual and contextual information.
Current methods in vision-based sign language recognition (SLR) and translation (SLT) struggle with dialogue scenes due to limited dataset diversity and the neglect of contextually relevant information.
We introduce SCOPE, a novel context-aware vision-based SLR and SLT framework.
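The summary does not describe SCOPE's architecture; purely as an illustration of what "context-aware" fusion can look like, the sketch below concatenates a dialogue-context embedding (e.g. from a frozen LLM or sentence encoder) with per-clip visual features before a recognition head. All names and dimensions here are assumptions.

```python
# Illustrative context-aware fusion (not SCOPE's actual architecture):
# concatenate a dialogue-context embedding with visual features before
# the recognition head. Dimensions are hypothetical.
import torch
from torch import nn

class ContextAwareHead(nn.Module):
    def __init__(self, visual_dim=1024, context_dim=768, num_classes=2000):
        super().__init__()
        self.fuse = nn.Linear(visual_dim + context_dim, 512)
        self.classify = nn.Linear(512, num_classes)

    def forward(self, visual_feat, context_emb):
        # visual_feat: (B, visual_dim) clip features from a video encoder
        # context_emb: (B, context_dim) embedding of the preceding dialogue
        fused = torch.relu(self.fuse(torch.cat([visual_feat, context_emb], dim=-1)))
        return self.classify(fused)

head = ContextAwareHead()
logits = head(torch.randn(4, 1024), torch.randn(4, 768))  # (4, 2000)
```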
arXiv Detail & Related papers (2024-09-02T08:56:12Z)
- BdSLW60: A Word-Level Bangla Sign Language Dataset [3.8631510994883254]
We create a comprehensive BdSL word-level dataset named BdSLW60 in an unconstrained and natural setting.
The dataset encompasses 60 Bangla sign words, comprising 9,307 video trials provided by 18 signers under the supervision of a sign language professional.
We report the benchmarking of our BdSLW60 dataset using the Support Vector Machine (SVM) with testing accuracy up to 67.6% and an attention-based bi-LSTM with testing accuracy up to 75.1%.
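As an illustration of the attention-based bi-LSTM baseline mentioned above (a generic reconstruction, not the authors' code), the sketch below pools bi-LSTM states over a keypoint sequence with learned attention weights; the input layout of 75 pose keypoints with (x, y) coordinates per frame is an assumption.

```python
# Sketch of an attention-based bi-LSTM classifier over keypoint
# sequences. Input layout and sizes are assumptions; 60 classes match
# the 60 sign words in BdSLW60.
import torch
from torch import nn

class AttnBiLSTM(nn.Module):
    def __init__(self, input_dim=150, hidden=128, num_classes=60):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)       # scores each timestep
        self.classify = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                          # x: (B, T, input_dim)
        h, _ = self.lstm(x)                        # (B, T, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)     # (B, T, 1) attention weights
        pooled = (w * h).sum(dim=1)                # attention-weighted pooling
        return self.classify(pooled)

model = AttnBiLSTM()
logits = model(torch.randn(8, 40, 150))  # e.g. 40 frames x 75 keypoints x (x, y)
```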
arXiv Detail & Related papers (2024-02-13T18:02:58Z)
- Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models [70.82705830137708]
We introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL).
We utilize semi-supervised language labels, leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data.
DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.
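In the spirit of DIAL, but not the authors' implementation, the sketch below uses the public Hugging Face CLIP checkpoint to score candidate instructions against a demonstration frame and keeps the best match as a pseudo-label; the frame path and candidate list are illustrative.

```python
# CLIP-based instruction relabeling sketch: score candidate text
# instructions against a demonstration frame, keep the best match.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidates = ["pick up the red block", "open the drawer", "push the mug left"]
frame = Image.open("demo_frame.jpg")  # hypothetical demonstration frame

inputs = processor(text=candidates, images=frame, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image   # (1, num_candidates)
pseudo_label = candidates[logits.argmax().item()]
```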
arXiv Detail & Related papers (2022-11-21T18:56:00Z)
- LSA-T: The first continuous Argentinian Sign Language dataset for Sign Language Translation [52.87578398308052]
Sign language translation (SLT) is an active field of study that encompasses human-computer interaction, computer vision, natural language processing and machine learning.
This paper presents the first continuous Argentinian Sign Language (LSA) dataset.
It contains 14,880 sentence-level videos of LSA extracted from the CN Sordos YouTube channel, with labels and keypoint annotations for each signer.
arXiv Detail & Related papers (2022-11-14T14:46:44Z)
- Speech-to-Speech Translation For A Real-world Unwritten Language [62.414304258701804]
We study speech-to-speech translation (S2ST), which translates speech in one language into speech in another.
We present an end-to-end solution, from training data collection and modeling choices to benchmark dataset release.
arXiv Detail & Related papers (2022-11-11T20:21:38Z)
- SDW-ASL: A Dynamic System to Generate Large Scale Dataset for Continuous American Sign Language [0.0]
We release the first version of our ASL dataset, which contains 30k sentences and 416k words with a vocabulary of 18k words, totaling 104 hours of video.
This is the largest continuous sign language dataset published to date in terms of video duration.
arXiv Detail & Related papers (2022-10-13T07:08:00Z)
- ASL-Homework-RGBD Dataset: An annotated dataset of 45 fluent and non-fluent signers performing American Sign Language homeworks [32.3809065803553]
This dataset contains videos of fluent and non-fluent signers using American Sign Language (ASL).
A total of 45 fluent and non-fluent participants were asked to perform signing homework assignments.
The data is annotated to identify several aspects of signing including grammatical features and non-manual markers.
arXiv Detail & Related papers (2022-07-08T17:18:49Z)
- Towards Large-Scale Data Mining for Data-Driven Analysis of Sign Languages [0.0]
We show that it is possible to collect data from social networking services such as TikTok, Instagram, and YouTube.
Using our data collection pipeline, we collect and examine interpretations of songs in both American Sign Language (ASL) and Brazilian Sign Language (Libras).
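As a sketch of the collection step only (the paper's pipeline is not reproduced here), the snippet below uses the yt-dlp library to download the top YouTube search results for an illustrative query; TikTok and Instagram would need platform-specific handling.

```python
# Download candidate videos from YouTube with yt-dlp; the query and
# output template are illustrative, not the paper's configuration.
from yt_dlp import YoutubeDL

opts = {
    "outtmpl": "raw_videos/%(id)s.%(ext)s",  # where to store downloads
    "format": "mp4",
    "ignoreerrors": True,                    # skip unavailable videos
}
with YoutubeDL(opts) as ydl:
    # "ytsearch20:" asks yt-dlp for the top 20 search results
    ydl.download(["ytsearch20:ASL song interpretation"])
```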
arXiv Detail & Related papers (2020-06-03T09:28:17Z)
- A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT [60.9051207862378]
Multilingual BERT works remarkably well on cross-lingual transfer tasks.
Data size and context window size are crucial factors for transferability.
There is a computationally cheap but effective approach to improve the cross-lingual ability of multilingual BERT.
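To make the cross-lingual claim concrete, here is a small sketch using the public bert-base-multilingual-cased checkpoint from Hugging Face: mean-pooled mBERT embeddings of a sentence and its translation tend to lie close in cosine similarity, consistent with the transfer results described above. The example sentences are illustrative.

```python
# Compare mean-pooled mBERT embeddings of a sentence and its translation.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(text):
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, T, 768)
    return hidden.mean(dim=1).squeeze(0)            # mean-pool over tokens

en = embed("The cat sits on the mat.")
de = embed("Die Katze sitzt auf der Matte.")
print(torch.cosine_similarity(en, de, dim=0).item())
```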
arXiv Detail & Related papers (2020-04-20T11:13:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.