Paper overview: BSL-1K: Scaling up co-articulated sign language recognition using
mouthing cues
- arxiv url: http://arxiv.org/abs/2007.12131v2
- Date: Wed, 13 Oct 2021 17:13:42 GMT
- Status: Processing complete
- Last updated in system: 2022-11-07 12:48:24.434727
- Title: BSL-1K: Scaling up co-articulated sign language recognition using
mouthing cues
- Title (reference translation): BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
- Authors: Samuel Albanie and Gül Varol and Liliane Momeni and Triantafyllos
Afouras and Joon Son Chung and Neil Fox and Andrew Zisserman
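- Abstract summary: We show how to use mouthing cues from signers to obtain high-quality annotations from video data.
The result is the BSL-1K dataset, a collection of British Sign Language (BSL) signs.
- Reference score (independently computed attention score): 106.21067543021887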
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent progress in fine-grained gesture and action classification, and
machine translation, points to the possibility of automated sign language
recognition becoming a reality. A key stumbling block in making progress
towards this goal is a lack of appropriate training data, stemming from the
high complexity of sign annotation and a limited supply of qualified
annotators. In this work, we introduce a new scalable approach to data
collection for sign recognition in continuous videos. We make use of
weakly-aligned subtitles for broadcast footage together with a keyword spotting
method to automatically localise sign-instances for a vocabulary of 1,000 signs
in 1,000 hours of video. We make the following contributions: (1) We show how
to use mouthing cues from signers to obtain high-quality annotations from video
data - the result is the BSL-1K dataset, a collection of British Sign Language
(BSL) signs of unprecedented scale; (2) We show that we can use BSL-1K to train
strong sign recognition models for co-articulated signs in BSL and that these
models additionally form excellent pretraining for other sign languages and
benchmarks - we exceed the state of the art on both the MSASL and WLASL
benchmarks. Finally, (3) we propose new large-scale evaluation sets for the
tasks of sign recognition and sign spotting and provide baselines which we hope
will serve to stimulate research in this area.
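- Abstract (reference translation): Recent progress in fine-grained gesture and action classification, and in machine translation, points to the possibility of automated sign language recognition becoming a reality.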
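The data-collection idea in the abstract (weakly-aligned subtitles combined with a keyword spotting method to localise sign instances) can be illustrated with a minimal sketch. This is not the authors' implementation. The `spotter` callable, the `Subtitle` and `SignAnnotation` types, and the `pad`, `fps`, and `threshold` values are all hypothetical and chosen only for illustration; the sketch assumes the spotter returns one confidence score per frame of the search window.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass
class Subtitle:
    start: float  # subtitle start time in seconds
    end: float    # subtitle end time in seconds
    text: str


@dataclass
class SignAnnotation:
    keyword: str
    time: float        # estimated time of the sign in seconds
    confidence: float  # spotter confidence at that time


# Hypothetical interface: (keyword, window_start, window_end) -> per-frame scores.
Spotter = Callable[[str, float, float], Sequence[float]]


def localise_signs(
    subtitles: Iterable[Subtitle],
    vocabulary: Iterable[str],
    spotter: Spotter,
    fps: float = 25.0,      # assumed frame rate of the spotter output
    pad: float = 4.0,       # assumed slack (seconds) for weak subtitle alignment
    threshold: float = 0.5  # assumed minimum confidence to keep an annotation
) -> List[SignAnnotation]:
    """Keep the highest-scoring frame per (subtitle, keyword) if it is confident."""
    vocab = {word.lower() for word in vocabulary}
    annotations: List[SignAnnotation] = []
    for sub in subtitles:
        words = {w.strip(".,!?;:").lower() for w in sub.text.split()}
        for keyword in sorted(words & vocab):
            # Subtitles are only weakly aligned, so search a padded window.
            win_start = max(0.0, sub.start - pad)
            win_end = sub.end + pad
            scores = spotter(keyword, win_start, win_end)
            if len(scores) == 0:
                continue
            best = max(range(len(scores)), key=lambda i: scores[i])
            if scores[best] >= threshold:
                annotations.append(
                    SignAnnotation(keyword, win_start + best / fps, scores[best])
                )
    return annotations
```

In this sketch the subtitle text only decides which keywords to search for and roughly where; the temporal localisation comes entirely from the spotter's score peak, which is why the window is padded rather than trusted.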
- Improving Continuous Sign Language Recognition with Cross-Lingual Signs [29.077175863743484]
Paper, reference translation (metadata) (2023-08-21T15:58:47Z)
- Weakly-supervised Fingerspelling Recognition in British Sign Language Videos [85.61513254261523]
Previous fingerspelling recognition methods have not focused on British Sign Language (BSL).
Paper, reference translation (metadata) (2022-11-16T15:02:36Z)
- Automatic dense annotation of large-vocabulary sign language videos [85.61513254261523]
Paper, reference translation (metadata) (2022-08-04T17:55:09Z)
- Scaling up sign spotting through sign language dictionaries [99.50956498009094]
The focus of this work is sign spotting: given a video of an isolated sign, the task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
The approach involves (1) watching existing footage that is sparsely labelled using mouthing cues, and (2) reading associated subtitles that provide additional translations of the signed content.
Paper, reference translation (metadata) (2022-05-09T10:00:03Z)
- Read and Attend: Temporal Localisation in Sign Language Videos [84.30262812057994]
Paper, reference translation (metadata) (2021-03-30T16:39:53Z)
- Watch, read and lookup: learning to spot signs from multiple supervisors [99.50956498009094]
Paper, reference translation (metadata) (2020-10-08T14:12:56Z)