ChaLearn LAP Large Scale Signer Independent Isolated Sign Language
Recognition Challenge: Design, Results and Future Research
- URL: http://arxiv.org/abs/2105.05066v1
- Date: Tue, 11 May 2021 14:17:39 GMT
- Title: ChaLearn LAP Large Scale Signer Independent Isolated Sign Language
Recognition Challenge: Design, Results and Future Research
- Authors: Ozge Mercanoglu Sincan, Julio C. S. Jacques Junior, Sergio Escalera,
Hacer Yalim Keles
- Abstract summary: This work summarises the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge, organised at CVPR 2021.
We discuss the challenge design, top winning solutions and suggestions for future research.
Winning teams achieved more than 96% recognition rate, and their approaches benefited from pose/hand/face estimation, transfer learning, external data, fusion/ensemble of modalities and different strategies to model spatio-temporal information.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The performances of Sign Language Recognition (SLR) systems have improved
considerably in recent years. However, several open challenges still need to be
solved before SLR can be useful in practice. Research in the field is still in
its infancy with regard to the robustness of models to a large diversity of
signs and signers, and to the fairness of models toward performers from
different demographics. This work summarises the ChaLearn LAP Large Scale Signer
Independent Isolated SLR Challenge, organised at CVPR 2021 with the goal of
overcoming some of the aforementioned challenges. We analyse and discuss the
challenge design, top winning solutions and suggestions for future research.
The challenge attracted 132 participants in the RGB track and 59 in the
RGB+Depth track, receiving more than 1.5K submissions in total. Participants
were evaluated using a new large-scale multi-modal Turkish Sign Language
(AUTSL) dataset, consisting of 226 sign labels and 36,302 isolated sign video
samples performed by 43 different signers. Winning teams achieved more than 96%
recognition rate, and their approaches benefited from pose/hand/face
estimation, transfer learning, external data, fusion/ensemble of modalities and
different strategies to model spatio-temporal information. However, methods
still fail to distinguish among very similar signs, in particular those sharing
similar hand trajectories.
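As an illustration of the fusion/ensemble strategy the winning teams relied on, below is a minimal PyTorch sketch of late score fusion across RGB and depth streams. The stream models, feature sizes and fusion weights are illustrative assumptions, not any team's actual configuration; only the 226-class AUTSL vocabulary comes from the paper.

```python
import torch
import torch.nn as nn

class LateFusionEnsemble(nn.Module):
    """Weighted average of per-modality class probabilities (illustrative)."""
    def __init__(self, streams: dict, weights: dict):
        super().__init__()
        self.streams = nn.ModuleDict(streams)
        self.weights = weights

    def forward(self, inputs: dict) -> torch.Tensor:
        fused = None
        for name, model in self.streams.items():
            probs = model(inputs[name]).softmax(dim=-1) * self.weights[name]
            fused = probs if fused is None else fused + probs
        return fused

# Hypothetical usage with two toy streams over pooled clip features.
num_classes = 226  # AUTSL has 226 sign labels
rgb_net = nn.Sequential(nn.Flatten(), nn.Linear(512, num_classes))
depth_net = nn.Sequential(nn.Flatten(), nn.Linear(512, num_classes))
ensemble = LateFusionEnsemble({"rgb": rgb_net, "depth": depth_net},
                              {"rgb": 0.6, "depth": 0.4})
batch = {"rgb": torch.randn(2, 512), "depth": torch.randn(2, 512)}
pred = ensemble(batch).argmax(dim=-1)  # predicted sign label per clip
```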
Related papers
- Training Strategies for Isolated Sign Language Recognition [72.27323884094953]
This paper introduces a comprehensive model training pipeline for Isolated Sign Language Recognition.
The constructed pipeline incorporates carefully selected image and video augmentations to tackle the challenges of low data quality and varying sign speeds.
We achieve a state-of-the-art result on the WLASL and Slovo benchmarks with 1.63% and 14.12% improvements compared to the previous best solution.
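The summary does not specify the augmentation pipeline; the following is a hedged sketch of how per-frame spatial augmentation plus a speed perturbation for varying sign speeds might look with torchvision and PyTorch, with all transform choices and parameters assumed.

```python
import torch
import torchvision.transforms as T

# Spatial augmentations applied per frame (illustrative choices).
frame_aug = T.Compose([
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),
    T.ColorJitter(brightness=0.2, contrast=0.2),
])

def temporal_resample(video: torch.Tensor, num_frames: int,
                      speed_range=(0.7, 1.3)) -> torch.Tensor:
    """Randomly stretch or compress a (T, C, H, W) clip in time to
    simulate varying signing speeds, then sample a fixed frame count."""
    t = video.shape[0]
    speed = torch.empty(1).uniform_(*speed_range).item()
    span = min(t, max(2, int(round(t / speed))))
    start = torch.randint(0, t - span + 1, (1,)).item()
    idx = torch.linspace(start, start + span - 1, num_frames).round().long()
    return video[idx]

clip = torch.rand(40, 3, 256, 256)             # toy 40-frame video
clip = temporal_resample(clip, num_frames=16)  # speed perturbation
clip = torch.stack([frame_aug(f) for f in clip])  # per-frame spatial aug
```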
arXiv Detail & Related papers (2024-12-16T08:37:58Z)
- Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora [79.03392191805028]
The BabyLM Challenge is a community effort to close the data-efficiency gap between human and computational language learners.
Participants compete to optimize language model training on a fixed language data budget of 100 million words or less.
arXiv Detail & Related papers (2024-12-06T16:06:08Z)
- Leveraging Contrastive Learning and Self-Training for Multimodal Emotion Recognition with Limited Labeled Samples [18.29910296652917]
We present our solutions for the Semi-Supervised Learning Sub-Challenge (MER2024-SEMI).
This challenge tackles the issue of limited annotated data in emotion recognition.
Our proposed method is validated to be effective on the MER2024-SEMI Challenge, achieving a weighted average F-score of 88.25% and ranking 6th on the leaderboard.
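Self-training with confidence-filtered pseudo-labels is a standard recipe for limited-label settings like this one; the sketch below shows the generic idea, with the classifier, feature size and confidence threshold all assumed rather than taken from the submission.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pseudo_label_step(model, unlabeled, threshold=0.95):
    """Keep only confident predictions on unlabeled data as pseudo-labels
    (a standard self-training recipe; the threshold is an assumed value)."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled), dim=-1)
    conf, labels = probs.max(dim=-1)
    keep = conf >= threshold
    return unlabeled[keep], labels[keep]

# Toy usage: a linear "emotion classifier" over pooled multimodal features.
clf = nn.Linear(128, 6)                       # 6 emotion classes (assumed)
x_unlab = torch.randn(32, 128)
x_pl, y_pl = pseudo_label_step(clf, x_unlab)  # confident subset + labels
# x_pl/y_pl would then be mixed into the labeled set for the next round.
```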
arXiv Detail & Related papers (2024-08-23T11:33:54Z)
- A Transformer Model for Boundary Detection in Continuous Sign Language [55.05986614979846]
The Transformer model is employed for both Isolated Sign Language Recognition and Continuous Sign Language Recognition.
The training process involves using isolated sign videos, where hand keypoint features extracted from the input video are enriched.
The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos.
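The paper's exact post-processing is not described here; a generic sketch of boundary detection from per-frame probabilities (moving-average smoothing plus edge extraction, assuming the clip starts and ends in a transition state) might look as follows.

```python
import numpy as np

def detect_boundaries(frame_probs: np.ndarray, smooth: int = 5,
                      threshold: float = 0.5):
    """Locate isolated-sign boundaries in a continuous video from
    per-frame 'sign vs. transition' probabilities (a generic sketch,
    not the paper's exact method)."""
    # Moving-average smoothing suppresses single-frame flicker.
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(frame_probs, kernel, mode="same")
    active = smoothed >= threshold
    # Boundaries are the rising/falling edges of the active mask;
    # assumes the clip starts and ends in a transition state.
    edges = np.flatnonzero(np.diff(active.astype(int)))
    if len(edges) % 2:
        edges = edges[:-1]
    return [(int(s) + 1, int(e)) for s, e in edges.reshape(-1, 2)]

probs = np.r_[np.zeros(10), np.ones(30) * 0.9, np.zeros(10)]
print(detect_boundaries(probs))  # [(10, 39)] -- one sign, frames 10..39
```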
arXiv Detail & Related papers (2024-02-22T17:25:01Z)
- Towards the extraction of robust sign embeddings for low resource sign language recognition [7.969704867355098]
We show that keypoint-based embeddings can transfer between sign languages and achieve competitive performance.
We furthermore achieve better performance using fine-tuned transferred embeddings than models trained only on the target sign language.
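As a sketch of the transfer recipe (pretrain a keypoint encoder on a source sign language, then fine-tune it with a new head on the target one), here is an assumed Transformer-based embedder in PyTorch; the architecture, checkpoint name and class count are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class KeypointEmbedder(nn.Module):
    """Transformer encoder over per-frame keypoint vectors; an
    illustrative stand-in for the paper's sign embeddings."""
    def __init__(self, kp_dim=150, dim=256):
        super().__init__()
        self.inp = nn.Linear(kp_dim, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, kp):               # kp: (B, T, kp_dim)
        z = self.enc(self.inp(kp))
        return z.mean(dim=1)             # mean-pool over time -> embedding

embedder = KeypointEmbedder()
# embedder.load_state_dict(torch.load("source_sl_embedder.pt"))  # assumed file
head = nn.Linear(256, 100)               # 100 target signs (toy number)
emb = embedder(torch.randn(4, 32, 150))  # 4 clips, 32 frames, 75 xy keypoints
logits = head(emb)                       # fine-tune embedder + head jointly
```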
arXiv Detail & Related papers (2023-06-30T11:21:40Z)
- MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning [90.17500229142755]
The first Multimodal Emotion Recognition Challenge (MER 2023) was successfully held at ACM Multimedia.
This paper introduces the motivation behind the challenge, describes the benchmark dataset, and provides some statistics about the participants.
We believe this high-quality dataset can become a new benchmark in multimodal emotion recognition, especially for the Chinese research community.
arXiv Detail & Related papers (2023-04-18T13:23:42Z)
- Word level Bangla Sign Language Dataset for Continuous BSL Recognition [0.0]
We develop an attention-based Bi-GRU model that captures the temporal dynamics of pose information for individuals communicating through sign language.
The accuracy of the model is reported to be 85.64%.
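A minimal PyTorch sketch of an attention-based Bi-GRU over pose sequences, in the spirit of the described model; all layer sizes and the pose dimensionality are guesses.

```python
import torch
import torch.nn as nn

class AttentiveBiGRU(nn.Module):
    """Bi-GRU with additive temporal attention over pose sequences
    (sizes are assumptions, not the paper's)."""
    def __init__(self, pose_dim=66, hidden=128, num_classes=30):
        super().__init__()
        self.gru = nn.GRU(pose_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, poses):                    # (B, T, pose_dim)
        out, _ = self.gru(poses)                 # (B, T, 2*hidden)
        weights = self.attn(out).softmax(dim=1)  # attention over time
        context = (weights * out).sum(dim=1)     # weighted temporal pooling
        return self.head(context)

model = AttentiveBiGRU()
logits = model(torch.randn(2, 40, 66))  # 2 clips, 40 frames of 33 xy joints
```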
arXiv Detail & Related papers (2023-02-22T18:55:54Z)
- NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research [96.53307645791179]
We introduce the Never-Ending VIsual-classification Stream (NEVIS'22), a benchmark consisting of a stream of over 100 visual classification tasks.
Despite being limited to classification, the resulting stream has a rich diversity of tasks from OCR, to texture analysis, scene recognition, and so forth.
Overall, NEVIS'22 poses an unprecedented challenge for current sequential learning approaches due to the scale and diversity of tasks.
arXiv Detail & Related papers (2022-11-15T18:57:46Z)
- Word separation in continuous sign language using isolated signs and post-processing [47.436298331905775]
We propose a two-stage model for Continuous Sign Language Recognition.
In the first stage, the predictor model, which combines a CNN, SVD, and an LSTM, is trained on isolated signs.
In the second stage, we apply a post-processing algorithm to the Softmax outputs obtained from the first part of the model.
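A hedged stage-1 sketch, assuming per-frame CNN features, an SVD projection for dimensionality reduction and an LSTM classifier; all sizes are illustrative, and the stage-2 post-processing would consume the softmax scores produced at the end.

```python
import torch
import torch.nn as nn

# Stage-1 sketch: CNN features per frame, SVD-style projection, LSTM
# over the reduced sequence (layer sizes are illustrative).
cnn = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())  # -> (B*T, 16)
lstm = nn.LSTM(input_size=8, hidden_size=64, batch_first=True)
head = nn.Linear(64, 50)                            # toy sign vocabulary

video = torch.rand(2, 20, 3, 112, 112)              # (B, T, C, H, W)
b, t = video.shape[:2]
feats = cnn(video.flatten(0, 1)).reshape(b, t, -1)  # per-frame features
# Project onto the top-8 right singular vectors (SVD reduction).
_, _, vh = torch.linalg.svd(feats.reshape(-1, 16), full_matrices=False)
reduced = feats @ vh[:8].T                          # (B, T, 8)
out, _ = lstm(reduced)
probs = head(out[:, -1]).softmax(dim=-1)  # softmax scores fed to stage 2
```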
arXiv Detail & Related papers (2022-04-02T18:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.