A Transformer Model for Boundary Detection in Continuous Sign Language
- URL: http://arxiv.org/abs/2402.14720v1
- Date: Thu, 22 Feb 2024 17:25:01 GMT
- Title: A Transformer Model for Boundary Detection in Continuous Sign Language
- Authors: Razieh Rastgoo, Kourosh Kiani, Sergio Escalera
- Abstract summary: The Transformer model is employed for both Isolated Sign Language Recognition and Continuous Sign Language Recognition.
The training process involves using isolated sign videos, where hand keypoint features extracted from the input video are enriched using the Transformer model.
The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos.
- Score: 55.05986614979846
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sign Language Recognition (SLR) has garnered significant attention from
researchers in recent years, particularly the intricate domain of Continuous
Sign Language Recognition (CSLR), which presents heightened complexity compared
to Isolated Sign Language Recognition (ISLR). One of the prominent challenges
in CSLR pertains to accurately detecting the boundaries of isolated signs
within a continuous video stream. Additionally, the reliance on handcrafted
features in existing models poses a challenge to achieving optimal accuracy. To
surmount these challenges, we propose a novel approach utilizing a
Transformer-based model. Unlike traditional models, our approach focuses on
enhancing accuracy while eliminating the need for handcrafted features. The
Transformer model is employed for both ISLR and CSLR. The training process
involves using isolated sign videos, where hand keypoint features extracted
from the input video are enriched using the Transformer model. Subsequently,
these enriched features are forwarded to the final classification layer. The
trained model, coupled with a post-processing method, is then applied to detect
isolated sign boundaries within continuous sign videos. The evaluation of our
model, conducted on two distinct datasets that include both continuous signs
and their corresponding isolated signs, demonstrates promising results.
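The abstract does not describe the post-processing step in detail, so the following is only a minimal sketch of one plausible form it could take, not the authors' method. It assumes the trained isolated-sign classifier has already been run over sliding windows of a continuous video and has produced one softmax vector per window. The function name `detect_sign_segments` and the parameters `threshold` and `min_length` are hypothetical and chosen purely for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def detect_sign_segments(
    window_probs: Sequence[Sequence[float]],
    threshold: float = 0.8,   # assumed confidence cut-off, not from the paper
    min_length: int = 2,      # assumed minimum run length, not from the paper
) -> List[Tuple[int, int, int]]:
    """Group consecutive confident windows that share a predicted class.

    Returns (first_window, last_window, class_index) tuples. Windows whose
    top softmax score is below `threshold` are treated as transitions
    between signs, so they mark the boundaries of the detected segments.
    """
    segments: List[Tuple[int, int, int]] = []
    current_label: Optional[int] = None  # None means "inside a transition"
    run_start = 0

    # A trailing None acts as a sentinel that flushes the final run.
    for i, probs in enumerate(list(window_probs) + [None]):
        if probs is None:
            label = None
        else:
            best = max(range(len(probs)), key=lambda k: probs[k])
            label = best if probs[best] >= threshold else None

        if label != current_label:
            # The previous run ended at window i - 1; keep it if long enough.
            if current_label is not None and i - run_start >= min_length:
                segments.append((run_start, i - 1, current_label))
            current_label = label
            run_start = i

    return segments


# Toy example: two classes, six windows, one low-confidence transition.
example_probs = [
    [0.90, 0.10],
    [0.95, 0.05],
    [0.50, 0.50],  # ambiguous window, treated as a boundary region
    [0.10, 0.90],
    [0.05, 0.95],
    [0.15, 0.85],
]
print(detect_sign_segments(example_probs))
```

In this toy input, windows 0 to 1 form a segment of class 0 and windows 3 to 5 form a segment of class 1, with window 2 acting as the boundary between them. A real system would also need to map window indices back to frame indices, which depends on the window size and stride used.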
Related papers
- Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining [0.6144680854063939]
State-of-the-art Conformer model for Speech Recognition is adapted for continuous sign language recognition.
This marks the first instance of employing Conformer for a vision-based task.
Unsupervised pretraining is conducted on a curated sign language dataset.
arXiv Detail & Related papers (2024-05-20T13:40:52Z) - FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
Pretrained Language Models (PLMs) have given rise to another paradigm, which takes as inputs the sentences of textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z) - STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition [50.064502884594376]
We study the problem of human action recognition using motion capture (MoCap) sequences.
We propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences.
The proposed method achieves state-of-the-art performance compared to skeleton-based and point-cloud-based models.
arXiv Detail & Related papers (2023-03-31T16:19:27Z) - A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition [0.0]
We propose a novel Contrastive Transformer-based model, which is shown to learn rich representations from body keypoint sequences.
Experiments showed that the model could generalize well and achieved competitive results for sign classes never seen in the training process.
arXiv Detail & Related papers (2022-04-05T11:42:55Z) - Word separation in continuous sign language using isolated signs and post-processing [47.436298331905775]
We propose a two-stage model for Continuous Sign Language Recognition.
In the first stage, the predictor model, which includes a combination of CNN, SVD, and LSTM, is trained with the isolated signs.
In the second stage, we apply a post-processing algorithm to the Softmax outputs obtained from the first part of the model.
arXiv Detail & Related papers (2022-04-02T18:34:33Z) - Multi-Modal Zero-Shot Sign Language Recognition [51.07720650677784]
We propose a multi-modal Zero-Shot Sign Language Recognition model.
A Transformer-based model along with a C3D model is used for hand detection and deep feature extraction.
A semantic space is used to map the visual features to the lingual embedding of the class labels.
arXiv Detail & Related papers (2021-09-02T09:10:39Z) - Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks [37.679114155300084]
Sign Language Production (SLP) must embody both the continuous articulation and full morphology of sign to be truly understandable by the Deaf community.
We propose a novel Progressive Transformer architecture, the first SLP model to translate from spoken language sentences to continuous 3D sign pose sequences.
We present extensive data augmentation techniques to reduce prediction drift, alongside an adversarial training regime and a Mixture Density Network (MDN) formulation to produce realistic and expressive sign pose sequences.
arXiv Detail & Related papers (2021-03-11T22:11:17Z) - A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata [73.38551379469533]
DAD:DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems.
It combines deep learning and timed automata to create a behavioral model from observations.
The algorithm has been applied to a few data sets, including two from real systems, and has shown promising results.
arXiv Detail & Related papers (2020-10-29T08:27:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.