Related papers: A Comparative Study of Continuous Sign Language Recognition Techniques

Related papers

Meaningful Pose-Based Sign Language Evaluation [29.030154300749086]
The study covers keypoint distance-based, embedding-based, and back-translation-based metrics. We show tradeoffs between different metrics in different scenarios through automatic meta-evaluation of sign-level retrieval and a human correlation study of text-to-pose translation across different sign languages.
arXiv Detail & Related papers (2025-10-08T19:00:24Z)
An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training [8.613149007067143]
We systematically examine how model architecture, data representation, and training robustness influence the pre-training stage. By examining cluster distribution and phonemic alignments, we investigate the effective use of discrete vocabulary.
arXiv Detail & Related papers (2025-09-03T18:11:53Z)
Benchmarking Prosody Encoding in Discrete Speech Tokens [13.60092490447892]
This study focuses on prosodic encoding based on their sensitivity to the artificially modified prosody, aiming to provide practical guidelines for designing discrete tokens. In particular, speech language models are expected to understand and generate responses that reflect not only the semantic content but also prosodic features.
arXiv Detail & Related papers (2025-08-15T05:11:16Z)
Handling Symbolic Language in Student Texts: A Comparative Study of NLP Embedding Models [0.0]
This study explores how contemporary embedding models differ in their capability to process and interpret science-related symbolic expressions. Our findings reveal significant differences in model performance, with OpenAI's GPT-text-embedding-3-large outperforming all other examined models.
arXiv Detail & Related papers (2025-05-23T14:26:33Z)
Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator [55.94334001112357]
We introduce a multilingual sign language model, Signs as Tokens (SOKE), which can generate 3D sign avatars autoregressively from text inputs. We propose a retrieval-enhanced SLG approach, which incorporates external sign dictionaries to provide accurate word-level signs.
arXiv Detail & Related papers (2024-11-26T18:28:09Z)
SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction [65.1590372072555]
SHuBERT (Sign Hidden-Unit BERT) is a self-supervised contextual representation model learned from 1,000 hours of American Sign Language video. SHuBERT adapts masked token prediction objectives to multi-stream visual sign language input, learning to predict multiple targets corresponding to clustered hand, face, and body pose streams. SHuBERT achieves state-of-the-art performance across multiple tasks including sign language translation, isolated sign language recognition, and fingerspelling detection.
arXiv Detail & Related papers (2024-11-25T03:13:08Z)
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs). Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO). We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT4-o score and human evaluation.
arXiv Detail & Related papers (2024-11-04T06:07:53Z)
MACT: Model-Agnostic Cross-Lingual Training for Discourse Representation Structure Parsing [4.536003573070846]
We introduce a cross-lingual training strategy for semantic representation parsing models. It exploits the alignments between languages encoded in pre-trained language models. Experiments show significant improvements in DRS clause and graph parsing in English, German, Italian and Dutch.
arXiv Detail & Related papers (2024-06-03T07:02:57Z)
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors [9.279391026742658]
We analyze the effect of model size, training objectives, and model architecture on the models' performance as a feature extractor. We develop a novel metric, the Phonetic-Syntax Ratio (PSR), to measure the phonetic and syntactic information in the extracted representations.
arXiv Detail & Related papers (2023-11-27T15:58:28Z)
Classification of Phonological Parameters in Sign Languages [0.0]
Linguistic research often breaks down signs into constituent parts to study sign languages. We show how a single model can be used to recognise the individual phonological parameters within sign languages.
arXiv Detail & Related papers (2022-05-24T13:40:45Z)
Modeling Intensification for Sign Language Generation: A Computational Approach [13.57903290481737]
End-to-end sign language generation models do not accurately represent the prosody in sign language. We aim to improve the prosody in generated sign languages by modeling intensification in a data-driven manner. We find that our efforts in intensification modeling yield better results when evaluated with automatic metrics.
arXiv Detail & Related papers (2022-03-18T01:13:21Z)
Interpreting Language Models Through Knowledge Graph Extraction [42.97929497661778]
We compare BERT-based language models through snapshots of acquired knowledge at sequential stages of the training process. We present a methodology to unveil a knowledge acquisition timeline by generating knowledge graph extracts from cloze "fill-in-the-blank" statements. We extend this analysis to a comparison of pretrained variations of BERT models (DistilBERT, BERT-base, RoBERTa).
arXiv Detail & Related papers (2021-11-16T15:18:01Z)
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition [98.70304981174748]
We focus on the general application of pretrained speech representations to advanced end-to-end automatic speech recognition (E2E-ASR) models. We select several pretrained speech representations and present the experimental results on various open-source and publicly available corpora for E2E-ASR.
arXiv Detail & Related papers (2021-10-09T15:06:09Z)
Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task. The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them. By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation. We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource. Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
arXiv Detail & Related papers (2020-09-30T08:56:53Z)
Learning Universal Representations from Word to Sentence [89.82415322763475]
This work introduces and explores universal representation learning, i.e., embeddings of different levels of linguistic units in a uniform vector space. We present our approach of constructing analogy datasets in terms of words, phrases and sentences. We empirically verify that well pre-trained Transformer models incorporated with appropriate training settings may effectively yield universal representation.
arXiv Detail & Related papers (2020-09-10T03:53:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.