GLaM-Sign: Greek Language Multimodal Lip Reading with Integrated Sign Language Accessibility
- URL: http://arxiv.org/abs/2501.05213v1
- Date: Thu, 09 Jan 2025 13:06:47 GMT
- Title: GLaM-Sign: Greek Language Multimodal Lip Reading with Integrated Sign Language Accessibility
- Authors: Dimitris Kouremenos, Klimis Ntalianis,
- Abstract summary: This dataset underscores the transformative potential of multimodal resources in bridging communication gaps.<n>It is a groundbreaking resource in accessibility and multimodal AI, designed to support Deaf and Hard-of-Hearing (DHH) individuals.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Greek Language Multimodal Lip Reading with Integrated Sign Language Accessibility (GLaM-Sign) [1] is a groundbreaking resource in accessibility and multimodal AI, designed to support Deaf and Hard-of-Hearing (DHH) individuals. Developed from the FEELIT project [2], it integrates high-resolution audio, video, textual transcriptions, and Greek Sign Language translations for applications like real-time sign language translation and enhanced subtitle synchronization. While its primary focus is on promoting inclusivity in the Greek tourism sector, its adaptability extends to education, healthcare, and public services. Future advancements will enhance word-level precision and scalability to additional languages, supported by advanced AI methodologies and collaborations with diverse stakeholders. This dataset underscores the transformative potential of multimodal resources in bridging communication gaps, fostering innovation, and setting a benchmark for ethical AI and inclusive technologies.
Related papers
- Analyzing and Improving Cross-lingual Knowledge Transfer for Machine Translation [5.878901309908815]
We study cross-lingual knowledge transfer in neural models and develop methods to improve robustness and generalization in multilingual settings.<n>We examine the role of language diversity during training and show that increasing translation coverage improves generalization and reduces off-target behavior.
arXiv Detail & Related papers (2026-01-07T15:51:54Z) - Opportunities and Challenges of Natural Language Processing for Low-Resource Senegalese Languages in Social Science Research [0.6016863427924156]
This paper provides the first comprehensive overview of progress and challenges for the six national languages officially recognized by the Senegalese Constitution: Wolof, Pulaar, Sereer, Joola, Mandingue, and Soninke.<n>We synthesize linguistic, sociotechnical, and infrastructural factors that shape their digital readiness and identify gaps in data, tools, and benchmarks.<n>The paper concludes by outlining a roadmap toward sustainable, community-centered NLP ecosystems for Senegalese languages.
arXiv Detail & Related papers (2025-12-24T20:20:31Z) - PART: Progressive Alignment Representation Training for Multilingual Speech-To-Text with LLMs [58.2469845374385]
We introduce Progressive Alignment Representation Training (PART)<n>PART is a multi-stage and multi-task framework that separates within-language from cross-language alignment.<n>Experiments on CommonVoice 15, Fleurs, Wenetspeech, and CoVoST2 show that PART surpasses conventional approaches.
arXiv Detail & Related papers (2025-09-24T03:54:14Z) - Towards Inclusive Communication: A Unified Framework for Generating Spoken Language from Sign, Lip, and Audio [52.859261069569165]
We propose the first unified framework capable of handling diverse combinations of sign language, lip movements, and audio for spoken-language text generation.<n>We focus on three main objectives: (i) designing a unified, modality-agnostic architecture capable of effectively processing heterogeneous inputs; (ii) exploring the underexamined synergy among modalities, particularly the role of lip movements as non-manual cues in sign language comprehension; and (iii) achieving performance on par with or better than state-of-the-art models specialized for individual tasks.
arXiv Detail & Related papers (2025-08-28T06:51:42Z) - Preserving Cultural Identity with Context-Aware Translation Through Multi-Agent AI Systems [0.4218593777811082]
Language is a cornerstone of cultural identity, yet globalization and the dominance of major languages have placed nearly 3,000 languages at risk of extinction.
Existing AI-driven translation models prioritize efficiency but often fail to capture cultural nuances, idiomatic expressions, and historical significance.
We propose a multi-agent AI framework designed for culturally adaptive translation in underserved language communities.
arXiv Detail & Related papers (2025-03-05T06:43:59Z) - Exploring Cross-lingual Latent Transplantation: Mutual Opportunities and Open Challenges [48.96952594416528]
Current large language models (LLMs) often exhibit imbalances in multilingual capabilities and cultural adaptability.<n>XTransplant framework enables models to harness the complementary strengths of both English and non-English resources by transplanting latent activations across languages.
arXiv Detail & Related papers (2024-12-17T09:05:30Z) - Real-Time Multilingual Sign Language Processing [4.626189039960495]
Sign Language Processing (SLP) is an interdisciplinary field comprised of Natural Language Processing (NLP) and Computer Vision.<n>Traditional approaches have often been constrained by the use of gloss-based systems that are both language-specific and inadequate for capturing the multidimensional nature of sign language.<n>We propose the use of SignWiring, a universal sign language transcription notation system, to serve as an intermediary link between the visual-gestural modality of signed languages and text-based linguistic representations.
arXiv Detail & Related papers (2024-12-02T21:51:41Z) - ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Multilingual Contrastive Framework [78.07201802874529]
ShifCon is a Shift-based multilingual Contrastive framework that aligns the internal forward process of other languages toward that of the dominant one.<n>Experiments demonstrate that our ShifCon framework significantly enhances the performance of non-dominant languages.
arXiv Detail & Related papers (2024-10-25T10:28:59Z) - Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
Lens is a novel approach to enhance multilingual capabilities of large language models (LLMs)
It operates by manipulating the hidden representations within the language-agnostic and language-specific subspaces from top layers of LLMs.
It achieves superior results with much fewer computational resources compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z) - Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection [49.27067541740956]
Speech Emotion Recognition (SER) is a crucial component in developing general-purpose AI agents capable of natural human-computer interaction.<n>Building robust multilingual SER systems remains challenging due to the scarcity of labeled data in languages other than English and Chinese.<n>We propose an approach to enhance SER performance in low SER resource languages by leveraging data from high-resource languages.
arXiv Detail & Related papers (2024-09-17T08:36:45Z) - Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features [18.76505158652759]
We propose to exploit both semantic and linguistic features between multiple languages to enhance multilingual translation.
On the encoder side, we introduce a disentangling learning task that aligns encoder representations by disentangling semantic and linguistic features.
On the decoder side, we leverage a linguistic encoder to integrate low-level linguistic features to assist in the target language generation.
arXiv Detail & Related papers (2024-08-02T17:10:12Z) - A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers [51.8203871494146]
The rapid development of Large Language Models (LLMs) demonstrates remarkable multilingual capabilities in natural language processing.<n>Despite the breakthroughs of LLMs, the investigation into the multilingual scenario remains insufficient.<n>This survey aims to help the research community address multilingual problems and provide a comprehensive understanding of the core concepts, key techniques, and latest developments in multilingual natural language processing based on LLMs.
arXiv Detail & Related papers (2024-05-17T17:47:39Z) - SUTRA: Scalable Multilingual Language Model Architecture [5.771289785515227]
We introduce SUTRA, a multilingual Large Language Model architecture capable of understanding, reasoning, and generating text in over 50 languages.
Through extensive evaluations, SUTRA is demonstrated to surpass existing models like GPT-3.5, Llama2 by 20-30% on leading Massive Multitask Language Understanding (MMLU) benchmarks.
Our findings suggest that SUTRA not only fills pivotal gaps in multilingual model capabilities but also establishes a new benchmark for operational efficiency and scalability in AI applications.
arXiv Detail & Related papers (2024-05-07T20:11:44Z) - From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation [0.0]
generative large language models (LLMs) stand at the forefront of innovation, showcasing unparalleled abilities in text understanding and generation.
However, the limited representation of low-resource languages like Ukrainian poses a notable challenge, restricting the reach and relevance of this technology.
Our paper addresses this by fine-tuning the open-source Gemma and Mistral LLMs with Ukrainian datasets, aiming to improve their linguistic proficiency.
arXiv Detail & Related papers (2024-04-14T04:25:41Z) - Language Detection for Transliterated Content [0.0]
We study the widespread use of transliteration, where the English alphabet is employed to convey messages in native languages.
This paper addresses this challenge through a dataset of phone text messages in Hindi and Russian transliterated into English.
The research pioneers innovative approaches to identify and convert transliterated text.
arXiv Detail & Related papers (2024-01-09T15:40:54Z) - Towards a Deep Understanding of Multilingual End-to-End Speech
Translation [52.26739715012842]
We analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages.
We derive three major findings from our analysis.
arXiv Detail & Related papers (2023-10-31T13:50:55Z) - Revamping Multilingual Agreement Bidirectionally via Switched
Back-translation for Multilingual Neural Machine Translation [107.83158521848372]
multilingual agreement (MA) has shown its importance for multilingual neural machine translation (MNMT)
We present textbfBidirectional textbfMultilingual textbfAgreement via textbfSwitched textbfBack-textbftranslation (textbfBMA-SBT)
It is a novel and universal multilingual agreement framework for fine-tuning pre-trained MNMT models.
arXiv Detail & Related papers (2022-09-28T09:14:58Z) - Generalizing Multimodal Pre-training into Multilingual via Language
Acquisition [54.69707237195554]
English-based Vision-Language Pre-training has achieved great success in various downstream tasks.
Some efforts have been taken to generalize this success to non-English languages through Multilingual Vision-Language Pre-training.
We propose a textbfMultitextbfLingual textbfAcquisition (MLA) framework that can easily generalize a monolingual Vision-Language Pre-training model into multilingual.
arXiv Detail & Related papers (2022-05-29T08:53:22Z) - VECO: Variable and Flexible Cross-lingual Pre-training for Language
Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.