Related papers: ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs

ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs

URL: http://arxiv.org/abs/2406.18120v2
Date: Fri, 12 Jul 2024 18:22:26 GMT
Title: ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs
Authors: Ahmed Heakl, Youssef Zaghloul, Mennatullah Ali, Rania Hossam, Walid Gomaa,
Abstract summary: This paper explores the intricacies of machine translation (MT) and automatic speech recognition (ASR) systems. We focus on translating code-switched Egyptian Arabic-English to either English or Egyptian Arabic. We present the methodologies employed in developing these systems, utilizing large language models such as LLama and Gemma.
Score: 1.6381055567716192
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Motivated by the widespread increase in the phenomenon of code-switching between Egyptian Arabic and English in recent times, this paper explores the intricacies of machine translation (MT) and automatic speech recognition (ASR) systems, focusing on translating code-switched Egyptian Arabic-English to either English or Egyptian Arabic. Our goal is to present the methodologies employed in developing these systems, utilizing large language models such as LLama and Gemma. In the field of ASR, we explore the utilization of the Whisper model for code-switched Egyptian Arabic recognition, detailing our experimental procedures including data preprocessing and training techniques. Through the implementation of a consecutive speech-to-text translation system that integrates ASR with MT, we aim to overcome challenges posed by limited resources and the unique characteristics of the Egyptian Arabic dialect. Evaluation against established metrics showcases promising results, with our methodologies yielding a significant improvement of $56\%$ in English translation over the state-of-the-art and $9.3\%$ in Arabic translation. Since code-switching is deeply inherent in spoken languages, it is crucial that ASR systems can effectively handle this phenomenon. This capability is crucial for enabling seamless interaction in various domains, including business negotiations, cultural exchanges, and academic discourse. Our models and code are available as open-source resources. Code: \url{http://github.com/ahmedheakl/arazn-llm}}, Models: \url{http://huggingface.co/collections/ahmedheakl/arazn-llm-662ceaf12777656607b9524e}.

Related papers

dziribot: rag based intelligent conversational agent for algerian arabic dialect [0.0]
This paper introduces DziriBOT, a hybrid intelligent conversational agent specifically engineered to overcome these challenges.<n>We propose a multi-layered architecture that integrates specialized Natural Language Understanding (NLU) with Retrieval-Augmented Generation (RAG)<n>Our experimental results demonstrate that the fine-tuned DziriBERT model achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-02-02T16:11:32Z)
MuDRiC: Multi-Dialect Reasoning for Arabic Commonsense Validation [30.670712065855902]
We introduce MuDRiC, an extended Arabic commonsense dataset incorporating multiple dialects, and (ii) a novel method adapting Graph Convolutional Networks (GCNs) to Arabic commonsense reasoning.<n>Our work enhances Arabic natural language understanding by providing both a foundational dataset and a novel method for handling its complex variations.
arXiv Detail & Related papers (2025-08-18T17:42:53Z)
Enhanced Arabic Text Retrieval with Attentive Relevance Scoring [12.053940320312355]
Arabic poses a particular challenge for natural language processing and information retrieval.<n>Despite the growing global significance of Arabic, it is still underrepresented in NLP research and benchmark resources.<n>We present an enhanced Dense Passage Retrieval framework developed specifically for Arabic.
arXiv Detail & Related papers (2025-07-31T10:18:28Z)
Advancing Arabic Reverse Dictionary Systems: A Transformer-Based Approach with Dataset Construction Guidelines [0.8944616102795021]
This study addresses the critical gap in Arabic natural language processing by developing an effective Arabic Reverse Dictionary (RD) system. We present a novel transformer-based approach with a semi-encoder neural network architecture featuring geometrically decreasing layers. Our methodology incorporates a comprehensive dataset construction process and establishes formal quality standards for Arabic lexicographic definitions.
arXiv Detail & Related papers (2025-04-30T09:56:36Z)
Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion [55.27025066199226]
This paper addresses the need for democratizing large language models (LLM) in the Arab world. One practical objective for an Arabic LLM is to utilize an Arabic-specific vocabulary for the tokenizer that could speed up decoding. Inspired by the vocabulary learning during Second Language (Arabic) Acquisition for humans, the released AraLLaMA employs progressive vocabulary expansion.
arXiv Detail & Related papers (2024-12-16T19:29:06Z)
Advanced Arabic Alphabet Sign Language Recognition Using Transfer Learning and Transformer Models [0.0]
This paper presents an Arabic Alphabet Sign Language recognition approach, using deep learning methods in conjunction with transfer learning and transformer-based models. We study the performance of the different variants on two publicly available datasets, namely ArSL2018 and AASL. Experimental results present evidence that the suggested methodology can receive a high recognition accuracy, by up to 99.6% and 99.43% on ArSL2018 and AASL, respectively.
arXiv Detail & Related papers (2024-10-01T13:39:26Z)
SCOPE: Sign Language Contextual Processing with Embedding from LLMs [49.5629738637893]
Sign languages, used by around 70 million Deaf individuals globally, are visual languages that convey visual and contextual information. Current methods in vision-based sign language recognition ( SLR) and translation (SLT) struggle with dialogue scenes due to limited dataset diversity and the neglect of contextually relevant information. We introduce SCOPE, a novel context-aware vision-based SLR and SLT framework.
arXiv Detail & Related papers (2024-09-02T08:56:12Z)
Exploring Retrieval Augmented Generation in Arabic [0.0]
Retrieval Augmented Generation (RAG) has emerged as a powerful technique in natural language processing. This paper presents a case study on the implementation and evaluation of RAG for Arabic text.
arXiv Detail & Related papers (2024-08-14T10:03:28Z)
ATHAR: A High-Quality and Diverse Dataset for Classical Arabic to English Translation [1.8109081066789847]
Classical Arabic represents a significant era, encompassing the golden age of Arab culture, philosophy, and scientific literature. We have identified a scarcity of translation datasets in Classical Arabic, which are often limited in scope and topics. We present the ATHAR dataset, comprising 66,000 high-quality Classical Arabic to English translation samples.
arXiv Detail & Related papers (2024-07-29T09:45:34Z)
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving [61.73180469072787]
We focus on the problem of spoken translation (ST) of code-switched speech in Indian languages to English text. We present a new end-to-end model architecture COSTA that scaffolds on pretrained automatic speech recognition (ASR) and machine translation (MT) modules. COSTA significantly outperforms many competitive cascaded and end-to-end multimodal baselines by up to 3.5 BLEU points.
arXiv Detail & Related papers (2024-06-16T16:10:51Z)
OSN-MDAD: Machine Translation Dataset for Arabic Multi-Dialectal Conversations on Online Social Media [5.2957928879391]
We propose an online social network-based multidialect Arabic dataset that is crafted by contextually translating English tweets into four Arabic dialects. Our results have shown a superior performance of our neural MT models trained using our dataset.
arXiv Detail & Related papers (2023-09-21T14:58:50Z)
AceGPT, Localizing Large Language Models in Arabic [73.39989503874634]
The paper proposes a comprehensive solution that includes pre-training with Arabic texts, Supervised Fine-Tuning (SFT) utilizing native Arabic instructions, and GPT-4 responses in Arabic. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
arXiv Detail & Related papers (2023-09-21T13:20:13Z)
On decoder-only architecture for speech-to-text and large language model integration [59.49886892602309]
Speech-LLaMA is a novel approach that effectively incorporates acoustic information into text-based large language models. We conduct experiments on multilingual speech-to-text translation tasks and demonstrate a significant improvement over strong baselines.
arXiv Detail & Related papers (2023-07-08T06:47:58Z)
Beyond Arabic: Software for Perso-Arabic Script Manipulation [67.31374614549237]
We provide a set of finite-state transducer (FST) components and corresponding utilities for manipulating the writing systems of languages that use the Perso-Arabic script. The library also provides simple FST-based romanization and transliteration.
arXiv Detail & Related papers (2023-01-26T20:37:03Z)
LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both situations by disentangling language-specific information. Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating different languages in frame-level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z)
Towards One Model to Rule All: Multilingual Strategy for Dialectal Code-Switching Arabic ASR [11.363966269198064]
We design a large multilingual end-to-end ASR using self-attention based conformer architecture. We trained the system using Arabic (Ar), English (En) and French (Fr) languages. Our findings demonstrate the strength of such a model by outperforming state-of-the-art monolingual dialectal Arabic and code-switching Arabic ASR.
arXiv Detail & Related papers (2021-05-31T08:20:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.