The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing
Multilingual ASR
- URL: http://arxiv.org/abs/2305.19584v1
- Date: Wed, 31 May 2023 06:09:11 GMT
- Title: The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing
Multilingual ASR
- Authors: Kaousheik Jayakumar, Vrunda N. Sukhadia, A Arunkumar, S. Umesh
- Abstract summary: Building a multilingual Automated Speech Recognition system in a linguistically diverse country like India can be a challenging task.
This problem can be solved by exploiting the fact that many of these languages are phonetically similar.
New approaches are explored and compared to improve the performance of the CLS-based multilingual ASR model.
- Score: 0.2676349883103404
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Building a multilingual Automated Speech Recognition (ASR) system in a
linguistically diverse country like India can be a challenging task due to the
differences in scripts and the limited availability of speech data. This
problem can be solved by exploiting the fact that many of these languages are
phonetically similar. These languages can be converted into a Common Label Set
(CLS) by mapping similar sounds to common labels. In this paper, new approaches
are explored and compared to improve the performance of the CLS-based
multilingual ASR model. Language-specific information is infused into the ASR
model by providing a Language ID or by using a CLS-to-native-script converter
on top of the CLS multilingual model. These methods yield a significant
improvement in Word Error Rate (WER) over the CLS baseline and are further
evaluated on out-of-distribution data to check their robustness.
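To make the two ideas concrete, the sketch below pairs a toy grapheme-to-CLS mapping with a language-ID prefix token and a CLS-to-native-script back-converter. The mapping, tag format, and function names are illustrative assumptions, not the paper's actual label set or converter.

```python
# Toy sketch of a Common Label Set (CLS) pipeline with a language-ID tag.
# The grapheme-to-CLS mapping below is invented for illustration; the real
# label set covers many Indian languages and a much larger symbol inventory.

CLS_MAP = {
    "hi": {"क": "ka", "ख": "kha", "ग": "ga"},   # Hindi graphemes -> CLS labels
    "ta": {"க": "ka", "ங": "nga"},              # Tamil graphemes -> CLS labels
}

# Inverting each mapping gives a simple CLS -> native-script converter.
NATIVE_MAP = {lang: {v: k for k, v in m.items()} for lang, m in CLS_MAP.items()}

def to_cls(text: str, lang: str, add_lang_tag: bool = True) -> list:
    """Map native-script graphemes to CLS labels, optionally prefixing a language ID."""
    labels = [CLS_MAP[lang].get(ch, ch) for ch in text]
    return ([f"<{lang}>"] if add_lang_tag else []) + labels

def to_native(labels: list, lang: str) -> str:
    """Convert CLS labels back into the native script of the chosen language."""
    return "".join(NATIVE_MAP[lang].get(lab, lab) for lab in labels)

if __name__ == "__main__":
    cls_seq = to_cls("कख", "hi")            # ['<hi>', 'ka', 'kha']
    print(cls_seq)
    print(to_native(cls_seq[1:], "hi"))     # कख
```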
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of a source text in a different target language.
Instruction-tuned large language models (LLMs) currently excel at various English tasks.
However, recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z)
- Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer [17.700515986659063]
Code-Switching (CS) multilingual Automatic Speech Recognition (ASR) models can transcribe speech containing two or more alternating languages during a conversation.
This paper proposes a new method for creating code-switching ASR datasets from purely monolingual data sources.
A novel Concatenated Tokenizer enables ASR models to generate language ID for each emitted text token while reusing existing monolingual tokenizers.
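A minimal sketch of the concatenated-tokenizer idea, assuming two tiny word-level tokenizers whose vocabularies are simply offset so every token ID also identifies its language; the paper's tokenizer builds on real monolingual subword tokenizers and is more involved.

```python
# Sketch: concatenate monolingual tokenizers by offsetting their token IDs,
# so every emitted ID also identifies the language it came from.
# The word-level vocabularies here are stand-ins for real subword tokenizers.

class ConcatenatedTokenizer:
    def __init__(self, vocabs: dict):
        # Token IDs for each language start where the previous vocabulary ended.
        self.id_to_token = []
        for lang, vocab in vocabs.items():
            self.id_to_token.extend((lang, tok) for tok in vocab)
        self.token_to_id = {pair: i for i, pair in enumerate(self.id_to_token)}

    def encode(self, tokens: list, lang: str) -> list:
        return [self.token_to_id[(lang, t)] for t in tokens]

    def decode(self, ids: list) -> list:
        # Each ID maps back to (language, token), so a language ID comes for free.
        return [self.id_to_token[i] for i in ids]

tok = ConcatenatedTokenizer({"en": ["hello", "world"], "es": ["hola", "mundo"]})
ids = tok.encode(["hello"], "en") + tok.encode(["mundo"], "es")
print(tok.decode(ids))   # [('en', 'hello'), ('es', 'mundo')]
```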
arXiv Detail & Related papers (2023-06-14T21:24:11Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common Label Set [0.0]
Common Label Set (CLS) maps graphemes of various languages with similar sounds to common labels.
Since Indian languages are largely phonetic, building a transliteration system to convert from native script to CLS is straightforward.
We propose a novel architecture called Multilingual-Decoder-Decoder for building multilingual systems.
arXiv Detail & Related papers (2022-10-30T04:01:26Z)
- Learning ASR pathways: A sparse multilingual ASR model [31.147484652643282]
We present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways").
With the overlapping sub-networks, the shared parameters can also enable knowledge transfer for lower-resource languages via joint multilingual training.
Our proposed ASR pathways outperform both dense models and a language-agnostically pruned model, and provide better performance on low-resource languages.
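The pathway idea can be sketched as per-language binary masks over one shared weight matrix, where overlapping mask entries are the shared parameters; the masks below are random placeholders, not the learned sparsity patterns of the paper.

```python
import numpy as np

# Sketch: language-specific binary masks over one shared weight matrix.
# Overlapping mask entries are parameters shared by several languages,
# which is where cross-lingual knowledge transfer can happen.

rng = np.random.default_rng(0)
shared_weights = rng.normal(size=(4, 4))

# Hypothetical pathways: roughly half the weights active per language, with overlap.
masks = {
    "hi": (rng.random((4, 4)) < 0.5).astype(shared_weights.dtype),
    "ta": (rng.random((4, 4)) < 0.5).astype(shared_weights.dtype),
}

def forward(x: np.ndarray, lang: str) -> np.ndarray:
    """Apply only the sub-network ("pathway") selected for this language."""
    return x @ (shared_weights * masks[lang])

x = rng.normal(size=(1, 4))
print(forward(x, "hi").shape)                      # (1, 4)
overlap = (masks["hi"] * masks["ta"]).sum()
print(f"parameters shared by hi and ta: {int(overlap)} of 16")
```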
arXiv Detail & Related papers (2022-09-13T05:14:08Z)
- LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both situations by disentangling language-specific information.
Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating different languages at the frame level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z)
- Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding [90.87454350016121]
We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels.
We develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer.
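A toy sketch of code-switched hard-negative generation, assuming a tiny made-up bilingual lexicon and a word-level swap rate; the paper's multi-level schemes are considerably richer than this.

```python
import random

# Toy code-switching augmentation: replace some source words with their
# translations to create "hard negative" variants of an utterance.
# The lexicon and swap rate are illustrative placeholders.

LEXICON = {"book": "kitaab", "water": "paani", "please": "kripya"}

def code_switch(sentence: str, swap_prob: float = 0.5, seed: int = 0) -> str:
    """Return a code-switched variant of the sentence for contrastive training."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        if word.lower() in LEXICON and rng.random() < swap_prob:
            out.append(LEXICON[word.lower()])   # swap in the other language
        else:
            out.append(word)
    return " ".join(out)

anchor = "please bring the book and water"
hard_negative = code_switch(anchor)
print(hard_negative)   # e.g. "kripya bring the book and paani"
```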
arXiv Detail & Related papers (2022-05-07T13:44:28Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- GLUECoS: An Evaluation Benchmark for Code-Switched NLP [17.066725832825423]
We present an evaluation benchmark, GLUECoS, for code-switched languages.
We present results on several NLP tasks in English-Hindi and English-Spanish.
We fine-tune multilingual models on artificially generated code-switched data.
arXiv Detail & Related papers (2020-04-26T13:28:34Z)