DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common
Label Set
- URL: http://arxiv.org/abs/2210.16739v1
- Date: Sun, 30 Oct 2022 04:01:26 GMT
- Title: DuDe: Dual-Decoder Multilingual ASR for Indian Languages using Common
Label Set
- Authors: Arunkumar A, Mudit Batra, Umesh S
- Abstract summary: Common Label Set (CLS) maps graphemes of various languages with similar sounds to common labels.
Since Indian languages are mostly phonetic, building a parser to convert from native script to CLS is easy.
We propose a novel architecture called Encoder-Decoder-Decoder for building multilingual systems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a multilingual country like India, multilingual Automatic Speech
Recognition (ASR) systems have much scope. Multilingual ASR systems exhibit
many advantages like scalability, maintainability, and improved performance
over the monolingual ASR systems. However, building multilingual systems for
Indian languages is challenging since different languages use different scripts
for writing. On the other hand, Indian languages share a lot of common sounds.
Common Label Set (CLS) exploits this idea and maps graphemes of various
languages with similar sounds to common labels. Since Indian languages are
mostly phonetic, building a parser to convert from native script to CLS is
easy. In this paper, we explore various approaches to build multilingual ASR
models. We also propose a novel architecture called Encoder-Decoder-Decoder for
building multilingual systems that use both CLS and native script labels. We
also analyzed the effectiveness of CLS-based multilingual systems combined with
machine transliteration.
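The CLS idea described in the abstract can be illustrated with a minimal sketch. It assumes a tiny hand-written table (`TOY_CLS`) and a helper (`to_cls`), both hypothetical; the labels "ka" and "ma" are illustrative placeholders, and the actual CLS inventory and parser rules used by the authors are not given here and are likely more involved (for example, handling vowel signs and conjuncts).

```python
# Toy illustration of a common label set: graphemes from different scripts
# that represent a similar sound share one label.
# TOY_CLS and its labels are hypothetical, not the authors' actual CLS.
TOY_CLS = {
    "क": "ka",  # Devanagari (Hindi)
    "க": "ka",  # Tamil
    "ক": "ka",  # Bengali
    "म": "ma",  # Devanagari (Hindi)
    "ம": "ma",  # Tamil
    "ম": "ma",  # Bengali
}


def to_cls(text):
    """Map each character to its common label; unmapped characters become '<unk>'."""
    return [TOY_CLS.get(ch, "<unk>") for ch in text]


if __name__ == "__main__":
    # The same two sounds written in three scripts collapse to one label sequence.
    for word in ("कम", "கம", "কম"):
        print(word, to_cls(word))
```

Because all three scripts map to the same label sequence, a single multilingual model can share output units across languages; recovering the native script from CLS labels would need a separate step, which this sketch does not cover.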
Related papers
- The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR [0.2676349883103404]
Building a multilingual Automated Speech Recognition system in a linguistically diverse country like India can be a challenging task.
This problem can be solved by exploiting the fact that many of these languages are phonetically similar.
New approaches are explored and compared to improve the performance of CLS-based multilingual ASR models.
arXiv Detail & Related papers (2023-05-31T06:09:11Z)
- LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both situations by disentangling language-specific information.
Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating between languages at the frame level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z)
- Code Switched and Code Mixed Speech Recognition for Indic languages [0.0]
Training multilingual automatic speech recognition (ASR) systems is challenging because acoustic and lexical information is typically language specific.
We compare the performance of an end-to-end multilingual speech recognition system to that of monolingual models conditioned on language identification (LID).
We also propose a similar technique for the code-switched problem and achieve WERs of 21.77 and 28.27 on Hindi-English and Bengali-English, respectively.
arXiv Detail & Related papers (2022-03-30T18:09:28Z)
- Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition [71.49308685090324]
This paper investigates the influence of different factors (i.e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language.
We find that unique sounds, similar sounds, and tone languages remain a major challenge for phonetic inventory discovery.
arXiv Detail & Related papers (2022-01-26T22:12:55Z)
- Dual Script E2E framework for Multilingual and Code-Switching ASR [4.697788649564087]
We train multilingual and code-switching ASR systems for Indian languages.
Inspired by results in text-to-speech synthesis, we use an in-house rule-based common label set (CLS) representation.
We show our results on the multilingual and code-switching tasks of the Indic ASR Challenge 2021.
arXiv Detail & Related papers (2021-06-02T18:08:27Z)
- Multilingual and code-switching ASR challenges for low resource Indian languages [59.2906853285309]
We focus on building multilingual and code-switching ASR systems through two different subtasks related to a total of seven Indian languages.
We provide a total of 600 hours of transcribed speech data, comprising train and test sets, in these languages.
We also provide a baseline recipe for both tasks, with WERs of 30.73% and 32.45% on the test sets of the multilingual and code-switching subtasks, respectively.
arXiv Detail & Related papers (2021-04-01T03:37:01Z)
- Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages [51.0542215642794]
We propose a novel acoustics based intent recognition system that uses discovered phonetic units for intent classification.
We present results for two language families, Indic languages and Romance languages, on two different intent recognition tasks.
arXiv Detail & Related papers (2020-11-07T00:35:31Z)
- How Phonotactics Affect Multilingual and Zero-shot ASR Performance [74.70048598292583]
A Transformer encoder-decoder model has been shown to leverage multilingual data well in IPA transcriptions of languages presented during training.
We replace the encoder-decoder with a hybrid ASR system consisting of a separate AM and LM.
We show that the gain from modeling crosslingual phonotactics is limited, and that imposing too strong a model can hurt zero-shot transfer.
arXiv Detail & Related papers (2020-10-22T23:07:24Z)
- That Sounds Familiar: an Analysis of Phonetic Representations Transfer Across Languages [72.9927937955371]
We use the resources existing in other languages to train a multilingual automatic speech recognition model.
We observe significant improvements across all languages in the multilingual setting, and stark degradation in the crosslingual setting.
Our analysis uncovered that even the phones that are unique to a single language can benefit greatly from adding training data from other languages.
arXiv Detail & Related papers (2020-05-16T22:28:09Z)
- Language-agnostic Multilingual Modeling [23.06484126933893]
We build a language-agnostic multilingual ASR system which transforms all languages to one writing system through a many-to-one transliteration transducer.
We show with four Indic languages, namely, Hindi, Bengali, Tamil and Kannada, that the language-agnostic multilingual model achieves up to 10% relative reduction in Word Error Rate (WER) over a language-dependent multilingual model.
arXiv Detail & Related papers (2020-04-20T18:57:43Z)