Exploring Teacher-Student Learning Approach for Multi-lingual
Speech-to-Intent Classification
- URL: http://arxiv.org/abs/2109.13486v1
- Date: Tue, 28 Sep 2021 04:43:11 GMT
- Authors: Bidisha Sharma, Maulik Madhavi, Xuehao Zhou, Haizhou Li
- Abstract summary: We develop an end-to-end system that supports multiple languages.
We exploit knowledge from a pre-trained multi-lingual natural language processing model.
- Score: 73.5497360800395
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: End-to-end speech-to-intent classification has shown its advantage in
harvesting information from both text and speech. In this paper, we study a
technique to develop such an end-to-end system that supports multiple
languages. To overcome the scarcity of multi-lingual speech corpora, we exploit
knowledge from a pre-trained multi-lingual natural language processing model.
Multi-lingual bidirectional encoder representations from transformers (mBERT)
models are trained on multiple languages and hence expected to perform well in
the multi-lingual scenario. In this work, we employ a teacher-student learning
approach to sufficiently extract information from an mBERT model to train a
multi-lingual speech model. In particular, we use synthesized speech generated
from an English-Mandarin text corpus for analysis and training of a
multi-lingual intent classification model. We also demonstrate that the
teacher-student learning approach obtains improved performance (91.02%) over
the traditional end-to-end (89.40%) intent classification approach in a
practical multi-lingual scenario.
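The teacher-student idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the training objective, so everything below is an assumption made for illustration: the student speech model is taken to produce both intent logits and an utterance embedding, the teacher (mBERT) is taken to supply a sentence embedding of the same dimension for the matching transcript, and the combined loss is a cross-entropy term plus a weighted mean-squared-error term. The function names and the `alpha` weight are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Negative log-probability of the true intent label."""
    return -math.log(softmax(logits)[label])


def embedding_mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between student and teacher embeddings."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimension")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def teacher_student_loss(
    student_logits: Sequence[float],
    label: int,
    student_embedding: Sequence[float],
    teacher_embedding: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight on the distillation term
) -> float:
    """Intent classification loss plus a term pulling the student's
    utterance embedding toward the teacher's text embedding (assumed form)."""
    classification = cross_entropy(student_logits, label)
    distillation = embedding_mse(student_embedding, teacher_embedding)
    return classification + alpha * distillation


if __name__ == "__main__":
    # Toy values only; real embeddings would come from the speech model and mBERT.
    loss = teacher_student_loss(
        student_logits=[2.0, 0.5, -1.0],
        label=0,
        student_embedding=[0.1, 0.4, -0.2],
        teacher_embedding=[0.0, 0.5, -0.1],
    )
    print(f"combined loss: {loss:.4f}")
```

With `alpha=0` the sketch reduces to plain end-to-end classification, which is one way to think about the baseline the abstract compares against; the paper itself may define both systems differently.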
Related papers
- Hindi as a Second Language: Improving Visually Grounded Speech with
Semantically Similar Samples [89.16814518860357]
The objective of this work is to explore the learning of visually grounded speech models (VGS) from a multilingual perspective.
Our key contribution in this work is to leverage the power of a high-resource language in a bilingual visually grounded speech model to improve the performance of a low-resource language.
arXiv Detail & Related papers (2023-03-30T16:34:10Z)
- Learning Cross-lingual Visual Speech Representations [108.68531445641769]
Cross-lingual self-supervised visual representation learning has been a growing research topic in the last few years.
We use the recently proposed Raw Audio-Visual Speech Encoders (RAVEn) framework to pre-train an audio-visual model with unlabelled data.
Our experiments show that multi-lingual models with more data outperform monolingual ones, but, when keeping the amount of data fixed, monolingual models tend to reach better performance.
arXiv Detail & Related papers (2023-03-14T17:05:08Z)
- Adapting Multilingual Speech Representation Model for a New,
Underresourced Language through Multilingual Fine-tuning and Continued
Pretraining [2.3513645401551333]
We investigate the possibility for adapting an existing multilingual wav2vec 2.0 model for a new language.
Our results show that continued pretraining is the most effective method to adapt a wav2vec 2.0 model for a new language.
We find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have a positive impact on speech recognition performance.
arXiv Detail & Related papers (2023-01-18T03:57:53Z)
- Distilling a Pretrained Language Model to a Multilingual ASR Model [3.4012007729454816]
We distill the rich knowledge embedded inside a well-trained teacher text model to the student speech model.
We show the superiority of our method on 20 low-resource languages of the CommonVoice dataset with less than 100 hours of speech data.
arXiv Detail & Related papers (2022-06-25T12:36:11Z)
- Generalizing Multimodal Pre-training into Multilingual via Language
Acquisition [54.69707237195554]
English-based Vision-Language Pre-training has achieved great success in various downstream tasks.
Some efforts have been made to generalize this success to non-English languages through Multilingual Vision-Language Pre-training.
We propose a MultiLingual Acquisition (MLA) framework that can easily generalize a monolingual Vision-Language Pre-training model into a multilingual one.
arXiv Detail & Related papers (2022-05-29T08:53:22Z)
- Towards Developing a Multilingual and Code-Mixed Visual Question
Answering System by Knowledge Distillation [20.33235443471006]
We propose a knowledge distillation approach to extend an English language-vision model (teacher) into an equally effective multilingual and code-mixed model (student).
We also create the large-scale multilingual and code-mixed VQA dataset in eleven different language setups.
Experimental results and in-depth analysis show the effectiveness of the proposed VQA model over the pre-trained language-vision models on eleven diverse language setups.
arXiv Detail & Related papers (2021-09-10T03:47:29Z)
- Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representations from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z)
- A Multilingual Modeling Method for Span-Extraction Reading Comprehension [2.4905424368103444]
We propose a multilingual extractive reading comprehension approach called XLRC.
We show that our model outperforms the state-of-the-art baseline (i.e., RoBERTa_Large) on the CMRC 2018 task.
arXiv Detail & Related papers (2021-05-31T11:05:30Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)