Transfer learning from High-Resource to Low-Resource Language Improves
Speech Affect Recognition Classification Accuracy
- URL: http://arxiv.org/abs/2103.11764v1
- Date: Thu, 4 Mar 2021 08:17:19 GMT
- Title: Transfer learning from High-Resource to Low-Resource Language Improves
Speech Affect Recognition Classification Accuracy
- Authors: Sara Durrani and Umair Arshad
- Abstract summary: We present an approach in which the model is trained on a high-resource language and fine-tuned to recognize affects in a low-resource language.
We train the model in the same-corpus setting on SAVEE, EMOVO, Urdu, and IEMOCAP, achieving baseline accuracies of 60.45, 68.05, 80.34, and 56.58 percent, respectively.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Speech Affect Recognition is the problem of extracting emotional affects from
audio data. Corpora for low-resource languages are rare, and affect recognition is a
difficult task in cross-corpus settings. We present an approach in which the
model is trained on a high-resource language and fine-tuned to recognize affects
in a low-resource language. We train the model in the same-corpus setting on SAVEE,
EMOVO, Urdu, and IEMOCAP, achieving baseline accuracies of 60.45, 68.05, 80.34,
and 56.58 percent, respectively. To capture the diversity of affects across
languages, cross-corpus evaluations are discussed in detail. We find that
accuracy improves when target-domain data is added to the training data.
Finally, we show that performance improves for low-resource-language speech
affect recognition, achieving UARs of 69.32 and 68.2 for Urdu and Italian
speech affects, respectively.
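The pre-train-then-fine-tune recipe the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the feature dimension, layer sizes, class counts, and the use of random tensors as stand-ins for MFCC-style features are all assumptions for the sake of a runnable example.

```python
# Hedged sketch: pre-train an affect classifier on a high-resource corpus,
# then keep the encoder and fine-tune a fresh head on low-resource data.
import torch
import torch.nn as nn

torch.manual_seed(0)

N_FEATURES = 40    # e.g. 40 MFCC coefficients per utterance (assumed)
HIGH_CLASSES = 7   # label-set size of the high-resource corpus (assumed)
LOW_CLASSES = 4    # label-set size of the low-resource corpus (assumed)

class AffectNet(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        # Shared encoder: its weights carry over to the low-resource task.
        self.encoder = nn.Sequential(
            nn.Linear(N_FEATURES, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.head(self.encoder(x))

def train(model, x, y, epochs=20, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

# 1) Pre-train on (synthetic stand-ins for) high-resource data.
x_hi = torch.randn(256, N_FEATURES)
y_hi = torch.randint(0, HIGH_CLASSES, (256,))
model = AffectNet(HIGH_CLASSES)
train(model, x_hi, y_hi)

# 2) Fine-tune: keep the encoder, swap the head for the low-resource labels.
model.head = nn.Linear(32, LOW_CLASSES)
x_lo = torch.randn(64, N_FEATURES)
y_lo = torch.randint(0, LOW_CLASSES, (64,))
train(model, x_lo, y_lo, epochs=10)

logits = model(x_lo[:5])
print(logits.shape)  # torch.Size([5, 4])
```

Swapping only the classification head while reusing the encoder is one common transfer strategy; variants also freeze the encoder or fine-tune all layers at a lower learning rate.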
Related papers
- CLARA: Multilingual Contrastive Learning for Audio Representation
Acquisition [5.520654376217889]
CLARA minimizes reliance on labelled data, enhancing generalization across languages.
Our approach adeptly captures emotional nuances in speech, overcoming subjective assessment issues.
It adapts to low-resource languages, marking progress in multilingual speech representation learning.
arXiv Detail & Related papers (2023-10-18T09:31:56Z)
- Adversarial Training For Low-Resource Disfluency Correction [50.51901599433536]
We propose an adversarially-trained sequence-tagging model for Disfluency Correction (DC)
We show the benefit of our proposed technique, which crucially depends on synthetically generated disfluent data, by evaluating it for DC in three Indian languages.
Our technique also performs well in removing stuttering disfluencies in ASR transcripts introduced by speech impairments.
arXiv Detail & Related papers (2023-06-10T08:58:53Z)
- Data Augmentation for Speech Recognition in Maltese: A Low-Resource Perspective [4.6898263272139795]
We consider data augmentation techniques for improving speech recognition in Maltese.
We consider three types of data augmentation: unsupervised training, multilingual training and the use of synthesized speech as training data.
Our results show that combining the three data augmentation techniques studied here leads to an absolute WER improvement of 15% without the use of a language model.
arXiv Detail & Related papers (2021-11-15T14:28:21Z)
- Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity [81.51206991542242]
Cross-lingual transfer offers a compelling way to help bridge this digital divide.
Current cross-lingual algorithms have shown success on text-based tasks and speech-related tasks for some low-resource languages.
We propose a language similarity approach that can efficiently identify acoustic cross-lingual transfer pairs across hundreds of languages.
arXiv Detail & Related papers (2021-11-02T01:55:17Z)
- Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language [32.170748231414365]
We show that training on even just a single related language gives the largest gain.
We also find that adding data from unrelated languages generally doesn't hurt performance.
arXiv Detail & Related papers (2021-06-24T08:37:05Z)
- Transfer Learning based Speech Affect Recognition in Urdu [0.0]
We pre-train a model on a high-resource-language affect recognition task and fine-tune the parameters for a low-resource language.
This approach achieves high Unweighted Average Recall (UAR) when compared with existing algorithms.
arXiv Detail & Related papers (2021-03-05T10:30:58Z)
- UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data [54.733889961024445]
We propose a unified pre-training approach called UniSpeech to learn speech representations with both unlabeled and labeled data.
We evaluate the effectiveness of UniSpeech for cross-lingual representation learning on public CommonVoice corpus.
arXiv Detail & Related papers (2021-01-19T12:53:43Z)
- Unsupervised Cross-lingual Representation Learning for Speech Recognition [63.85924123692923]
XLSR learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.
We build on wav2vec 2.0 which is trained by solving a contrastive task over masked latent speech representations.
Experiments show that cross-lingual pretraining significantly outperforms monolingual pretraining.
arXiv Detail & Related papers (2020-06-24T18:25:05Z)
- Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation [63.16500026845157]
We introduce speech-to-text translation as an auxiliary task to incorporate additional knowledge of the target language.
We show that training ST with human translations is not necessary.
Even with pseudo-labels from low-resource MT (200K examples), ST-enhanced transfer brings up to 8.9% WER reduction to direct transfer.
arXiv Detail & Related papers (2020-06-09T19:34:11Z)
- Meta-Transfer Learning for Code-Switched Speech Recognition [72.84247387728999]
We propose a new learning method, meta-transfer learning, to transfer learn on a code-switched speech recognition system in a low-resource setting.
Our model learns to recognize individual languages, and transfer them so as to better recognize mixed-language speech by conditioning the optimization on the code-switching data.
arXiv Detail & Related papers (2020-04-29T14:27:19Z)
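Several results above are reported as Unweighted Average Recall (UAR), the standard metric for class-imbalanced affect corpora. As a quick reference, UAR is the mean of the per-class recalls; a minimal implementation (with hypothetical labels) looks like this:

```python
# UAR (Unweighted Average Recall): average recall per class, so minority
# classes count as much as majority ones. Labels below are illustrative.
from collections import defaultdict

def uar(y_true, y_pred):
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

y_true = ["angry", "angry", "sad", "happy", "happy", "happy"]
y_pred = ["angry", "sad",   "sad", "happy", "angry", "happy"]
# Per-class recall: angry 1/2, sad 1/1, happy 2/3 -> mean ~0.7222
print(round(uar(y_true, y_pred), 4))  # 0.7222
```

Unlike plain accuracy, UAR is unaffected by how many samples each class has, which is why it is preferred when emotion classes are unevenly represented.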
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.