Improving Spoken Language Identification with Map-Mix
- URL: http://arxiv.org/abs/2302.08229v1
- Date: Thu, 16 Feb 2023 11:27:46 GMT
- Title: Improving Spoken Language Identification with Map-Mix
- Authors: Shangeth Rajaa, Kriti Anandan, Swaraj Dalmia, Tarun Gupta, Eng Siong Chng
- Abstract summary: The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages.
Low-resource dialect classification remains a challenging problem to solve.
We present a new data augmentation method that leverages model training dynamics of individual data points to improve sampling for latent mixup.
- Score: 16.40412419504484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pre-trained multi-lingual XLSR model generalizes well for language
identification after fine-tuning on unseen languages. However, the performance
significantly degrades when the languages are not very distinct from each
other, for example, in the case of dialects. Low-resource dialect
classification remains a challenging problem to solve. We present a new data
augmentation method that leverages model training dynamics of individual data
points to improve sampling for latent mixup. The method works well in
low-resource settings where generalization is paramount. Our datamaps-based
mixup technique, which we call Map-Mix, improves weighted F1 scores by 2%
compared to the random mixup baseline and yields a significantly
better-calibrated model. The code for our method is open-sourced at
https://github.com/skit-ai/Map-Mix.
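As a rough illustration of the idea in the abstract, here is a minimal sketch of datamaps-guided latent mixup: mixing partners are sampled according to per-example training-dynamics statistics instead of uniformly at random. The sampling rule (favoring low-confidence examples), the Beta mixing distribution, and all names are illustrative assumptions; the abstract does not specify the authors' exact scheme.

```python
import torch

def datamap_weights(confidence: torch.Tensor) -> torch.Tensor:
    """Turn per-example mean confidence (logged over training epochs)
    into sampling weights; here, low-confidence examples are favored."""
    w = 1.0 - confidence
    return w / w.sum()

def map_mix(hidden: torch.Tensor, labels: torch.Tensor,
            confidence: torch.Tensor, alpha: float = 0.4):
    """Mix latent representations with partners sampled via datamap weights.

    hidden:     (B, D) latent features from an encoder layer
    labels:     (B, C) one-hot (or soft) label matrix
    confidence: (B,) mean model confidence on the gold label
    """
    B = hidden.size(0)
    partners = torch.multinomial(datamap_weights(confidence), B,
                                 replacement=True)
    lam = torch.distributions.Beta(alpha, alpha).sample((B, 1))
    mixed_h = lam * hidden + (1 - lam) * hidden[partners]
    mixed_y = lam * labels + (1 - lam) * labels[partners]
    return mixed_h, mixed_y

# Toy usage: 8 examples, 16-dim latents, 3 dialect classes.
h = torch.randn(8, 16)
y = torch.eye(3)[torch.randint(0, 3, (8,))]
conf = torch.rand(8)        # would come from logged training dynamics
mh, my = map_mix(h, y, conf)
print(mh.shape, my.shape)   # torch.Size([8, 16]) torch.Size([8, 3])
```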
Related papers
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
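A loose sketch of the alignment idea above: penalize the entropic optimal-transport cost between batches of latent vectors from the two languages so their posteriors align. The Sinkhorn iteration is the generic textbook version; the paper's actual objective, cost function, and regularization are not given in the summary and are assumed here.

```python
import torch

def sinkhorn_cost(x: torch.Tensor, y: torch.Tensor,
                  eps: float = 0.1, n_iters: int = 50) -> torch.Tensor:
    """Entropy-regularized OT cost between two point clouds of shape (B, D)."""
    cost = torch.cdist(x, y) ** 2      # pairwise squared distances
    cost = cost / cost.max()           # normalize for numerical stability
    K = torch.exp(-cost / eps)         # Gibbs kernel
    a = torch.full((x.size(0),), 1.0 / x.size(0))
    b = torch.full((y.size(0),), 1.0 / y.size(0))
    u = torch.ones_like(a)
    for _ in range(n_iters):           # Sinkhorn fixed-point updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :] # approximate transport plan
    return (plan * cost).sum()

# Toy usage: align English and low-resource latents from a parser encoder.
en_latents = torch.randn(32, 64)
lr_latents = torch.randn(32, 64)
loss_align = sinkhorn_cost(en_latents, lr_latents)  # add to the parsing loss
print(loss_align.item())
```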
- Locale Encoding For Scalable Multilingual Keyword Spotting Models [8.385848547707953]
We propose two locale-conditioned universal models with locale feature concatenation and feature-wise linear modulation (FiLM).
FiLM performed the best, improving average FRR by 61% (relative) compared to monolingual KWS models of similar sizes.
arXiv Detail & Related papers (2023-02-25T02:20:59Z)
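For orientation, a minimal sketch of feature-wise linear modulation (FiLM) conditioned on a locale id, the better-performing of the two variants above. Layer sizes, the embedding dimension, and where the block sits in the KWS network are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FiLMBlock(nn.Module):
    """Scales and shifts acoustic features per-channel from a locale id."""
    def __init__(self, n_locales: int, feat_dim: int, locale_dim: int = 16):
        super().__init__()
        self.locale_emb = nn.Embedding(n_locales, locale_dim)
        self.to_gamma = nn.Linear(locale_dim, feat_dim)  # per-channel scale
        self.to_beta = nn.Linear(locale_dim, feat_dim)   # per-channel shift

    def forward(self, feats, locale_ids):
        z = self.locale_emb(locale_ids)         # (B, locale_dim)
        gamma = self.to_gamma(z).unsqueeze(1)   # (B, 1, feat_dim)
        beta = self.to_beta(z).unsqueeze(1)
        return gamma * feats + beta             # broadcast over time frames

# Toy usage: batch of 4 utterances, 100 frames, 40 filterbank features.
film = FiLMBlock(n_locales=10, feat_dim=40)
x = torch.randn(4, 100, 40)
locales = torch.tensor([0, 3, 3, 7])
print(film(x, locales).shape)   # torch.Size([4, 100, 40])
```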
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- Non-Linear Pairwise Language Mappings for Low-Resource Multilingual Acoustic Model Fusion [26.728287476234538]
Hybrid DNN-HMM acoustic model fusion is proposed in a multilingual setup for low-resource languages.
Posterior distributions from different monolingual acoustic models, evaluated against a target-language speech signal, are fused together.
A separate regression neural network is trained for each source-target language pair to transform posteriors from the source acoustic model to the target language.
arXiv Detail & Related papers (2022-07-07T15:56:50Z)
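A simplified sketch of the pairwise mapping above: one small regression network per source-target pair maps source-model posteriors to the target phone set, and mapped posteriors from all sources are fused. The network shape and the plain-average fusion rule are assumptions; the paper may combine them differently.

```python
import torch
import torch.nn as nn

class PosteriorMapper(nn.Module):
    """Non-linear map from source phone posteriors to target posteriors."""
    def __init__(self, src_dim: int, tgt_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(src_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, tgt_dim), nn.LogSoftmax(dim=-1),
        )

    def forward(self, src_posteriors):
        return self.net(src_posteriors)

def fuse(mappers, src_posteriors_list):
    """Average mapped posteriors from several source languages."""
    mapped = [m(p).exp() for m, p in zip(mappers, src_posteriors_list)]
    return torch.stack(mapped).mean(dim=0)

# Toy usage: two source models (42 and 39 phones) mapped to 45 target phones.
mappers = [PosteriorMapper(42, 45), PosteriorMapper(39, 45)]
posts = [torch.softmax(torch.randn(10, 42), -1),
         torch.softmax(torch.randn(10, 39), -1)]
print(fuse(mappers, posts).shape)   # torch.Size([10, 45])
```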
- OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval [91.76575626229824]
We present OneAligner, an alignment model specially designed for sentence retrieval tasks.
When trained on all language pairs of a large-scale parallel multilingual corpus (OPUS-100), this model achieves state-of-the-art results.
We conclude through empirical results and analyses that the performance of the sentence alignment task depends mostly on the monolingual and parallel data size.
arXiv Detail & Related papers (2022-05-17T19:52:42Z)
- From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension [51.953428342923885]
We develop a two-stage approach to enhance the model performance.
The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer.
The second stage focuses on precision: an answer-aware contrastive learning mechanism is developed to learn the fine difference between the accurate answer and other candidates.
arXiv Detail & Related papers (2021-12-09T07:31:15Z)
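A loose sketch of the second-stage idea: an answer-aware contrastive (InfoNCE-style) loss that pulls the gold answer toward the question representation and pushes away the other top-k candidates. The encoder, similarity function, and temperature are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def answer_contrastive_loss(q: torch.Tensor, cands: torch.Tensor,
                            gold_idx: int, tau: float = 0.1) -> torch.Tensor:
    """q: (D,) question/context vector; cands: (K, D) candidate answer
    vectors from the top-k of stage one; gold_idx: index of the true answer."""
    sims = F.cosine_similarity(q.unsqueeze(0), cands) / tau   # (K,)
    # InfoNCE over the candidate set: the gold answer is the positive.
    return F.cross_entropy(sims.unsqueeze(0), torch.tensor([gold_idx]))

# Toy usage with k=5 candidates in a 32-dim space.
q = torch.randn(32)
cands = torch.randn(5, 32)
print(answer_contrastive_loss(q, cands, gold_idx=2).item())
```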
- Cross-lingual alignments of ELMo contextual embeddings [0.0]
Cross-lingual embeddings map word embeddings from a low-resource language to a high-resource language.
To produce cross-lingual mappings of recent contextual embeddings, anchor points between the embedding spaces have to be words in the same context.
We propose novel cross-lingual mapping methods for ELMo embeddings.
arXiv Detail & Related papers (2021-06-30T11:26:43Z)
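For context, the standard baseline for this mapping problem: given anchor pairs of contextual vectors (the same word in the same context in both spaces, e.g. from aligned sentences), solve orthogonal Procrustes for a linear map into the target space. The paper proposes novel methods beyond this; the sketch is only the usual starting point.

```python
import numpy as np

def procrustes_map(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Orthogonal W minimizing ||src @ W - tgt||_F over anchor pairs (N, D)."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Toy usage: 300 anchor pairs in a 64-dim space; src is a rotated copy of
# tgt, so the recovered map should invert the rotation exactly.
rng = np.random.default_rng(0)
tgt_anchor = rng.standard_normal((300, 64))
true_rot = np.linalg.qr(rng.standard_normal((64, 64)))[0]
src_anchor = tgt_anchor @ true_rot.T
W = procrustes_map(src_anchor, tgt_anchor)
print(np.allclose(src_anchor @ W, tgt_anchor))  # True
```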
- Code Switching Language Model Using Monolingual Training Data [0.0]
Training a code-switching (CS) language model using only monolingual data is still an ongoing research problem.
In this work, an RNN language model is trained using alternate batches from only monolingual English and Spanish data.
Results were consistently improved by applying a mean squared error (MSE) loss to the output embeddings of the RNN-based language model.
arXiv Detail & Related papers (2020-12-23T08:56:39Z)
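A condensed sketch of the recipe above: one shared RNN language model consumes alternating monolingual English and Spanish batches, with an auxiliary MSE term on output embeddings. Tying the output embeddings of translation-equivalent word pairs is one plausible reading of the summary's "MSE in the output embeddings"; the word-pair source, loss weight, and data loaders below are placeholders.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden = 1000, 64, 128
model = nn.ModuleDict({
    "emb": nn.Embedding(vocab_size, emb_dim),
    "rnn": nn.LSTM(emb_dim, hidden, batch_first=True),
    "out": nn.Linear(hidden, vocab_size),
})
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Hypothetical dictionary of translation-equivalent word ids (en_id, es_id).
word_pairs = torch.randint(0, vocab_size, (50, 2))

def batch(lang_seed):  # stand-in for a real monolingual data loader
    g = torch.Generator().manual_seed(lang_seed)
    return torch.randint(0, vocab_size, (8, 20), generator=g)

for step in range(4):
    tokens = batch(step)            # even steps: "English", odd: "Spanish"
    x, y = tokens[:, :-1], tokens[:, 1:]
    h, _ = model["rnn"](model["emb"](x))
    logits = model["out"](h)
    lm_loss = ce(logits.reshape(-1, vocab_size), y.reshape(-1))
    # MSE between output-embedding rows of paired translation words.
    w = model["out"].weight         # (vocab, hidden): output embeddings
    mse = ((w[word_pairs[:, 0]] - w[word_pairs[:, 1]]) ** 2).mean()
    loss = lm_loss + 0.1 * mse
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: lm={lm_loss.item():.3f} mse={mse.item():.5f}")
```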
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work compares a neural model with character language models trained on varying amounts of target-language data.
Our usage scenario is interactive correction with nearly zero training examples, improving models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)