Locale Encoding For Scalable Multilingual Keyword Spotting Models
- URL: http://arxiv.org/abs/2302.12961v1
- Date: Sat, 25 Feb 2023 02:20:59 GMT
- Title: Locale Encoding For Scalable Multilingual Keyword Spotting Models
- Authors: Pai Zhu, Hyun Jin Park, Alex Park, Angelo Scorza Scarpati, Ignacio
Lopez Moreno
- Abstract summary: We propose two locale-conditioned universal models with locale
feature concatenation and feature-wise linear modulation (FiLM).
FiLM performed the best, improving on average FRR by 61% (relative) compared to
monolingual KWS models of similar sizes.
- Score: 8.385848547707953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Multilingual Keyword Spotting (KWS) system detects spoken keywords over
multiple locales. Conventional monolingual KWS approaches do not scale well to
multilingual scenarios because of high development/maintenance costs and lack of
resource sharing. To overcome this limit, we propose two locale-conditioned
universal models with locale feature concatenation and feature-wise linear
modulation (FiLM). We compare these models with two baseline methods:
locale-specific monolingual KWS, and a single universal model trained over all
data. Experiments over 10 localized language datasets show that
locale-conditioned models substantially improve accuracy over baseline methods
across all locales in different noise conditions. FiLM performed the best,
improving on average FRR by 61% (relative) compared to monolingual KWS models of
similar sizes.
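A minimal, illustrative sketch of the two conditioning strategies named in the abstract, written in PyTorch: concatenating a learned locale embedding to the input features, and FiLM, where the locale embedding predicts a per-channel scale and shift applied to encoder activations. This is not the authors' implementation; the GRU encoder and all module names and dimensions are assumptions.

```python
# Hypothetical sketch of locale conditioning for a multilingual KWS model.
# All names and sizes are illustrative, not taken from the paper.
import torch
import torch.nn as nn


class LocaleConcatKWS(nn.Module):
    """Conditioning via concatenation: append a locale embedding to every frame."""

    def __init__(self, n_locales: int, feat_dim: int = 40, locale_dim: int = 8):
        super().__init__()
        self.locale_emb = nn.Embedding(n_locales, locale_dim)
        self.encoder = nn.GRU(feat_dim + locale_dim, 128, batch_first=True)
        self.head = nn.Linear(128, 2)  # keyword / non-keyword

    def forward(self, feats, locale_ids):
        # feats: (batch, time, feat_dim); locale_ids: (batch,)
        loc = self.locale_emb(locale_ids).unsqueeze(1).expand(-1, feats.size(1), -1)
        h, _ = self.encoder(torch.cat([feats, loc], dim=-1))
        return self.head(h[:, -1])


class FiLMKWS(nn.Module):
    """Conditioning via FiLM: the locale embedding modulates encoder activations."""

    def __init__(self, n_locales: int, feat_dim: int = 40, hidden: int = 128):
        super().__init__()
        self.locale_emb = nn.Embedding(n_locales, 16)
        self.film = nn.Linear(16, 2 * hidden)  # predicts gamma and beta
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, feats, locale_ids):
        h, _ = self.encoder(feats)                      # (batch, time, hidden)
        gamma, beta = self.film(self.locale_emb(locale_ids)).chunk(2, dim=-1)
        h = gamma.unsqueeze(1) * h + beta.unsqueeze(1)  # feature-wise modulation
        return self.head(h[:, -1])
```

Either variant lets one universal model serve all locales at once, which is the resource-sharing advantage the abstract contrasts with per-locale monolingual models.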
Related papers
- Improving Spoken Language Identification with Map-Mix [16.40412419504484]
The pre-trained multi-lingual XLSR model generalizes well for language identification after fine-tuning on unseen languages.
Low resource dialect classification remains a challenging problem to solve.
We present a new data augmentation method that leverages model training dynamics of individual data points to improve sampling for latent mixup.
arXiv Detail & Related papers (2023-02-16T11:27:46Z)
- Zero-shot Cross-lingual Transfer is Under-specified Optimization [49.3779328255767]
We show that any linearly interpolated model between the source-language monolingual model and the source + target bilingual model has equally low source-language generalization error (a minimal interpolation sketch follows below).
We also show that the zero-shot solution lies in a non-flat region of the target-language generalization-error surface, causing high variance.
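A minimal sketch, under the assumption that "linearly interpolated model" means parameter-wise interpolation between the two checkpoints; the model, state-dict, and evaluation names are placeholders, not the paper's code.

```python
# Probe the linear path between a source-only monolingual checkpoint and a
# source+target bilingual checkpoint by interpolating their parameters.
# Assumes all state-dict entries are floating-point tensors.
import copy
import torch


def interpolate_state_dicts(mono_sd, bi_sd, alpha: float):
    """Return theta(alpha) = (1 - alpha) * theta_mono + alpha * theta_bi."""
    return {k: (1.0 - alpha) * mono_sd[k] + alpha * bi_sd[k] for k in mono_sd}


def sweep_interpolation(model, mono_sd, bi_sd, eval_fn, steps: int = 11):
    """Evaluate the interpolated model at evenly spaced points along the path."""
    results = []
    for i in range(steps):
        alpha = i / (steps - 1)
        probe = copy.deepcopy(model)
        probe.load_state_dict(interpolate_state_dicts(mono_sd, bi_sd, alpha))
        results.append((alpha, eval_fn(probe)))  # e.g. source/target dev error
    return results
```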
arXiv Detail & Related papers (2022-07-12T16:49:28Z)
- OneAligner: Zero-shot Cross-lingual Transfer with One Rich-Resource Language Pair for Low-Resource Sentence Retrieval [91.76575626229824]
We present OneAligner, an alignment model specially designed for sentence retrieval tasks.
When trained with all language pairs of a large-scale parallel multilingual corpus (OPUS-100), this model achieves the state-of-the-art result.
We conclude through empirical results and analyses that the performance of the sentence alignment task depends mostly on the monolingual and parallel data size.
arXiv Detail & Related papers (2022-05-17T19:52:42Z)
- Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training [21.934439663979663]
A multi-branch multilingual language model (MBLM) is built on multilingual pre-trained language models (MPLMs).
The method transfers knowledge from high-performance monolingual models with a teacher-student framework.
Results on two cross-lingual classification tasks show that, with only the task's supervised data used, our method improves both the supervised and zero-shot performance of MPLMs.
arXiv Detail & Related papers (2022-02-28T09:51:32Z)
- A Conditional Generative Matching Model for Multi-lingual Reply Suggestion [23.750966630981623]
We study the problem of a multilingual automated reply suggestion (RS) model serving many languages simultaneously.
We propose Conditional Generative Matching models (CGM) optimized within a Variational Autoencoder framework to address challenges arising from multi-lingual RS.
arXiv Detail & Related papers (2021-09-15T01:54:41Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- Cross-lingual Spoken Language Understanding with Regularized Representation Alignment [71.53159402053392]
We propose a regularization approach to align word-level and sentence-level representations across languages without any external resource.
Experiments on the cross-lingual spoken language understanding task show that our model outperforms current state-of-the-art methods in both few-shot and zero-shot scenarios.
arXiv Detail & Related papers (2020-09-30T08:56:53Z)
- Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank [46.626315158735615]
Pretrained multilingual contextual representations have shown great success, but due to the limits of their pretraining data, their benefits do not apply equally to all language varieties.
This presents a challenge for language varieties unfamiliar to these models, whose labeled and unlabeled data is too limited to train a monolingual model effectively.
We propose the use of additional language-specific pretraining and vocabulary augmentation to adapt multilingual models to low-resource settings.
arXiv Detail & Related papers (2020-09-29T16:12:52Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
- Structure-Level Knowledge Distillation For Multilingual Sequence Labeling [73.40368222437912]
We propose to reduce the gap between monolingual models and the unified multilingual model by distilling the structural knowledge of several monolingual models (teachers) into the unified multilingual model (student); a generic distillation sketch follows after this entry.
Our experiments on 4 multilingual tasks with 25 datasets show that our approaches outperform several strong baselines and have stronger zero-shot generalizability than both the baseline model and teacher models.
arXiv Detail & Related papers (2020-04-08T07:14:01Z)
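For illustration only, a generic teacher-student distillation step in PyTorch, as referenced by the structure-level distillation entry above. It uses plain token-level KL distillation as a stand-in rather than the paper's structure-level objective, and all model and batch-field names are hypothetical.

```python
# Generic teacher-student distillation for sequence labeling (token-level
# stand-in; the paper's actual method distills structure-level knowledge).
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL divergence between softened teacher and student label distributions.

    student_logits, teacher_logits: (batch, seq_len, num_labels)
    """
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # reduction="batchmean" divides by the batch dimension; scale by t^2 as usual
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)


def training_step(student, teachers, batch, gold_loss_fn, alpha: float = 0.5):
    """Mix the supervised loss with distillation from the language-matched teacher."""
    student_logits = student(batch["inputs"])
    with torch.no_grad():
        teacher_logits = teachers[batch["language"]](batch["inputs"])
    gold = gold_loss_fn(student_logits, batch["labels"])
    distill = distillation_loss(student_logits, teacher_logits)
    return alpha * gold + (1.0 - alpha) * distill
```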