Language-specific Characteristic Assistance for Code-switching Speech
Recognition
- URL: http://arxiv.org/abs/2206.14580v1
- Date: Wed, 29 Jun 2022 13:39:51 GMT
- Title: Language-specific Characteristic Assistance for Code-switching Speech
Recognition
- Authors: Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv,
Yuqin Lin, Jianwu Dang
- Abstract summary: The dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition.
Existing methods place no language constraints on the LSEs and underutilize the language-specific knowledge of the LSMs.
We propose a language-specific characteristic assistance (LSCA) method to mitigate the above problems.
- Score: 42.32330582682405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dual-encoder structure successfully utilizes two language-specific
encoders (LSEs) for code-switching speech recognition. Because the LSEs are
initialized from two pre-trained language-specific models (LSMs), the
dual-encoder structure can exploit sufficient monolingual data and capture the
attributes of each language. However, existing methods place no language
constraints on the LSEs and underutilize the language-specific knowledge of the
LSMs. In this paper, we propose a language-specific characteristic assistance
(LSCA) method to mitigate the above problems. Specifically, during training, we
introduce two language-specific losses as language constraints and generate the
corresponding language-specific targets for them. During decoding, we take the
decoding abilities of the LSMs into account by combining the output
probabilities of the two LSMs and the mixture model to obtain the final
predictions. Experiments show that either the training or the decoding method
of LSCA improves the model's performance. Furthermore, combining the training
and decoding methods of LSCA yields up to a 15.4% relative error reduction on
the code-switching test set. Moreover, with our method the system can handle
code-switching speech recognition well using two pre-trained LSMs, without
extra shared parameters or even retraining.
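For illustration, the decoding fusion described in the abstract can be sketched as a weighted interpolation of the three output distributions. The snippet below is a minimal sketch only, assuming a shared vocabulary and a single scalar weight; the function name `lsca_decode_step`, the weight `lam`, and the equal split over the two LSMs are illustrative assumptions, not the paper's actual combination rule or implementation.

```python
# Hedged sketch of LSCA-style decoding fusion: the mixture model's output
# distribution is interpolated with those of the two language-specific
# models (LSMs). The weight `lam` and the equal split over the LSMs are
# illustrative assumptions, not values taken from the paper.
import torch
import torch.nn.functional as F


def lsca_decode_step(mix_logits: torch.Tensor,
                     zh_logits: torch.Tensor,
                     en_logits: torch.Tensor,
                     lam: float = 0.6) -> torch.Tensor:
    """Combine one decoding step's (batch, vocab) logits into final log-probs."""
    p_mix = F.softmax(mix_logits, dim=-1)
    p_zh = F.softmax(zh_logits, dim=-1)
    p_en = F.softmax(en_logits, dim=-1)
    # Weighted interpolation: mixture model plus an equal split over the LSMs.
    p_final = lam * p_mix + (1.0 - lam) * 0.5 * (p_zh + p_en)
    return p_final.log()  # log-probabilities for beam-search scoring


if __name__ == "__main__":
    # Random logits stand in for real model outputs in this toy usage.
    batch, vocab = 2, 5000
    logp = lsca_decode_step(torch.randn(batch, vocab),
                            torch.randn(batch, vocab),
                            torch.randn(batch, vocab))
    print(logp.shape)  # torch.Size([2, 5000])
```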
Related papers
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- Code-Switching Curriculum Learning for Multilingual Transfer in LLMs [43.85646680303273]
Large language models (LLMs) exhibit near human-level performance in various tasks, but their performance drops drastically beyond a handful of high-resource languages.
Inspired by the human process of second language acquisition, we propose code-switching curriculum learning (CSCL) to enhance cross-lingual transfer for LLMs.
CSCL mimics the stages of human language learning by progressively training models with a curriculum consisting of 1) token-level code-switching, 2) sentence-level code-switching, and 3) monolingual corpora (a rough sketch of this three-stage curriculum appears after this list).
arXiv Detail & Related papers (2024-11-04T06:31:26Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show that a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by an LSP (LLM-based symbolic program) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting [45.161909551392085]
We introduce an encoder prompting technique within the self-conditioned CTC framework, enabling language-specific adaptation of the CTC model in a zero-shot manner.
Our method has been shown to reduce errors significantly: by 28% on average and by 41% on low-resource languages.
arXiv Detail & Related papers (2024-06-18T13:38:58Z)
- Speculative Contrastive Decoding [55.378200871224074]
Large language models (LLMs) exhibit exceptional performance in language tasks, yet their auto-regressive inference is limited by high computational requirements and sub-optimal due to exposure bias.
Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding (SCD), a straightforward yet powerful decoding approach.
arXiv Detail & Related papers (2023-11-15T14:15:30Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both situations by disentangling language-specific information.
Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating different languages at the frame level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z)
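As referenced in the Code-Switching Curriculum Learning (CSCL) entry above, its three-stage curriculum can be pictured with a rough sketch. Everything below is inferred from the two-sentence summary only: the bilingual lexicon, the swap probability `p`, and the helper names are hypothetical, and the paper's actual data construction may differ.

```python
# Rough sketch of a CSCL-style curriculum: token-level code-switching,
# then sentence-level code-switching, then monolingual data. All inputs
# (lexicon, corpora, swap probability) are hypothetical placeholders.
import random
from typing import Dict, List


def token_level_cs(sent: List[str], lexicon: Dict[str, str], p: float = 0.3) -> List[str]:
    """Stage 1: randomly swap individual tokens for target-language translations."""
    return [lexicon[tok] if tok in lexicon and random.random() < p else tok
            for tok in sent]


def sentence_level_cs(src: List[List[str]], tgt: List[List[str]]) -> List[List[str]]:
    """Stage 2: interleave whole sentences from the two languages."""
    mixed: List[List[str]] = []
    for s, t in zip(src, tgt):
        mixed.extend([s, t])
    return mixed


def build_curriculum(src: List[List[str]], tgt: List[List[str]],
                     lexicon: Dict[str, str]) -> List[List[List[str]]]:
    """Return the three training stages in curriculum order."""
    stage1 = [token_level_cs(s, lexicon) for s in src]
    stage2 = sentence_level_cs(src, tgt)
    stage3 = tgt  # Stage 3: monolingual corpora in the transfer-target language
    return [stage1, stage2, stage3]
```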