Language-specific Characteristic Assistance for Code-switching Speech Recognition
- URL: http://arxiv.org/abs/2206.14580v1
- Date: Wed, 29 Jun 2022 13:39:51 GMT
- Title: Language-specific Characteristic Assistance for Code-switching Speech Recognition
- Authors: Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv,
Yuqin Lin, Jianwu Dang
- Abstract summary: The dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for code-switching speech recognition.
Existing methods impose no language constraints on the LSEs and underutilize the language-specific knowledge of the pre-trained language-specific models (LSMs).
We propose a language-specific characteristic assistance (LSCA) method to mitigate the above problems.
- Score: 42.32330582682405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dual-encoder structure successfully utilizes two language-specific
encoders (LSEs) for code-switching speech recognition. Because the LSEs are
initialized from two pre-trained language-specific models (LSMs), the dual-encoder
structure can exploit sufficient monolingual data and capture the attributes of
each individual language. However, existing methods impose no language constraints
on the LSEs and underutilize the language-specific knowledge of the LSMs. In this
paper, we propose a language-specific characteristic assistance (LSCA) method to
mitigate the above problems. Specifically, during training we introduce two
language-specific losses as language constraints and generate corresponding
language-specific targets for them. During decoding, we take the decoding
abilities of the LSMs into account by combining the output probabilities of the
two LSMs and the mixture model to obtain the final predictions. Experiments show
that either the training or the decoding method of LSCA improves the model's
performance. Furthermore, combining the training and decoding methods of LSCA
yields up to a 15.4% relative error reduction on the code-switching test set.
Moreover, with our method the system can handle code-switching speech recognition
well from two pre-trained LSMs, without extra shared parameters or even retraining.
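As a rough illustration of the two ideas in the abstract, the sketch below shows (a) building a language-specific training target by masking out the other language's tokens, and (b) interpolating the output distributions of the mixture model and the two LSMs at decoding time. This is a minimal sketch under assumed details, not the paper's implementation: the ignore index, the masking rule, the interpolation weights, and all function names are hypothetical.

```python
# Minimal sketch of the two LSCA ideas, assuming a Mandarin-English
# dual-encoder model. Names and weights are illustrative only.
import torch

IGNORE_ID = -1  # hypothetical placeholder for masked-out label positions


def make_language_specific_target(tokens, lang_ids, keep_lang):
    """Training: keep only the tokens of one language so that the
    corresponding language-specific loss constrains only its own encoder
    (how the paper fills the masked positions is an assumption here)."""
    return [t if l == keep_lang else IGNORE_ID for t, l in zip(tokens, lang_ids)]


def combine_output_probs(p_mix, p_zh, p_en, w_mix=0.6, w_zh=0.2, w_en=0.2):
    """Decoding: interpolate the output distributions of the mixture model
    and the two pre-trained LSMs (weights are illustrative)."""
    return w_mix * p_mix + w_zh * p_zh + w_en * p_en


# Toy usage: a 3-token code-switched label sequence with per-token language tags.
tokens = [101, 102, 2005]          # hypothetical vocabulary indices
lang_ids = ["zh", "zh", "en"]
print(make_language_specific_target(tokens, lang_ids, keep_lang="zh"))
# -> [101, 102, -1]

# Toy usage: fuse three per-frame posteriors over a 4-symbol vocabulary.
p_mix = torch.tensor([0.7, 0.1, 0.1, 0.1])
p_zh = torch.tensor([0.6, 0.2, 0.1, 0.1])
p_en = torch.tensor([0.5, 0.1, 0.2, 0.2])
print(combine_output_probs(p_mix, p_zh, p_en))
# -> tensor([0.6400, 0.1200, 0.1200, 0.1200])
```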
Related papers
- Code-mixed LLM: Improve Large Language Models' Capability to Handle Code-Mixing through Reinforcement Learning from AI Feedback [11.223762031003671]
Code-mixing introduces unique challenges in daily life, such as syntactic mismatches and semantic blending.
Large language models (LLMs) have revolutionized the field of natural language processing (NLP) by offering unprecedented capabilities in understanding human languages.
We propose to improve the multilingual LLMs' ability to understand code-mixing through reinforcement learning from human feedback (RLHF) and code-mixed machine translation tasks.
arXiv Detail & Related papers (2024-11-13T22:56:00Z)
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- Large Language Models are Interpretable Learners [53.56735770834617]
In this paper, we show a combination of Large Language Models (LLMs) and symbolic programs can bridge the gap between expressiveness and interpretability.
The pretrained LLM with natural language prompts provides a massive set of interpretable modules that can transform raw input into natural language concepts.
As the knowledge learned by the LLM-based symbolic program (LSP) is a combination of natural language descriptions and symbolic rules, it is easily transferable to humans (interpretable) and to other LLMs.
arXiv Detail & Related papers (2024-06-25T02:18:15Z)
- Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting [45.161909551392085]
We introduce an encoder prompting technique within the self-conditioned CTC framework, enabling language-specific adaptation of the CTC model in a zero-shot manner.
Our method has been shown to reduce errors by 28% on average, and by 41% on low-resource languages.
arXiv Detail & Related papers (2024-06-18T13:38:58Z)
- Speculative Contrastive Decoding [55.378200871224074]
Large language models (LLMs) exhibit exceptional performance on language tasks, yet their auto-regressive inference is limited by high computational requirements and is sub-optimal due to exposure bias.
Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding (SCD), a straightforward yet powerful decoding approach.
arXiv Detail & Related papers (2023-11-15T14:15:30Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- LAE: Language-Aware Encoder for Monolingual and Multilingual ASR [87.74794847245536]
A novel language-aware encoder (LAE) architecture is proposed to handle both monolingual and multilingual scenarios by disentangling language-specific information.
Experiments conducted on Mandarin-English code-switched speech suggest that the proposed LAE is capable of discriminating between different languages at the frame level.
arXiv Detail & Related papers (2022-06-05T04:03:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.