Internal Language Model Estimation based Adaptive Language Model Fusion
for Domain Adaptation
- URL: http://arxiv.org/abs/2211.00968v1
- Date: Wed, 2 Nov 2022 09:15:20 GMT
- Authors: Rao Ma, Xiaobo Wu, Jin Qiu, Yanan Qin, Haihua Xu, Peihao Wu, Zejun Ma
- Abstract summary: We propose an adaptive LM fusion approach called internal language model estimation based adaptive domain adaptation (ILME-ADA).
We demonstrate the efficacy of the proposed ILME-ADA method with both RNN-T and LAS modeling frameworks employing neural network and n-gram LMs as ELMs respectively on two domain specific (target) test sets.
- Score: 12.239557608053156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The deployment environment of an ASR model is ever-changing, and the
incoming speech can switch across different domains during a session. This poses a
challenge for effective domain adaptation when only target-domain text data is
available; our objective is to clearly improve performance on the target
domain while degrading performance on the general domain as little as possible.
In this paper, we propose an adaptive LM fusion approach called internal
language model estimation based adaptive domain adaptation (ILME-ADA). To
realize ILME-ADA, an interpolated log-likelihood score is calculated
based on the maximum of the scores from the internal LM and the external LM
(ELM). We demonstrate the efficacy of the proposed ILME-ADA method
with both RNN-T and LAS modeling frameworks, employing neural network and n-gram
LMs as ELMs respectively, on two domain-specific (target) test sets. The
proposed method achieves significantly better performance on the target test
sets with minimal performance degradation on the general test set,
compared with both shallow and ILME-based LM fusion methods.
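The three fusion styles named in the abstract can be compared with a small per-token scoring sketch. This is only an illustration under stated assumptions: the abstract does not give the exact ILME-ADA formula, so `adaptive_fusion` below is one plausible reading of "the maximum of the scores from the internal LM and the external LM", and all function names, parameter names, and default weights are hypothetical.

```python
def shallow_fusion(asr_logp: float, elm_logp: float,
                   elm_weight: float = 0.3) -> float:
    """Shallow fusion: add the weighted external LM (ELM) log-probability."""
    return asr_logp + elm_weight * elm_logp


def ilme_fusion(asr_logp: float, ilm_logp: float, elm_logp: float,
                ilm_weight: float = 0.3, elm_weight: float = 0.3) -> float:
    """ILME-based fusion: subtract the estimated internal LM (ILM) score,
    then add the weighted ELM score."""
    return asr_logp - ilm_weight * ilm_logp + elm_weight * elm_logp


def adaptive_fusion(asr_logp: float, ilm_logp: float, elm_logp: float,
                    ilm_weight: float = 0.3, elm_weight: float = 0.3) -> float:
    """Adaptive variant (assumed form, not taken from the paper): replace the
    ELM term with the larger of the ILM and ELM log-probabilities."""
    best_lm_logp = max(ilm_logp, elm_logp)
    return asr_logp - ilm_weight * ilm_logp + elm_weight * best_lm_logp


if __name__ == "__main__":
    # Token the ILM prefers (general-domain-like): with equal weights the
    # ILM terms cancel and the ASR score is left unchanged.
    print(adaptive_fusion(asr_logp=-1.0, ilm_logp=-2.0, elm_logp=-5.0))
    # Token the ELM prefers (target-domain-like): the score matches
    # ILME-based fusion.
    print(adaptive_fusion(asr_logp=-1.0, ilm_logp=-5.0, elm_logp=-2.0))
    print(ilme_fusion(asr_logp=-1.0, ilm_logp=-5.0, elm_logp=-2.0))
```

Under this assumed form with equal weights, tokens the internal LM already scores higher keep their original ASR score, while tokens the ELM scores higher are treated exactly as in ILME-based fusion. That behavior is consistent with the abstract's stated goal of improving the target domain with little degradation on the general domain, but the paper itself should be consulted for the actual formulation.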
Related papers
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment [88.56809269990625]
We propose a bilevel objective optimistically biased towards potentially high-reward responses to actively explore out-of-distribution regions.
Our experimental results demonstrate that when fine-tuned on Zephyr-7B-SFT and Llama-3-8B-Instruct models, Self-Exploring Language Models (SELM) significantly boosts the performance on instruction-following benchmarks.
(2024-05-29T17:59:07Z)
- Unified Language-driven Zero-shot Domain Adaptation [55.64088594551629]
Unified Language-driven Zero-shot Domain Adaptation (ULDA) is a novel task setting.
It enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.
(2024-04-10T16:44:11Z)
- Dynamic Domain Discrepancy Adjustment for Active Multi-Domain Adaptation [3.367755441623275]
Multi-source unsupervised domain adaptation (MUDA) aims to transfer knowledge from related source domains to an unlabeled target domain.
We propose a novel approach called Dynamic Domain Discrepancy Adjustment for Active Multi-Domain Adaptation (D3AAMDA).
This mechanism controls the alignment level of features between each source domain and the target domain, effectively leveraging the local advantageous feature information within the source domains.
(2023-07-26T09:40:19Z)
- Divide and Adapt: Active Domain Adaptation via Customized Learning [56.79144758380419]
We present Divide-and-Adapt (DiaNA), a new ADA framework that partitions the target instances into four categories with stratified transferable properties.
With a novel data subdivision protocol based on uncertainty and domainness, DiaNA can accurately recognize the most gainful samples.
Thanks to the "divide-and-adapt" spirit, DiaNA can handle data with large variations of domain gap.
(2023-07-21T14:37:17Z)
- Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation [14.840612036671734]
Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models.
We propose a novel ILME technique for CTC-based ASR models.
Our method iteratively masks the audio timesteps to estimate a pseudo log-likelihood of the internal LM.
(2023-05-05T20:35:42Z)
- IDA: Informed Domain Adaptive Semantic Segmentation [51.12107564372869]
We propose a Domain Informed Adaptation (IDA) model, a self-training framework that mixes the data based on class-level segmentation performance.
In our IDA model, the class-level performance is tracked by an expected confidence score (ECS), and we then use a dynamic schedule to determine the mixing ratio for data in different domains.
Our proposed method is able to outperform the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to City
(2023-03-05T18:16:34Z)
- On Language Model Integration for RNN Transducer based Speech Recognition [49.84285563767935]
We study various ILM correction-based LM integration methods formulated in a common RNN-T framework.
We provide a decoding interpretation on two major reasons for performance improvement with ILM correction.
We also propose an exact-ILM training framework by extending the proof given in the hybrid autoregressive transducer.
(2021-10-13T16:30:46Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
(2020-01-22T16:42:06Z)