Reference Free Domain Adaptation for Translation of Noisy Questions with
Question Specific Rewards
- URL: http://arxiv.org/abs/2310.15259v1
- Date: Mon, 23 Oct 2023 18:08:01 GMT
- Title: Reference Free Domain Adaptation for Translation of Noisy Questions with
Question Specific Rewards
- Authors: Baban Gain, Ramakrishna Appicharla, Soumya Chennabasavaraj, Nikesh
Garera, Asif Ekbal, Muthusamy Chelliah
- Abstract summary: Translating questions with Neural Machine Translation poses additional challenges in noisy environments.
We propose a training methodology that fine-tunes the NMT system using only source-side data.
Our approach balances adequacy and fluency through a loss function that combines BERTScore and Masked Language Model (MLM) Score.
- Score: 22.297433705607464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Community Question-Answering (CQA) portals serve as a valuable tool for
helping users within an organization. However, making them accessible to
non-English-speaking users continues to be a challenge. Translating questions
can broaden the community's reach, benefiting individuals with similar
inquiries in various languages. Translating questions using Neural Machine
Translation (NMT) poses more challenges, especially in noisy environments,
where the grammatical correctness of the questions is not monitored. These
questions may be phrased as statements by non-native speakers, with incorrect
subject-verb order and sometimes even missing question marks. Creating a
synthetic parallel corpus from such data is also difficult due to its noisy
nature. To address this issue, we propose a training methodology that
fine-tunes the NMT system using only source-side data. Our approach balances
adequacy and fluency with a loss function that combines BERTScore and
Masked Language Model (MLM) Score. Our method surpasses the conventional
Maximum Likelihood Estimation (MLE) based fine-tuning approach, which relies
on synthetic target data, by 1.9 BLEU. Our model remains robust when noise is
added to the training data, still achieving a 1.1 BLEU improvement along with
large gains on the TER and BLEURT metrics. Our proposed
methodology is model-agnostic and is only necessary during the training phase.
We make the code and datasets publicly available at
https://www.iitp.ac.in/~ai-nlp-ml/resources.html#DomainAdapt to facilitate
further research.
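As a rough illustration of the training signal described in the abstract, below is a minimal Python sketch of a reference-free reward that combines a cross-lingual BERTScore term (adequacy of the hypothesis against the source question) with an MLM pseudo-log-likelihood term (fluency of the hypothesis). The scoring model, the interpolation weight alpha, and the use of the source as the BERTScore reference are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch only: reference-free reward = alpha * adequacy + (1 - alpha) * fluency.
# The model choice and `alpha` are assumptions; the two terms also live on
# different scales (an F1 in [0, 1] vs. a log-probability), so a real
# implementation would need to calibrate or normalise them.
import torch
from bert_score import score as bertscore
from transformers import AutoModelForMaskedLM, AutoTokenizer

MLM_NAME = "bert-base-multilingual-cased"  # assumed multilingual scorer
tok = AutoTokenizer.from_pretrained(MLM_NAME)
mlm = AutoModelForMaskedLM.from_pretrained(MLM_NAME).eval()

def mlm_fluency(sentence: str) -> float:
    """Pseudo-log-likelihood: mask each token in turn and score the original."""
    ids = tok(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    with torch.no_grad():
        for i in range(1, ids.size(0) - 1):            # skip [CLS] and [SEP]
            masked = ids.clone()
            masked[i] = tok.mask_token_id
            logits = mlm(input_ids=masked.unsqueeze(0)).logits[0, i]
            total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total / max(ids.size(0) - 2, 1)             # length-normalised

def reference_free_reward(source: str, hypothesis: str, alpha: float = 0.5) -> float:
    """Adequacy from cross-lingual BERTScore against the source (no reference),
    fluency from the MLM score of the hypothesis."""
    _, _, f1 = bertscore([hypothesis], [source], model_type=MLM_NAME)
    return alpha * f1.item() + (1.0 - alpha) * mlm_fluency(hypothesis)

# Toy usage: score an English hypothesis against a Hindi source question.
print(reference_free_reward("क्या यह फ़ोन वारंटी में है", "Is this phone under warranty?"))
```

A reward of this shape can drive an update from source-side questions alone, which is consistent with the paper's goal of avoiding synthetic target references.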
Related papers
- LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation [43.26446958873554]
Recent advancements in large language models (LLMs) have shown promising results in multilingual translation even with limited bilingual supervision.
LANDeRMT is a framework that selectively fine-tunes LLMs for Machine Translation with diverse translation training data.
arXiv Detail & Related papers (2024-09-29T02:39:42Z)
- A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross-Lingual Sentence Representations [0.4499833362998489]
This study focuses on the English-Marathi language pair, where existing datasets are notably noisy.
To mitigate the impact of data quality issues, we propose a data filtering approach based on cross-lingual sentence representations (a minimal filtering sketch appears after this list).
Results demonstrate a significant improvement in translation quality over the baseline after filtering with IndicSBERT.
arXiv Detail & Related papers (2024-09-04T13:49:45Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation [48.80125962015044]
We investigate the problem of performing a generative task (i.e., summarization) in a target language when labeled data is only available in English.
We find that parameter-efficient adaptation provides gains over standard fine-tuning when transferring between less-related languages.
Our methods can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.
arXiv Detail & Related papers (2022-05-25T10:41:34Z)
- LaMDA: Language Models for Dialog Applications [75.75051929981933]
LaMDA is a family of Transformer-based neural language models specialized for dialog.
Fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements.
arXiv Detail & Related papers (2022-01-20T15:44:37Z)
- Multilingual Unsupervised Neural Machine Translation with Denoising Adapters [77.80790405710819]
We consider the problem of multilingual unsupervised machine translation, translating to and from languages that only have monolingual data.
For this problem, the standard procedure so far for leveraging monolingual data has been back-translation, which is computationally costly and hard to tune.
In this paper we propose instead to use denoising adapters, adapter layers with a denoising objective, on top of pre-trained mBART-50.
arXiv Detail & Related papers (2021-10-20T10:18:29Z)
- When Does Translation Require Context? A Data-driven, Multilingual Exploration [71.43817945875433]
Proper handling of discourse significantly contributes to the quality of machine translation (MT).
Recent works in context-aware MT attempt to target a small set of discourse phenomena during evaluation.
We develop the Multilingual Discourse-Aware benchmark, a series of taggers that identify and evaluate model performance on discourse phenomena.
arXiv Detail & Related papers (2021-09-15T17:29:30Z)
- Majority Voting with Bidirectional Pre-translation For Bitext Retrieval [2.580271290008534]
A popular approach has been to mine so-called "pseudo-parallel" sentences from paired documents in two languages.
In this paper, we outline some problems with current methods, propose computationally economical solutions to those problems, and demonstrate success with novel methods.
We make the code and data used for our experiments publicly available.
arXiv Detail & Related papers (2021-03-10T22:24:01Z)
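As referenced in the data selection entry above, here is a hypothetical sketch of similarity-based corpus filtering with a cross-lingual sentence encoder. The checkpoint name and threshold are assumptions for illustration; the cited work uses IndicSBERT, but its exact setup may differ.

```python
# Hypothetical sketch of cross-lingual similarity filtering for noisy parallel
# data. The model checkpoint and the threshold are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("l3cube-pune/indic-sentence-similarity-sbert")  # assumed

def filter_pairs(pairs, threshold=0.75):
    """Keep (source, target) pairs whose cross-lingual cosine similarity is at
    least `threshold`; likely-misaligned or noisy pairs fall below it."""
    src = encoder.encode([s for s, _ in pairs], convert_to_tensor=True)
    tgt = encoder.encode([t for _, t in pairs], convert_to_tensor=True)
    sims = util.cos_sim(src, tgt).diagonal()
    return [pair for pair, sim in zip(pairs, sims) if sim.item() >= threshold]

# Toy pairs (English-Hindi): the second target does not match its source.
data = [("The phone has a one-year warranty.", "फ़ोन पर एक साल की वारंटी है।"),
        ("The delivery was late.", "यह उत्पाद बहुत अच्छा है।")]
print(filter_pairs(data))  # expect only the first pair to survive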