Narrowing the Gap between Zero- and Few-shot Machine Translation by
Matching Styles
- URL: http://arxiv.org/abs/2311.02310v1
- Date: Sat, 4 Nov 2023 03:18:45 GMT
- Title: Narrowing the Gap between Zero- and Few-shot Machine Translation by
Matching Styles
- Authors: Weiting Tan, Haoran Xu, Lingfeng Shen, Shuyue Stella Li, Kenton
Murray, Philipp Koehn, Benjamin Van Durme, Yunmo Chen
- Abstract summary: Large language models have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning.
In this paper, we investigate the factors contributing to this gap and find that it can largely be closed (by about 70%) by matching the writing styles of the target corpus.
- Score: 53.92189950211852
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models trained primarily in a monolingual setting have
demonstrated their ability to generalize to machine translation using zero- and
few-shot examples with in-context learning. However, even though zero-shot
translations are relatively good, a discernible gap remains between their
performance and that of the few-shot setting. In this paper, we investigate
the factors contributing to this gap and find that it can largely be closed
(by about 70%) by matching the writing styles of the target corpus.
Additionally, we explore potential approaches to enhance zero-shot baselines
without the need for parallel demonstration examples, providing valuable
insights into how these methods contribute to improving translation metrics.
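To make the setup concrete, below is a minimal sketch of the two prompting regimes the paper compares; the prompt wording and the German-to-English pair are illustrative assumptions rather than the paper's exact templates. The relevant property of the few-shot variant is that its reference translations expose the target corpus's writing style.

```python
# Minimal sketch (not the paper's exact prompt format) of zero- vs.
# few-shot translation prompting for a decoder-only LLM.

def zero_shot_prompt(src: str, src_lang: str = "German",
                     tgt_lang: str = "English") -> str:
    return (f"Translate the following {src_lang} sentence into {tgt_lang}.\n"
            f"{src_lang}: {src}\n{tgt_lang}:")

def few_shot_prompt(src: str, demos: list[tuple[str, str]],
                    src_lang: str = "German", tgt_lang: str = "English") -> str:
    # demos: (source, reference) pairs; the references carry the target
    # corpus's writing style (register, punctuation, casing), which the
    # paper finds accounts for most of the zero-/few-shot gap.
    blocks = [f"Translate the following {src_lang} sentences into {tgt_lang}."]
    for s, t in demos:
        blocks.append(f"{src_lang}: {s}\n{tgt_lang}: {t}")
    blocks.append(f"{src_lang}: {src}\n{tgt_lang}:")
    return "\n\n".join(blocks)
```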
Related papers
- The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights [108.40766216456413]
We propose a question alignment framework to bridge the gap between large language models' English and non-English performance.
Experiment results show it can boost multilingual performance across diverse reasoning scenarios, model families, and sizes.
We analyze the representation space, generated responses, and data scales, and reveal how question translation training strengthens language alignment within LLMs.
arXiv Detail & Related papers (2024-05-02T14:49:50Z)
- How Far Can 100 Samples Go? Unlocking Overall Zero-Shot Multilingual
Translation via Tiny Multi-Parallel Data [10.286714403840355]
A common, albeit resource-consuming, solution is to add as many related translation directions as possible to the training corpus.
We show that for an English-centric model, surprisingly large zero-shot improvements can be achieved by simply fine-tuning with a very small amount of multi-parallel data.
arXiv Detail & Related papers (2024-01-22T23:55:00Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
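As a rough illustration of the alignment objective's shape (the paper's exact latent-variable formulation is not reproduced here), a log-domain Sinkhorn iteration can compute an entropic OT cost between two batches of latent vectors, which could then serve as an auxiliary loss:

```python
import math
import torch

def sinkhorn_cost(x: torch.Tensor, y: torch.Tensor,
                  eps: float = 0.1, n_iters: int = 100) -> torch.Tensor:
    """Entropic OT cost between latent batches x:(n,d) and y:(m,d) with
    uniform marginals; log-domain updates for numerical stability."""
    n, m = x.size(0), y.size(0)
    C = torch.cdist(x, y, p=2) ** 2                 # squared-L2 cost matrix
    log_mu = torch.full((n,), -math.log(n), device=x.device)
    log_nu = torch.full((m,), -math.log(m), device=x.device)
    f = torch.zeros(n, device=x.device)             # dual potentials
    g = torch.zeros(m, device=x.device)
    for _ in range(n_iters):
        f = -eps * torch.logsumexp((g[None, :] - C) / eps + log_nu[None, :], dim=1)
        g = -eps * torch.logsumexp((f[:, None] - C) / eps + log_mu[:, None], dim=0)
    P = torch.exp((f[:, None] + g[None, :] - C) / eps
                  + log_mu[:, None] + log_nu[None, :])   # transport plan
    return (P * C).sum()                                 # differentiable OT cost
```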
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Prompting Large Language Model for Machine Translation: A Case Study [87.88120385000666]
We offer a systematic study on prompting strategies for machine translation.
We examine factors for prompt template and demonstration example selection.
We explore the use of monolingual data and the feasibility of cross-lingual, cross-domain, and sentence-to-document transfer learning.
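One concrete instance of demonstration-example selection, of the kind such studies compare against random sampling (a sketch; the embedder and data are placeholders, not the paper's setup):

```python
import numpy as np

def select_demos(test_vec: np.ndarray, pool_vecs: np.ndarray,
                 pool_pairs: list[tuple[str, str]], k: int = 4):
    """Return the k (source, translation) pairs whose source embeddings
    are most cosine-similar to the test source embedding."""
    pool = pool_vecs / np.linalg.norm(pool_vecs, axis=1, keepdims=True)
    query = test_vec / np.linalg.norm(test_vec)
    top = np.argsort(-(pool @ query))[:k]
    return [pool_pairs[i] for i in top]
```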
arXiv Detail & Related papers (2023-01-17T18:32:06Z)
- Aligned Weight Regularizers for Pruning Pretrained Neural Networks [6.000551438232907]
We show that there is a clear performance discrepancy in magnitude-based pruning when comparing standard supervised learning to the zero-shot setting.
We propose two weight regularizers that aim to maximize the alignment between units of pruned and unpruned networks.
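A hedged sketch of what such an alignment regularizer could look like; the paper's two regularizers are not reproduced exactly here, and this per-unit cosine penalty against the unpruned reference weights is just one plausible form:

```python
import torch
import torch.nn.functional as F

def unit_alignment_penalty(w_pruned: torch.Tensor,
                           w_ref: torch.Tensor) -> torch.Tensor:
    """w_pruned, w_ref: (units, fan_in) weights of the same layer.
    Minimizing the mean (1 - cosine similarity) pulls each surviving
    unit toward its counterpart in the unpruned network."""
    return (1.0 - F.cosine_similarity(w_pruned, w_ref, dim=1)).mean()

# Hypothetical use inside a training loop:
#   total = task_loss + lam * unit_alignment_penalty(w, w0.detach())
```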
arXiv Detail & Related papers (2022-04-04T11:06:42Z)
- On The Ingredients of an Effective Zero-shot Semantic Parser [95.01623036661468]
We analyze zero-shot learning by paraphrasing training examples of canonical utterances and programs from a grammar.
We propose bridging these gaps using improved grammars, stronger paraphrasers, and efficient learning methods.
Our model achieves strong performance on two semantic parsing benchmarks (Scholar, Geo) with zero labeled data.
arXiv Detail & Related papers (2021-10-15T21:41:16Z)
- Rethinking Zero-shot Neural Machine Translation: From a Perspective of
Latent Variables [28.101782382170306]
We introduce a denoising autoencoder objective based on the pivot language into the traditional training objective to improve translation accuracy on zero-shot directions.
We demonstrate that the proposed method effectively eliminates the spurious correlations and significantly outperforms state-of-the-art methods.
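The shape of the combined objective, sketched below with an assumed BART-style noise function and a hypothetical seq2seq interface (the paper's exact noise model and loss weighting are not given here):

```python
import random

def add_noise(tokens: list[str], drop_p: float = 0.1,
              max_shuffle_dist: int = 3) -> list[str]:
    """Simple denoising-autoencoder corruption: random token dropping
    followed by a bounded local shuffle."""
    kept = [t for t in tokens if random.random() > drop_p] or tokens[:1]
    keys = [i + random.uniform(0, max_shuffle_dist) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]

# Per batch (hypothetical model interface):
#   loss_mt  = model(src=en_tokens,            tgt=xx_tokens)  # supervised pair
#   loss_dae = model(src=add_noise(en_tokens), tgt=en_tokens)  # denoise the pivot
#   loss = loss_mt + lambda_dae * loss_dae
```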
arXiv Detail & Related papers (2021-09-10T07:18:53Z)
- Event Guided Denoising for Multilingual Relation Learning [2.4192504570921627]
We present a methodology for collecting high-quality training data for relation extraction from unlabeled text.
Our approach exploits the predictable distributional structure of date-marked news articles to build a denoised corpus.
We show that a smaller multilingual encoder trained on this corpus performs comparably to the current state-of-the-art.
arXiv Detail & Related papers (2020-12-04T17:11:04Z)
- Subword Segmentation and a Single Bridge Language Affect Zero-Shot
Neural Machine Translation [36.4055239280145]
We investigate zero-shot performance of a multilingual EN↔{FR,CS,DE,FI} system trained on WMT data.
We observe a bias towards copying the source in zero-shot translation, and investigate how the choice of subword segmentation affects this bias.
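One simple way to quantify that copying bias (a sketch; the threshold and similarity measure are assumptions, not the paper's metric):

```python
from difflib import SequenceMatcher

def copy_rate(sources: list[str], hypotheses: list[str],
              threshold: float = 0.9) -> float:
    """Fraction of hypotheses whose character-level similarity to the
    source exceeds the threshold, i.e. likely copies, not translations."""
    flags = [SequenceMatcher(None, s, h).ratio() >= threshold
             for s, h in zip(sources, hypotheses)]
    return sum(flags) / max(len(flags), 1)
```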
arXiv Detail & Related papers (2020-11-03T13:45:54Z)
- Improving Massively Multilingual Neural Machine Translation and
Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
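A sketch of the random online backtranslation loop under an assumed model interface (`model.translate` and the loss call are hypothetical):

```python
import random
import torch

def robt_step(model, batch, languages):
    """One training step: the original supervised pair plus a synthetic
    pair for a randomly chosen, otherwise-unseen direction."""
    x, y, src, tgt = batch
    z = random.choice([l for l in languages if l not in (src, tgt)])
    with torch.no_grad():  # back-translate the target side with the current model
        y_z = model.translate(y, src_lang=tgt, tgt_lang=z)
    loss = model(src=x, src_lang=src, tgt=y, tgt_lang=tgt)          # original direction
    loss = loss + model(src=y_z, src_lang=z, tgt=y, tgt_lang=tgt)   # synthetic z->tgt pair
    return loss
```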
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.