Did Translation Models Get More Robust Without Anyone Even Noticing?
- URL: http://arxiv.org/abs/2403.03923v1
- Date: Wed, 6 Mar 2024 18:33:51 GMT
- Title: Did Translation Models Get More Robust Without Anyone Even Noticing?
- Authors: Ben Peters and André F.T. Martins
- Abstract summary: We show that multilingual MT models and large language models (LLMs) are far more robust to many kinds of noise than previous models.
Similar trends hold for social media translation experiments -- LLMs are more robust to social media text.
- Score: 11.342084260983668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural machine translation (MT) models achieve strong results across a
variety of settings, but it is widely believed that they are highly sensitive
to "noisy" inputs, such as spelling errors, abbreviations, and other formatting
issues. In this paper, we revisit this insight in light of recent multilingual
MT models and large language models (LLMs) applied to machine translation.
Somewhat surprisingly, we show through controlled experiments that these models
are far more robust to many kinds of noise than previous models, even when they
perform similarly on clean data. This is notable because, even though LLMs have
more parameters and more complex training processes than past models, none of
the open ones we consider use any techniques specifically designed to encourage
robustness. Next, we show that similar trends hold for social media translation
experiments -- LLMs are more robust to social media text. We include an
analysis of the circumstances in which source correction techniques can be used
to mitigate the effects of noise. Altogether, we show that robustness to many
types of noise has increased.
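The controlled experiments described above hinge on perturbing clean source text with synthetic noise and measuring how much translation quality degrades relative to clean inputs. The sketch below illustrates that kind of setup; the noise functions, the `translate` stub, and the use of sacrebleu are illustrative assumptions, not the paper's actual pipeline.

```python
import random
import sacrebleu  # pip install sacrebleu

def add_char_noise(sentence: str, p: float = 0.1) -> str:
    """Corrupt a sentence with simple character-level 'typos':
    randomly swap adjacent characters or drop a character."""
    chars, out, i = list(sentence), [], 0
    while i < len(chars):
        r = random.random()
        if r < p / 2 and i + 1 < len(chars):   # swap with next character
            out.extend([chars[i + 1], chars[i]])
            i += 2
        elif r < p:                            # drop this character
            i += 1
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

def robustness_gap(translate, sources, references):
    """Compare BLEU on clean vs. noised inputs for any `translate` callable
    (hypothetical stub standing in for an MT model or LLM)."""
    clean_hyp = [translate(s) for s in sources]
    noisy_hyp = [translate(add_char_noise(s)) for s in sources]
    clean_bleu = sacrebleu.corpus_bleu(clean_hyp, [references]).score
    noisy_bleu = sacrebleu.corpus_bleu(noisy_hyp, [references]).score
    return clean_bleu - noisy_bleu  # smaller gap = more robust model
```

A smaller clean-vs-noisy gap at comparable clean BLEU is the kind of evidence the abstract points to when it says newer models are more robust even when they perform similarly on clean data.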
Related papers
- Improving Language Models Trained on Translated Data with Continual Pre-Training and Dictionary Learning Analysis [3.16714407449467]
We investigate the role of translation and synthetic data in training language models.
We translate TinyStories, a dataset of 2.2M short stories for 3-4 year old children, from English to Arabic using the open NLLB-3B MT model.
To rectify these issues, we pre-train the models with a small dataset of synthesized high-quality Arabic stories.
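As a rough illustration of the translation step described above, an open NLLB checkpoint can be driven through the Hugging Face `transformers` translation pipeline; the exact checkpoint name and generation settings are assumptions here, since the summary does not specify them.

```python
from transformers import pipeline

# Assumed checkpoint: the ~3.3B-parameter NLLB-200 model on the Hugging Face Hub.
# NLLB uses FLORES-200 language codes ("eng_Latn" for English, "arb_Arab" for Arabic).
translator = pipeline(
    "translation",
    model="facebook/nllb-200-3.3B",
    src_lang="eng_Latn",
    tgt_lang="arb_Arab",
)

story = "Once upon a time, a little cat found a red ball and played with it all day."
print(translator(story, max_length=256)[0]["translation_text"])
```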
arXiv Detail & Related papers (2024-05-23T07:53:04Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate denoising model, our method offers significantly better efficiency and flexibility.
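A minimal sketch of the denoise-then-predict idea is below, assuming `llm` is any text-in/text-out callable; the prompts, masking rate, and majority-vote aggregation are illustrative choices, not the paper's exact procedure.

```python
import random

def self_denoised_predict(llm, text, task_prompt, n_samples=5, mask_rate=0.3):
    """Randomized-smoothing-style prediction: perturb the input, let the LLM
    itself reconstruct (denoise) each copy, predict on the denoised copies,
    and aggregate by majority vote."""
    votes = []
    for _ in range(n_samples):
        words = text.split()
        masked = " ".join(w if random.random() > mask_rate else "<mask>" for w in words)
        denoised = llm(f"Replace every <mask> with a plausible word:\n{masked}")
        votes.append(llm(f"{task_prompt}\n{denoised}"))
    return max(set(votes), key=votes.count)
```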
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models [39.37532848489779]
We propose Error Norm Truncation (ENT), a robust enhancement method to the standard training objective that truncates noisy data.
We show that ENT improves generation quality over standard training and previous soft and hard truncation methods.
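One way to read "truncates noisy data" is as per-token loss truncation based on the error norm between the model's predicted distribution and the one-hot target; the sketch below follows that reading and is not the authors' implementation (threshold and truncation rule are assumptions).

```python
import torch
import torch.nn.functional as F

def ent_hard_loss(logits, targets, threshold=1.0, ignore_index=-100):
    """Hard truncation sketch: zero out the loss of tokens whose prediction
    error norm exceeds a threshold.
    logits: (batch, seq, vocab); targets: (batch, seq)."""
    probs = F.softmax(logits, dim=-1)
    one_hot = F.one_hot(targets.clamp(min=0), num_classes=logits.size(-1)).float()
    error_norm = torch.linalg.vector_norm(probs - one_hot, dim=-1)  # L2 per token
    keep = (error_norm < threshold) & (targets != ignore_index)
    token_loss = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction="none", ignore_index=ignore_index
    )
    return (token_loss * keep).sum() / keep.sum().clamp(min=1)
```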
arXiv Detail & Related papers (2023-10-02T01:30:27Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- TIM: Teaching Large Language Models to Translate with Comparison [78.66926087162672]
We propose a novel framework that teaches LLMs translation using examples in comparison.
Our approach involves presenting the model with examples of correct and incorrect translations and using a preference loss to guide the model's learning.
Our findings offer a new perspective on fine-tuning LLMs for translation tasks and provide a promising solution for generating high-quality translations.
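As an illustration of the preference signal described above, a margin-based loss over the sequence scores of a correct and an incorrect translation might look like the following; this is a sketch of the general idea, not TIM's exact objective.

```python
import torch
import torch.nn.functional as F

def preference_loss(logp_correct: torch.Tensor,
                    logp_incorrect: torch.Tensor,
                    margin: float = 1.0) -> torch.Tensor:
    """Hinge-style preference loss: push the log-probability the model assigns
    to the correct translation above that of the incorrect one by a margin."""
    return F.relu(margin - (logp_correct - logp_incorrect)).mean()

# Usage sketch: logp_correct / logp_incorrect would be per-example sequence
# log-probabilities obtained by summing token log-probs of each reference
# translation under the model being fine-tuned.
```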
arXiv Detail & Related papers (2023-07-10T08:15:40Z)
- An Investigation of Noise in Morphological Inflection [21.411766936034]
We investigate the types of noise encountered within a pipeline for truly unsupervised morphological paradigm completion.
We compare the effect of different types of noise on multiple state-of-the-art inflection models.
We propose a novel character-level masked language modeling (CMLM) pretraining objective and explore its impact on the models' resistance to noise.
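A small sketch of how character-level masked-language-modeling training pairs could be built for inflection data follows; the masking rate and mask symbol are illustrative assumptions rather than the paper's settings.

```python
import random

def make_cmlm_example(word: str, mask_rate: float = 0.2, mask_token: str = "#"):
    """Build one character-level MLM example: mask some characters of a word
    and record which characters the model must recover at which positions."""
    chars = list(word)
    targets = {}
    for i, c in enumerate(chars):
        if random.random() < mask_rate:
            targets[i] = c          # gold character to predict
            chars[i] = mask_token   # masked position seen by the model
    return "".join(chars), targets

# e.g. make_cmlm_example("gesprochen") might return ("ges#rochen", {3: "p"})
```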
arXiv Detail & Related papers (2023-05-26T02:14:34Z)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features of language.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z)
- What Do Compressed Multilingual Machine Translation Models Forget? [102.50127671423752]
We show that the performance of under-represented languages drops significantly, while the average BLEU metric only slightly decreases.
We demonstrate that compression amplifies intrinsic gender and semantic biases, even in high-resource languages.
arXiv Detail & Related papers (2022-05-22T13:54:44Z)
- Improving Translation Robustness with Visual Cues and Error Correction [58.97421756225425]
We introduce the idea of visual context to improve translation robustness against noisy texts.
We also propose a novel error correction training regime by treating error correction as an auxiliary task.
arXiv Detail & Related papers (2021-03-12T15:31:34Z)