Robustification of Multilingual Language Models to Real-world Noise with
Robust Contrastive Pretraining
- URL: http://arxiv.org/abs/2210.04782v1
- Date: Mon, 10 Oct 2022 15:40:43 GMT
- Title: Robustification of Multilingual Language Models to Real-world Noise with
Robust Contrastive Pretraining
- Authors: Asa Cooper Stickland, Sailik Sengupta, Jason Krone, Saab Mansour, He He
- Abstract summary: Prior work that assesses the robustness of neural models on noisy data and suggests improvements is limited to the English language.
To benchmark the performance of pretrained multilingual models, we construct noisy datasets covering five languages and four NLP tasks.
We propose Robust Contrastive Pretraining (RCP) to boost the zero-shot cross-lingual robustness of multilingual pretrained models.
- Score: 14.087882550564169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in neural modeling have achieved state-of-the-art (SOTA) results on
public natural language processing (NLP) benchmarks, at times surpassing human
performance. However, there is a gap between public benchmarks and real-world
applications where noise such as typos or grammatical mistakes is abundant,
resulting in degraded performance. Unfortunately, works that assess the
robustness of neural models on noisy data and suggest improvements are limited
to the English language. Upon analyzing noise in different languages, we
observe that noise types vary across languages and thus require their own
investigation. Thus, to benchmark the performance of pretrained multilingual
models, we construct noisy datasets covering five languages and four NLP tasks.
We see a gap in performance between clean and noisy data. After investigating
ways to boost the zero-shot cross-lingual robustness of multilingual pretrained
models, we propose Robust Contrastive Pretraining (RCP). RCP combines data
augmentation with a contrastive loss term at the pretraining stage and achieves
large improvements on noisy (and original) test data across two sentence-level
classification tasks (+3.2%) and two sequence-labeling tasks (+10 F1-score) in
multilingual settings.