T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text
Classification
- URL: http://arxiv.org/abs/2306.04996v1
- Date: Thu, 8 Jun 2023 07:33:22 GMT
- Title: T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text
Classification
- Authors: Inigo Jauregi Unanue and Gholamreza Haffari and Massimo Piccardi
- Abstract summary: Cross-lingual text classifiers are typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
- Score: 50.675552118811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cross-lingual text classification leverages text classifiers trained in a
high-resource language to perform text classification in other languages with
no or minimal fine-tuning (zero/few-shots cross-lingual transfer). Nowadays,
cross-lingual text classifiers are typically built on large-scale, multilingual
language models (LMs) pretrained on a variety of languages of interest.
However, the performance of these models varies significantly across languages
and classification tasks, suggesting that the superposition of the language
modelling and classification tasks is not always effective. For this reason, in
this paper we propose revisiting the classic "translate-and-test" pipeline to
neatly separate the translation and classification stages. The proposed
approach couples 1) a neural machine translator translating from the target
language to a high-resource language, with 2) a text classifier trained in the
high-resource language; unlike the classic pipeline, the neural machine translator generates "soft"
translations to permit end-to-end backpropagation during fine-tuning of the
pipeline. Extensive experiments have been carried out over three cross-lingual
text classification datasets (XNLI, MLDoc and MultiEURLEX), with the results
showing that the proposed approach significantly improves performance over
a competitive baseline.
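To make the "soft translation" mechanism concrete, below is a minimal, self-contained sketch of the idea. It is not the authors' code: the toy modules, dimensions, and the mean-pooled classification head are illustrative assumptions. It only shows how a translator that emits per-step probability distributions over the high-resource vocabulary can feed a classifier that embeds those distributions, so that the classification loss backpropagates through both stages.

```python
# Sketch only: a stand-in "translator" that outputs per-step distributions over
# the high-resource vocabulary, and a classifier that consumes these soft tokens
# by mixing its own embedding table, keeping the pipeline differentiable
# end to end. Sizes and modules are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = 1000          # assumed high-resource-language vocabulary size
D_MT, D_CLS = 64, 64  # assumed hidden sizes
NUM_CLASSES = 3

class ToyTranslator(nn.Module):
    """Stand-in for the NMT model (a real system would be an encoder-decoder)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MT)
        self.proj = nn.Linear(D_MT, VOCAB)

    def forward(self, src_ids):                    # (batch, src_len)
        h = self.embed(src_ids)                    # (batch, src_len, D_MT)
        return F.softmax(self.proj(h), dim=-1)     # soft translation: (batch, len, VOCAB)

class ToyClassifier(nn.Module):
    """Stand-in for the high-resource-language classifier; it accepts soft tokens."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_CLS)
        self.head = nn.Linear(D_CLS, NUM_CLASSES)

    def forward(self, soft_tokens):                # (batch, len, VOCAB)
        x = soft_tokens @ self.embed.weight        # expected embedding: (batch, len, D_CLS)
        return self.head(x.mean(dim=1))            # (batch, NUM_CLASSES)

translator, classifier = ToyTranslator(), ToyClassifier()
src = torch.randint(0, VOCAB, (2, 7))              # target-language token ids
labels = torch.tensor([0, 2])
loss = F.cross_entropy(classifier(translator(src)), labels)
loss.backward()                                    # gradients reach the translator too
```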
Related papers
- Using Machine Translation to Augment Multilingual Classification [0.0]
We explore the effects of using machine translation to fine-tune a multilingual model for a classification task across multiple languages.
We show that translated data are of sufficient quality to tune multilingual classifiers and that this novel loss technique is able to offer some improvement over models tuned without it.
arXiv Detail & Related papers (2024-05-09T00:31:59Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios [4.631167282648452]
We tackle the task of automatically discriminating between human and machine translations.
We perform experiments in a multilingual setting, considering multiple languages and multilingual pretrained language models.
arXiv Detail & Related papers (2023-05-31T11:41:24Z)
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on the multilingual models rather than the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
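For contrast with the soft pipeline above, here is a minimal sketch of the hard translate-test inference revisited in the entry above: translate the target-language input into English, then apply an English classifier. It assumes the Hugging Face transformers library, and the model names are illustrative choices, not the ones used in the paper.

```python
# Hard translate-test sketch (illustrative models; not the paper's setup).
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
classifier = pipeline("text-classification",
                      model="distilbert-base-uncased-finetuned-sst-2-english")

def translate_test(texts):
    """Classify target-language (here: French) inputs by routing them through English."""
    english = [out["translation_text"] for out in translator(texts)]
    return classifier(english)

print(translate_test(["Ce film était vraiment excellent."]))
```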
- Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work presents what is, to our knowledge, the first systematic study of formality detection methods based on statistical, neural-based, and Transformer-based machine learning methods.
We conducted three types of experiments -- monolingual, multilingual, and cross-lingual.
The study shows that the Char BiLSTM model outperforms Transformer-based ones on the monolingual and multilingual formality classification tasks.
arXiv Detail & Related papers (2022-04-19T16:23:07Z)
- Cross-lingual Text Classification with Heterogeneous Graph Neural Network [2.6936806968297913]
Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages.
Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks.
We propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification.
arXiv Detail & Related papers (2021-05-24T12:45:42Z)
- Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model [58.27176041092891]
Recent research indicates that pretraining cross-lingual language models on large-scale unlabeled texts yields significant performance improvements.
We propose a novel unsupervised feature decomposition method that can automatically extract domain-specific features from the entangled pretrained cross-lingual representations.
Our proposed model leverages mutual information estimation to decompose the representations computed by a cross-lingual model into domain-invariant and domain-specific parts.
arXiv Detail & Related papers (2020-11-23T16:00:42Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
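As an illustration of the cross-attention idea in the VECO entry above, here is a minimal, hedged sketch (not VECO's released architecture) of a Transformer encoder layer extended with a cross-attention sub-layer over a parallel sentence in another language.

```python
# Sketch: an encoder layer with an extra cross-attention sub-layer so tokens in
# one language can attend to a parallel sentence in another language.
# Dimensions and layer arrangement are illustrative assumptions.
import torch
import torch.nn as nn

class CrossLingualEncoderLayer(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, parallel):
        x = self.norm1(x + self.self_attn(x, x, x)[0])                 # usual self-attention
        x = self.norm2(x + self.cross_attn(x, parallel, parallel)[0])  # attend to the other language
        return self.norm3(x + self.ffn(x))

layer = CrossLingualEncoderLayer()
x = torch.randn(2, 10, 64)          # e.g. (masked) tokens in language A
parallel = torch.randn(2, 12, 64)   # aligned sentence in language B
out = layer(x, parallel)            # (2, 10, 64)
```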
- Testing pre-trained Transformer models for Lithuanian news clustering [0.0]
Non-English languages cannot directly leverage such new opportunities from models pre-trained on English text.
We compare pre-trained multilingual BERT, XLM-R, and older learned text representation methods as encodings for the task of Lithuanian news clustering.
Our results indicate that publicly available pre-trained multilingual Transformer models can be fine-tuned to surpass word vectors but still score much lower than specially trained doc2vec embeddings.
arXiv Detail & Related papers (2020-04-03T14:41:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.