Multi-task Learning for Cross-Lingual Sentiment Analysis
- URL: http://arxiv.org/abs/2212.07160v1
- Date: Wed, 14 Dec 2022 11:29:03 GMT
- Title: Multi-task Learning for Cross-Lingual Sentiment Analysis
- Authors: Gaurish Thakkar, Nives Mikelic Preradovic, Marko Tadic
- Abstract summary: The study aims to classify Croatian news articles into positive, negative, and neutral sentiment using a Slovene dataset.
The system is based on a trilingual BERT-based model trained on three languages: English, Slovene, and Croatian.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a cross-lingual sentiment analysis of news articles using
zero-shot and few-shot learning. The study aims to classify Croatian news
articles into positive, negative, and neutral sentiment using a Slovene
dataset. The system is based on a trilingual BERT-based model trained on three
languages: English, Slovene, and Croatian. The paper analyses different setups
using datasets in two languages and proposes a simple multi-task model to
perform sentiment classification. The evaluation is performed in few-shot
and zero-shot scenarios, in single-task and multi-task experiments, for
Croatian and Slovene.
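As an illustration of the multi-task setup described in the abstract, the sketch below shows a shared multilingual encoder with one three-way sentiment head per language. The checkpoint name (EMBEDDIA/crosloengual-bert) and the label ordering are assumptions made for the example, not details confirmed by the abstract.
```python
# Minimal sketch (not the authors' code): shared trilingual encoder,
# one classification head per task/language (Croatian, Slovene).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ENCODER_NAME = "EMBEDDIA/crosloengual-bert"  # assumed trilingual (hr/sl/en) checkpoint

class MultiTaskSentimentModel(nn.Module):
    def __init__(self, encoder_name: str = ENCODER_NAME, num_labels: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One head per task; the encoder parameters are shared across tasks.
        self.heads = nn.ModuleDict({
            "hr": nn.Linear(hidden, num_labels),
            "sl": nn.Linear(hidden, num_labels),
        })

    def forward(self, input_ids, attention_mask, task: str):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]   # [CLS] representation
        return self.heads[task](cls)        # logits for the chosen task

tokenizer = AutoTokenizer.from_pretrained(ENCODER_NAME)
model = MultiTaskSentimentModel()
batch = tokenizer(["Primer novice ..."], return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"], task="sl")
loss = nn.CrossEntropyLoss()(logits, torch.tensor([2]))  # assumed order: 0=neg, 1=neu, 2=pos
```
In training, batches from the Croatian and Slovene datasets would be routed to their respective heads while gradients flow through the shared encoder.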
Related papers
- SOI Matters: Analyzing Multi-Setting Training Dynamics in Pretrained Language Models via Subsets of Interest [5.882817862856554]
This work investigates the impact of multi-task, multi-lingual, and multi-source learning approaches on the robustness and performance of pretrained language models.
Subsets of Interest (SOI) identifies six distinct learning behavior patterns during training, including forgettable examples, unlearned examples, and always correct examples.
Our results demonstrate that multi-source learning consistently improves out-of-distribution performance by up to 7%, while multi-task learning shows mixed results, with notable gains in similar task combinations.
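A minimal sketch of the kind of per-example bookkeeping this suggests: record whether each training example is classified correctly at each epoch, then bucket examples into rough categories. The exact criteria used by the SOI paper are not given in the summary, so the definitions below are assumptions for illustration only.
```python
# Bucket training examples by their per-epoch correctness history
# (assumed definitions, not the paper's exact criteria).
from typing import Dict, List

def categorize(history: Dict[int, List[bool]]) -> Dict[int, str]:
    """history maps example id -> list of per-epoch correctness flags."""
    buckets = {}
    for ex_id, flags in history.items():
        if all(flags):
            buckets[ex_id] = "always_correct"
        elif not any(flags):
            buckets[ex_id] = "unlearned"
        elif any(earlier and not later for earlier, later in zip(flags, flags[1:])):
            buckets[ex_id] = "forgettable"   # correct at some epoch, wrong later
        else:
            buckets[ex_id] = "other"
    return buckets

print(categorize({0: [True, True, True], 1: [False, False], 2: [True, False, True]}))
```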
arXiv Detail & Related papers (2025-07-21T04:43:21Z) - Cross-lingual Few-shot Learning for Persian Sentiment Analysis with Incremental Adaptation [0.0]
This research examines cross-lingual sentiment analysis using few-shot learning and incremental learning methods in Persian.
Three pre-trained multilingual models were employed and fine-tuned using few-shot and incremental learning approaches.
Experimental results show that mDeBERTa and XLM-RoBERTa achieved high performance, reaching 96% accuracy on Persian sentiment analysis.
arXiv Detail & Related papers (2025-07-15T18:13:25Z) - Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task
Strategies for Genre and Framing Detection in Online News [10.435874177179764]
This paper describes the participation of team Hitachi in SemEval-2023 Task 3, "Detecting the genre, the framing, and the persuasion techniques in online news in a multi-lingual setup".
We investigated different cross-lingual and multi-task strategies for training the pretrained language models.
We constructed ensemble models from the results and achieved the highest macro-averaged F1 scores in Italian and Russian genre categorization subtasks.
arXiv Detail & Related papers (2023-03-03T09:12:55Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents [12.493662336994106]
We present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain.
We train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese.
arXiv Detail & Related papers (2022-05-30T12:31:28Z) - Models and Datasets for Cross-Lingual Summarisation [78.56238251185214]
We present a cross-lingual summarisation corpus with long documents in a source language associated with multi-sentence summaries in a target language.
The corpus covers twelve language pairs and directions for four European languages, namely Czech, English, French and German.
We derive cross-lingual document-summary instances from Wikipedia by combining lead paragraphs and article bodies from language-aligned Wikipedia titles.
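A rough sketch of how such language-aligned document-summary pairs could be collected via the public MediaWiki API: the lead paragraph in one language serves as the summary and the body of the aligned article in another language as the document. This illustrates the idea only, not the corpus's actual build script; the title and language codes are placeholders.
```python
# Pair an English article body with the German lead paragraph of its
# language-aligned counterpart, using the MediaWiki query API.
from typing import Optional
import requests

API = "https://{lang}.wikipedia.org/w/api.php"

def extract(title: str, lang: str, intro_only: bool) -> str:
    """Plain-text extract: intro only (summary side) or full text (document side)."""
    params = {"action": "query", "prop": "extracts", "explaintext": 1,
              "titles": title, "format": "json"}
    if intro_only:
        params["exintro"] = 1
    pages = requests.get(API.format(lang=lang), params=params).json()["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

def aligned_title(title: str, src: str, tgt: str) -> Optional[str]:
    """Title of the tgt-language article linked to a src-language article via langlinks."""
    params = {"action": "query", "prop": "langlinks", "lllang": tgt,
              "titles": title, "format": "json"}
    pages = requests.get(API.format(lang=src), params=params).json()["query"]["pages"]
    links = next(iter(pages.values())).get("langlinks", [])
    return links[0]["*"] if links else None

de_title = aligned_title("Machine translation", src="en", tgt="de")
if de_title:
    pair = {"document": extract("Machine translation", "en", intro_only=False),
            "summary": extract(de_title, "de", intro_only=True)}
```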
arXiv Detail & Related papers (2022-02-19T11:55:40Z) - IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and
Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z) - A cost-benefit analysis of cross-lingual transfer methods [2.2220235662428425]
Cross-lingual transfer involves fine-tuning a bilingual or multilingual model on a supervised dataset in one language and evaluating it on another language in a zero-shot manner.
We analyze cross-lingual methods in terms of their effectiveness, development and deployment costs, as well as their latencies at inference time.
By combining zero-shot and translation methods, we achieve the state-of-the-art in two of the three datasets used in this work.
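The zero-shot transfer recipe described in this entry can be sketched as follows: fine-tune a multilingual encoder on labelled data in one language, then evaluate it directly on another language without any target-language training examples. The model name, toy data, and hyper-parameters below are illustrative assumptions.
```python
# Zero-shot cross-lingual transfer sketch: train on source-language data,
# evaluate on target-language data that was never seen during training.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Toy stand-ins: English training examples, Slovene evaluation examples.
train_ds = Dataset.from_dict({"text": ["great product", "terrible service"], "label": [2, 0]})
target_ds = Dataset.from_dict({"text": ["odlična storitev", "slaba izkušnja"], "label": [2, 0]})

def encode(batch):
    return tok(batch["text"], truncation=True, padding="max_length", max_length=128)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3, per_device_train_batch_size=16),
    train_dataset=train_ds.map(encode, batched=True),
    eval_dataset=target_ds.map(encode, batched=True),
)
trainer.train()
metrics = trainer.evaluate()  # zero-shot: no target-language data used for training
```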
arXiv Detail & Related papers (2021-05-14T13:21:12Z) - InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language
Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models.
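A toy sketch of the contrastive idea named above: embeddings of a bilingual sentence pair are treated as two views of the same meaning and scored against in-batch negatives with an InfoNCE-style loss. This illustrates the general formulation only; it is not InfoXLM's exact objective.
```python
# InfoNCE-style contrastive loss over parallel sentence embeddings:
# row i of src_emb and tgt_emb are assumed to be translations of each other.
import torch
import torch.nn.functional as F

def infonce_loss(src_emb: torch.Tensor, tgt_emb: torch.Tensor, temperature: float = 0.05):
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature                      # all pairwise similarities
    labels = torch.arange(src.size(0), device=src.device)   # the true translation is the positive
    return F.cross_entropy(logits, labels)

loss = infonce_loss(torch.randn(8, 768), torch.randn(8, 768))
```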
arXiv Detail & Related papers (2020-07-15T16:58:01Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating
Cross-lingual Generalization [128.37244072182506]
XTREME (the Cross-lingual TRansfer Evaluation of Multilingual Encoders benchmark) evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)