Multi-task Learning for Cross-Lingual Sentiment Analysis
- URL: http://arxiv.org/abs/2212.07160v1
- Date: Wed, 14 Dec 2022 11:29:03 GMT
- Title: Multi-task Learning for Cross-Lingual Sentiment Analysis
- Authors: Gaurish Thakkar, Nives Mikelic Preradovic, Marko Tadic
- Abstract summary: The study aims to classify Croatian news articles into positive, negative, and neutral sentiment using a Slovene dataset.
The system is based on a trilingual BERT-based model trained on three languages: English, Slovene, and Croatian.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a cross-lingual sentiment analysis of news articles using
zero-shot and few-shot learning. The study aims to classify Croatian news
articles into positive, negative, and neutral sentiment using a Slovene
dataset. The system is based on a trilingual BERT-based model trained on three
languages: English, Slovene, and Croatian. The paper analyses different setups
using datasets in two languages and proposes a simple multi-task model to
perform sentiment classification. The evaluation is performed in few-shot
and zero-shot scenarios, in single-task and multi-task experiments, for
Croatian and Slovene.
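As an illustration of the multi-task setup described in the abstract, the sketch below shows a shared multilingual encoder with one three-way sentiment head per language. The checkpoint name (EMBEDDIA/crosloengual-bert) and the label ordering are assumptions made for the example, not details confirmed by the abstract.
```python
# Minimal sketch (not the authors' code): shared trilingual encoder,
# one classification head per task/language (Croatian, Slovene).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

ENCODER_NAME = "EMBEDDIA/crosloengual-bert"  # assumed trilingual (hr/sl/en) checkpoint

class MultiTaskSentimentModel(nn.Module):
    def __init__(self, encoder_name: str = ENCODER_NAME, num_labels: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # One head per task; the encoder parameters are shared across tasks.
        self.heads = nn.ModuleDict({
            "hr": nn.Linear(hidden, num_labels),
            "sl": nn.Linear(hidden, num_labels),
        })

    def forward(self, input_ids, attention_mask, task: str):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]   # [CLS] representation
        return self.heads[task](cls)        # logits for the chosen task

tokenizer = AutoTokenizer.from_pretrained(ENCODER_NAME)
model = MultiTaskSentimentModel()
batch = tokenizer(["Primer novice ..."], return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"], task="sl")
loss = nn.CrossEntropyLoss()(logits, torch.tensor([2]))  # assumed order: 0=neg, 1=neu, 2=pos
```
In training, batches from the Croatian and Slovene datasets would be routed to their respective heads while gradients flow through the shared encoder.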
Related papers
- SOI Matters: Analyzing Multi-Setting Training Dynamics in Pretrained Language Models via Subsets of Interest [5.882817862856554]
This work investigates the impact of multi-task, multi-lingual, and multi-source learning approaches on the robustness and performance of pretrained language models.
Subsets of Interest (SOI) identifies six distinct learning behavior patterns during training, including forgettable examples, unlearned examples, and always correct examples.
Our results demonstrate that multi-source learning consistently improves out-of-distribution performance by up to 7%, while multi-task learning shows mixed results, with notable gains in similar task combinations.
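A minimal sketch of the kind of per-example bookkeeping this suggests: record whether each training example is classified correctly at each epoch, then bucket examples into rough categories. The exact criteria used by the SOI paper are not given in the summary, so the definitions below are assumptions for illustration only.
```python
# Bucket training examples by their per-epoch correctness history
# (assumed definitions, not the paper's exact criteria).
from typing import Dict, List

def categorize(history: Dict[int, List[bool]]) -> Dict[int, str]:
    """history maps example id -> list of per-epoch correctness flags."""
    buckets = {}
    for ex_id, flags in history.items():
        if all(flags):
            buckets[ex_id] = "always_correct"
        elif not any(flags):
            buckets[ex_id] = "unlearned"
        elif any(earlier and not later for earlier, later in zip(flags, flags[1:])):
            buckets[ex_id] = "forgettable"   # correct at some epoch, wrong later
        else:
            buckets[ex_id] = "other"
    return buckets

print(categorize({0: [True, True, True], 1: [False, False], 2: [True, False, True]}))
```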
arXiv Detail & Related papers (2025-07-21T04:43:21Z) - Cross-lingual Few-shot Learning for Persian Sentiment Analysis with Incremental Adaptation [0.0]
This research examines cross-lingual sentiment analysis using few-shot learning and incremental learning methods in Persian.
Three pre-trained multilingual models were employed and fine-tuned using few-shot and incremental learning approaches.
Experimental results show that mDeBERTa and XLM-RoBERTa achieved high performance, reaching 96% accuracy on Persian sentiment analysis.
arXiv Detail & Related papers (2025-07-15T18:13:25Z) - Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models.
We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z) - Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task
Strategies for Genre and Framing Detection in Online News [10.435874177179764]
This paper describes the participation of team Hitachi in SemEval-2023 Task 3, "Detecting the genre, the framing, and the persuasion techniques in online news in a multi-lingual setup".
We investigated different cross-lingual and multi-task strategies for training the pretrained language models.
We constructed ensemble models from the results and achieved the highest macro-averaged F1 scores in Italian and Russian genre categorization subtasks.
arXiv Detail & Related papers (2023-03-03T09:12:55Z) - Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z) - X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents [12.493662336994106]
We present an abstractive cross-lingual summarization dataset for four different languages in the scholarly domain.
We train and evaluate models that process English papers and generate summaries in German, Italian, Chinese and Japanese.
arXiv Detail & Related papers (2022-05-30T12:31:28Z) - Models and Datasets for Cross-Lingual Summarisation [78.56238251185214]
We present a cross-lingual summarisation corpus with long documents in a source language associated with multi-sentence summaries in a target language.
The corpus covers twelve language pairs and directions for four European languages, namely Czech, English, French and German.
We derive cross-lingual document-summary instances from Wikipedia by combining lead paragraphs and article bodies from language-aligned Wikipedia titles.
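A rough sketch of how such language-aligned document-summary pairs could be collected via the public MediaWiki API: the lead paragraph in one language serves as the summary and the body of the aligned article in another language as the document. This illustrates the idea only, not the corpus's actual build script; the title and language codes are placeholders.
```python
# Pair an English article body with the German lead paragraph of its
# language-aligned counterpart, using the MediaWiki query API.
from typing import Optional
import requests

API = "https://{lang}.wikipedia.org/w/api.php"

def extract(title: str, lang: str, intro_only: bool) -> str:
    """Plain-text extract: intro only (summary side) or full text (document side)."""
    params = {"action": "query", "prop": "extracts", "explaintext": 1,
              "titles": title, "format": "json"}
    if intro_only:
        params["exintro"] = 1
    pages = requests.get(API.format(lang=lang), params=params).json()["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

def aligned_title(title: str, src: str, tgt: str) -> Optional[str]:
    """Title of the tgt-language article linked to a src-language article via langlinks."""
    params = {"action": "query", "prop": "langlinks", "lllang": tgt,
              "titles": title, "format": "json"}
    pages = requests.get(API.format(lang=src), params=params).json()["query"]["pages"]
    links = next(iter(pages.values())).get("langlinks", [])
    return links[0]["*"] if links else None

de_title = aligned_title("Machine translation", src="en", tgt="de")
if de_title:
    pair = {"document": extract("Machine translation", "en", intro_only=False),
            "summary": extract(de_title, "de", intro_only=True)}
```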
arXiv Detail & Related papers (2022-02-19T11:55:40Z) - IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and
Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z) - A cost-benefit analysis of cross-lingual transfer methods [2.2220235662428425]
Cross-lingual transfer involves fine-tuning a bilingual or multilingual model on a supervised dataset in one language and evaluating it on another language in a zero-shot manner.
We analyze cross-lingual methods in terms of their effectiveness, development and deployment costs, as well as their latencies at inference time.
By combining zero-shot and translation methods, we achieve the state-of-the-art in two of the three datasets used in this work.
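The zero-shot transfer recipe described in this entry can be sketched as follows: fine-tune a multilingual encoder on labelled data in one language, then evaluate it directly on another language without any target-language training examples. The model name, toy data, and hyper-parameters below are illustrative assumptions.
```python
# Zero-shot cross-lingual transfer sketch: train on source-language data,
# evaluate on target-language data that was never seen during training.
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Toy stand-ins: English training examples, Slovene evaluation examples.
train_ds = Dataset.from_dict({"text": ["great product", "terrible service"], "label": [2, 0]})
target_ds = Dataset.from_dict({"text": ["odlična storitev", "slaba izkušnja"], "label": [2, 0]})

def encode(batch):
    return tok(batch["text"], truncation=True, padding="max_length", max_length=128)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3, per_device_train_batch_size=16),
    train_dataset=train_ds.map(encode, batched=True),
    eval_dataset=target_ds.map(encode, batched=True),
)
trainer.train()
metrics = trainer.evaluate()  # zero-shot: no target-language data used for training
```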
arXiv Detail & Related papers (2021-05-14T13:21:12Z) - InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language
Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models.
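A toy sketch of the contrastive idea named above: embeddings of a bilingual sentence pair are treated as two views of the same meaning and scored against in-batch negatives with an InfoNCE-style loss. This illustrates the general formulation only; it is not InfoXLM's exact objective.
```python
# InfoNCE-style contrastive loss over parallel sentence embeddings:
# row i of src_emb and tgt_emb are assumed to be translations of each other.
import torch
import torch.nn.functional as F

def infonce_loss(src_emb: torch.Tensor, tgt_emb: torch.Tensor, temperature: float = 0.05):
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature                      # all pairwise similarities
    labels = torch.arange(src.size(0), device=src.device)   # the true translation is the positive
    return F.cross_entropy(logits, labels)

loss = infonce_loss(torch.randn(8, 768), torch.randn(8, 768))
```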
arXiv Detail & Related papers (2020-07-15T16:58:01Z) - XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating
Cross-lingual Generalization [128.37244072182506]
XTREME (the Cross-lingual TRansfer Evaluation of Multilingual Encoders benchmark) evaluates the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)