Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding
- URL: http://arxiv.org/abs/2412.17837v2
- Date: Fri, 03 Jan 2025 12:32:35 GMT
- Title: Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding
- Authors: Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Grigori Sidorov, Dietrich Klakow, Philipp Slusallek, Olga Kolesnikova, Seid Muhie Yimam,
- Abstract summary: We present EthioEmo, a multi-label emotion classification dataset for four Ethiopian languages.
We perform extensive experiments with an additional English multi-label emotion dataset from SemEval 2018 Task 1.
The results show that accurate multi-label emotion classification is still insufficient even for high-resource languages.
- Score: 20.581470997286146
- License:
- Abstract: Large Language Models (LLMs) show promising learning and reasoning abilities. Compared to other NLP tasks, multilingual and multi-label emotion evaluation tasks are under-explored in LLMs. In this paper, we present EthioEmo, a multi-label emotion classification dataset for four Ethiopian languages, namely, Amharic (amh), Afan Oromo (orm), Somali (som), and Tigrinya (tir). We perform extensive experiments with an additional English multi-label emotion dataset from SemEval 2018 Task 1. Our evaluation includes encoder-only, encoder-decoder, and decoder-only language models. We compare zero and few-shot approaches of LLMs to fine-tuning smaller language models. The results show that accurate multi-label emotion classification is still insufficient even for high-resource languages such as English, and there is a large gap between the performance of high-resource and low-resource languages. The results also show varying performance levels depending on the language and model type. EthioEmo is available publicly to further improve the understanding of emotions in language models and how people convey emotions through various languages.
Related papers
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER, a collection of emotion-annotated datasets in 28 different languages.
We describe the data collection and annotation processes and the challenges of building these datasets.
We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z) - Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model [14.39119862985503]
We aim to create a multilingual ALT system with available datasets.
Inspired by architectures that have been proven effective for English ALT, we adapt these techniques to the multilingual scenario.
We evaluate the performance of the multilingual model in comparison to its monolingual counterparts.
arXiv Detail & Related papers (2024-06-25T15:02:32Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - Multilingual Large Language Models and Curse of Multilinguality [4.096453902709292]
Large Language Models (LLMs) have gained large popularity among Natural Language Processing (NLP) researchers and practitioners.
This paper navigates the landscape of multilingual LLMs, providing an introductory overview of their technical aspects.
It explains underlying architectures, objective functions, pre-training data sources, and tokenization methods.
arXiv Detail & Related papers (2024-06-15T11:31:39Z) - Parrot: Multilingual Visual Instruction Tuning [66.65963606552839]
Existing methods mainly focus on aligning vision encoders with Multimodal Large Language Models (MLLMs)
We introduce Parrot, a novel method that utilizes textual guidance to drive visual token alignment at the language level.
Our method not only demonstrates state-of-the-art performance on multilingual MMBench and MMMB, but also excels across a broad range of multimodal tasks.
arXiv Detail & Related papers (2024-06-04T17:56:28Z) - EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation [24.060772057458685]
This paper introduces EthioLLM -- multilingual large language models for five Ethiopian languages (Amharic, Ge'ez, Afan Oromo, Somali, and Tigrinya) and English.
We evaluate the performance of these models across five downstream Natural Language Processing (NLP) tasks.
arXiv Detail & Related papers (2024-03-20T16:43:42Z) - Sociolinguistically Informed Interpretability: A Case Study on Hinglish
Emotion Classification [8.010713141364752]
We study the effect of language on emotion prediction across 3 PLMs on a Hinglish emotion classification dataset.
We find that models do learn these associations between language choice and emotional expression.
Having code-mixed data present in the pre-training can augment that learning when task-specific data is scarce.
arXiv Detail & Related papers (2024-02-05T16:05:32Z) - Zero-shot Sentiment Analysis in Low-Resource Languages Using a
Multilingual Sentiment Lexicon [78.12363425794214]
We focus on zero-shot sentiment analysis tasks across 34 languages, including 6 high/medium-resource languages, 25 low-resource languages, and 3 code-switching datasets.
We demonstrate that pretraining using multilingual lexicons, without using any sentence-level sentiment data, achieves superior zero-shot performance compared to models fine-tuned on English sentiment datasets.
arXiv Detail & Related papers (2024-02-03T10:41:05Z) - AM2iCo: Evaluating Word Meaning in Context across Low-ResourceLanguages
with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.