Optimism, Expectation, or Sarcasm? Multi-Class Hope Speech Detection in Spanish and English
- URL: http://arxiv.org/abs/2504.17974v1
- Date: Thu, 24 Apr 2025 23:00:46 GMT
- Title: Optimism, Expectation, or Sarcasm? Multi-Class Hope Speech Detection in Spanish and English
- Authors: Sabur Butt, Fazlourrahman Balouchzahi, Ahmad Imam Amjad, Maaz Amjad, Hector G. Ceballos, Salud Maria Jimenez-Zafra
- Abstract summary: This study introduces PolyHope V2, a multilingual, fine-grained hope speech dataset comprising over 30,000 annotated tweets in English and Spanish. This resource distinguishes between four hope subtypes (Generalized, Realistic, Unrealistic, and Sarcastic) and enhances existing datasets by explicitly labeling sarcastic instances.
- Score: 2.424469485586727
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Hope is a complex and underexplored emotional state that plays a significant role in education, mental health, and social interaction. Unlike basic emotions, hope manifests in nuanced forms ranging from grounded optimism to exaggerated wishfulness or sarcasm, making it difficult for Natural Language Processing systems to detect accurately. This study introduces PolyHope V2, a multilingual, fine-grained hope speech dataset comprising over 30,000 annotated tweets in English and Spanish. This resource distinguishes between four hope subtypes (Generalized, Realistic, Unrealistic, and Sarcastic) and enhances existing datasets by explicitly labeling sarcastic instances. We benchmark multiple pretrained transformer models and compare them with large language models (LLMs) such as GPT-4 and Llama 3 under zero-shot and few-shot regimes. Our findings show that fine-tuned transformers outperform prompt-based LLMs, especially in distinguishing nuanced hope categories and sarcasm. Through qualitative analysis and confusion matrices, we highlight systematic challenges in separating closely related hope subtypes. The dataset and results provide a robust foundation for future emotion recognition tasks that demand greater semantic and contextual sensitivity across languages.
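The benchmark described above pits fine-tuned transformer encoders against prompt-based LLMs. Below is a minimal sketch of the fine-tuning side, assuming a Hugging Face workflow; the base model (xlm-roberta-base), hyperparameters, label names, and the CSV files polyhope_v2_train.csv / polyhope_v2_dev.csv with "text" and "label" columns are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch (not the authors' code) of the fine-tuning baseline: a
# multilingual encoder fine-tuned for 4-class hope speech classification.
# Model, hyperparameters, label names, and CSV layout are assumptions.
import pandas as pd
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["Generalized Hope", "Realistic Hope", "Unrealistic Hope", "Sarcasm"]
label2id = {name: i for i, name in enumerate(LABELS)}
id2label = {i: name for name, i in label2id.items()}

MODEL_NAME = "xlm-roberta-base"  # multilingual encoder covering English and Spanish
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS), id2label=id2label, label2id=label2id)

def load_split(path):
    # Hypothetical CSV layout: one tweet per row, columns "text" and "label".
    df = pd.read_csv(path)
    df["label"] = df["label"].map(label2id)
    return Dataset.from_pandas(df[["text", "label"]], preserve_index=False)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_ds = load_split("polyhope_v2_train.csv").map(tokenize, batched=True)
dev_ds = load_split("polyhope_v2_dev.csv").map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="hope-clf",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

# Passing the tokenizer lets the Trainer pad each batch dynamically.
trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                  eval_dataset=dev_ds, tokenizer=tokenizer)
trainer.train()
print(trainer.evaluate())
```

For the zero- and few-shot comparison, the same four label definitions would instead be placed in a prompt to GPT-4 or Llama 3 and the predicted label parsed from the model's response; macro-averaged F1 over the four classes is the natural metric for this fine-grained setting.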
Related papers
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER -- a collection of multi-labeled datasets in 28 different languages.
We describe the data collection and annotation processes and the challenges of building these datasets.
We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z) - Sarcasm Detection in a Less-Resourced Language [0.0]
We build a sarcasm detection dataset for a less-resourced language, namely Slovenian.
We leverage two modern techniques: a medium-size transformer model specific to machine-translated data, and a very large generative language model.
The results show that larger models generally outperform smaller ones and that ensembling can slightly improve sarcasm detection performance.
arXiv Detail & Related papers (2024-10-16T16:10:59Z) - Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [63.32199372362483]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE. In particular, we first propose a lexicon-guided utterance sentiment inference module, where an utterance sentiment refinement strategy is devised. We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
arXiv Detail & Related papers (2024-02-06T03:14:46Z) - SAIDS: A Novel Approach for Sentiment Analysis Informed of Dialect and Sarcasm [0.0]
This paper introduces a novel system (SAIDS) that predicts the sentiment, sarcasm and dialect of Arabic tweets.
By training all tasks together, SAIDS achieves 75.98 FPN, 59.09 F1-score, and 71.13 F1-score for sentiment analysis, sarcasm detection, and dialect identification, respectively.
arXiv Detail & Related papers (2023-01-06T14:19:46Z) - Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues [40.80696210030204]
We propose MOSES, a deep neural network, which takes a multimodal (sarcastic) dialogue instance as an input and generates a natural language sentence as its explanation.
We leverage the generated explanation for various natural language understanding tasks in a conversational dialogue setup, such as sarcasm detection, humour identification, and emotion recognition.
Our evaluation shows that MOSES outperforms the state-of-the-art system for sarcasm explanation in dialogue (SED) by an average of 2% on different evaluation metrics.
arXiv Detail & Related papers (2022-11-20T18:05:43Z) - PolyHope: Two-Level Hope Speech Detection from Tweets [68.8204255655161]
Despite its importance, hope has rarely been studied as a social media analysis task.
This paper presents a hope speech dataset that classifies each tweet first into "Hope" and "Not Hope", and then into fine-grained hope categories (Generalized, Realistic, and Unrealistic Hope).
English tweets in the first half of 2022 were collected to build this dataset.
arXiv Detail & Related papers (2022-10-25T16:34:03Z) - IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z) - Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z) - AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z) - Interpretable Multi-Head Self-Attention model for Sarcasm Detection in social media [0.0]
The inherent ambiguity of sarcastic expressions makes sarcasm detection very difficult.
We develop an interpretable deep learning model using multi-head self-attention and gated recurrent units (a minimal architectural sketch follows this list).
We show the effectiveness of our approach by achieving state-of-the-art results on multiple datasets.
arXiv Detail & Related papers (2021-01-14T21:39:35Z)
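The last entry above pairs multi-head self-attention with gated recurrent units for sarcasm detection. The following is a minimal sketch of such an architecture, assuming a PyTorch implementation; the layer sizes, bidirectional GRU, and pooling choice are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch (not the authors' implementation) of a sarcasm classifier that
# combines multi-head self-attention with gated recurrent units.
# All dimensions and the pooling strategy are assumptions for illustration.
import torch
import torch.nn as nn

class SarcasmClassifier(nn.Module):
    def __init__(self, vocab_size=30000, embed_dim=128, hidden_dim=128,
                 num_heads=4, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Multi-head self-attention over token embeddings; the returned
        # attention weights can be inspected for interpretability.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Bidirectional GRU to model sequential context of the attended tokens.
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                          bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)                 # (batch, seq, embed)
        attended, attn_weights = self.attn(x, x, x)   # self-attention
        _, h = self.gru(attended)                     # h: (2, batch, hidden)
        pooled = torch.cat([h[0], h[1]], dim=-1)      # concat both directions
        return self.classifier(pooled), attn_weights

# Toy usage: a batch of 2 padded sequences of 16 token ids each.
model = SarcasmClassifier()
logits, weights = model(torch.randint(1, 30000, (2, 16)))
print(logits.shape, weights.shape)  # torch.Size([2, 2]) torch.Size([2, 16, 16])
```

Inspecting the per-token attention weights is one simple way to surface which words drive a "sarcastic" prediction, which is the kind of interpretability the entry refers to.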
This list is automatically generated from the titles and abstracts of the papers in this site.