Related papers: Generalizable Sarcasm Detection Is Just Around The Corner, Of Course!

URL: http://arxiv.org/abs/2404.06357v2
Date: Wed, 10 Apr 2024 07:48:08 GMT
Title: Generalizable Sarcasm Detection Is Just Around The Corner, Of Course!
Authors: Hyewon Jang, Diego Frassinelli
Abstract summary: We tested the robustness of sarcasm detection models by examining their behavior when fine-tuned on four sarcasm datasets. For intra-dataset predictions, models consistently performed better when fine-tuned with third-party labels. For cross-dataset predictions, most models failed to generalize well to the other datasets.
Score: 3.1245838179647576
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We tested the robustness of sarcasm detection models by examining their behavior when fine-tuned on four sarcasm datasets containing varying characteristics of sarcasm: label source (authors vs. third-party), domain (social media/online vs. offline conversations/dialogues), style (aggressive vs. humorous mocking). We tested their prediction performance on the same dataset (intra-dataset) and across different datasets (cross-dataset). For intra-dataset predictions, models consistently performed better when fine-tuned with third-party labels rather than with author labels. For cross-dataset predictions, most models failed to generalize well to the other datasets, implying that one type of dataset cannot represent all sorts of sarcasm with different styles and domains. Compared to the existing datasets, models fine-tuned on the new dataset we release in this work showed the highest generalizability to other datasets. With a manual inspection of the datasets and post-hoc analysis, we attributed the difficulty in generalization to the fact that sarcasm actually comes in different domains and styles. We argue that future sarcasm research should take the broad scope of sarcasm into account.
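The intra-dataset versus cross-dataset protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`cross_dataset_grid`, `split_intra_cross`), the majority-class stand-in for a fine-tuned model, and the use of plain accuracy as the metric are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]            # (text, label), 1 = sarcastic, 0 = not
Dataset = Dict[str, List[Example]]   # {"train": [...], "test": [...]}


def train_majority(train: List[Example]) -> int:
    """Hypothetical stand-in for fine-tuning: predict the majority train label."""
    positives = sum(label for _, label in train)
    return 1 if positives * 2 >= len(train) else 0


def accuracy(model: int, test: List[Example]) -> float:
    """Fraction of test examples whose label equals the constant prediction."""
    return sum(1 for _, label in test if label == model) / len(test)


def cross_dataset_grid(
    datasets: Dict[str, Dataset],
    train_fn: Callable[[List[Example]], int] = train_majority,
    score_fn: Callable[[int, List[Example]], float] = accuracy,
) -> Dict[Tuple[str, str], float]:
    """Train once per source dataset, then score on every dataset's test split."""
    grid = {}
    for source, source_data in datasets.items():
        model = train_fn(source_data["train"])
        for target, target_data in datasets.items():
            grid[(source, target)] = score_fn(model, target_data["test"])
    return grid


def split_intra_cross(grid: Dict[Tuple[str, str], float]):
    """Separate same-dataset scores (diagonal) from cross-dataset scores."""
    intra = {s: v for (s, t), v in grid.items() if s == t}
    cross = {k: v for k, v in grid.items() if k[0] != k[1]}
    return intra, cross
```

With four datasets the grid has 16 cells: the 4 diagonal cells are the intra-dataset scores and the 12 off-diagonal cells are the cross-dataset scores. In practice `train_fn` would wrap an actual fine-tuning run.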

Related papers

Leveraging Large Language Models for Sarcastic Speech Annotation in Sarcasm Detection [16.35106164874197]
Sarcasm fundamentally alters meaning through tone and context, yet detecting it in speech remains a challenge due to data scarcity. We propose an annotation pipeline that leverages large language models (LLMs) to generate a sarcasm dataset. We validate this approach by comparing annotation quality and detection performance on a publicly available sarcasm dataset. Finally, we introduce PodSarc, a large-scale sarcastic speech dataset created through this pipeline.
arXiv Detail & Related papers (2025-06-01T11:00:18Z)
Sarcasm Detection in a Less-Resourced Language [0.0]
We build a sarcasm detection dataset for a less-resourced language, Slovenian. We leverage two modern techniques: a medium-sized transformer model specific to machine translation, and a very large generative language model. The results show that larger models generally outperform smaller ones and that ensembling can slightly improve sarcasm detection performance.
arXiv Detail & Related papers (2024-10-16T16:10:59Z)
Diffusion Models as Data Mining Tools [87.77999285241219]
This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining. We show that after finetuning conditional diffusion models to synthesize images from a specific dataset, we can use these models to define a typicality measure. This measure assesses how typical visual elements are for different data labels, such as geographic location, time stamps, semantic labels, or even the presence of a disease.
arXiv Detail & Related papers (2024-07-20T17:14:31Z)
KoCoSa: Korean Context-aware Sarcasm Detection Dataset [3.369750569233713]
Sarcasm is a form of verbal irony where someone says the opposite of what they mean, often to ridicule a person, situation, or idea. In this paper, we introduce a new dataset for the Korean dialogue sarcasm detection task, KoCoSa. The dataset consists of 12.8K daily Korean dialogues, with the labels for this task on the last response.
arXiv Detail & Related papers (2024-02-22T10:17:57Z)
Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats [80.12253291709673]
We propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks. Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model.
arXiv Detail & Related papers (2022-12-29T22:22:49Z)
Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm. Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
Sarcasm Detection in Twitter -- Performance Impact when using Data Augmentation: Word Embeddings [0.0]
Sarcasm is the use of words to mock or annoy someone, or for humorous purposes. We propose a contextual model for sarcasm identification on Twitter using RoBERTa and dataset augmentation. We achieve a performance gain of 3.2% on the iSarcasm dataset when using data augmentation to increase the data labeled as sarcastic by 20%.
arXiv Detail & Related papers (2021-08-23T04:24:12Z)
Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples. We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models. We also observe that the span selection task format, which is used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases. Our method trains a lower capacity model in an ensemble with a higher capacity model. We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content. The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z)
Sarcasm Detection using Context Separators in Online Discourse [3.655021726150369]
Sarcasm is an intricate form of speech, where meaning is conveyed implicitly. In this work, we use RoBERTa_large to detect sarcasm in two datasets. We also assert the importance of context in improving the performance of contextual word embedding models.
arXiv Detail & Related papers (2020-06-01T10:52:35Z)