Related papers: KoCoSa: Korean Context-aware Sarcasm Detection Dataset

URL: http://arxiv.org/abs/2402.14428v2
Date: Fri, 22 Mar 2024 06:29:26 GMT
Title: KoCoSa: Korean Context-aware Sarcasm Detection Dataset
Authors: Yumin Kim, Heejae Suh, Mingi Kim, Dongyeon Won, Hwanhee Lee
Abstract summary: Sarcasm is a way of verbal irony where someone says the opposite of what they mean, often to ridicule a person, situation, or idea. In this paper, we introduce a new dataset for the Korean dialogue sarcasm detection task, KoCoSa. The dataset consists of 12.8K daily Korean dialogues and the labels for this task on the last response.
Score: 3.369750569233713
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sarcasm is a way of verbal irony where someone says the opposite of what they mean, often to ridicule a person, situation, or idea. It is often difficult to detect sarcasm in the dialogue since detecting sarcasm should reflect the context (i.e., dialogue history). In this paper, we introduce a new dataset for the Korean dialogue sarcasm detection task, KoCoSa (Korean Context-aware Sarcasm Detection Dataset), which consists of 12.8K daily Korean dialogues and the labels for this task on the last response. To build the dataset, we propose an efficient sarcasm detection dataset generation pipeline: 1) generating new sarcastic dialogues from source dialogues with large language models, 2) automatic and manual filtering of abnormal and toxic dialogues, and 3) human annotation for the sarcasm detection task. We also provide a simple but effective baseline for the Korean sarcasm detection task trained on our dataset. Experimental results on the dataset show that our baseline system outperforms strong baselines like large language models, such as GPT-3.5, in the Korean sarcasm detection task. We show that the sarcasm detection task relies deeply on the existence of sufficient context. We will release the dataset at https://github.com/Yu-billie/KoCoSa_sarcasm_detection.

Related papers

Leveraging Large Language Models for Sarcastic Speech Annotation in Sarcasm Detection [16.35106164874197]
Sarcasm fundamentally alters meaning through tone and context, yet detecting it in speech remains a challenge due to data scarcity. We propose an annotation pipeline that leverages large language models (LLMs) to generate a sarcasm dataset. We validate this approach by comparing annotation quality and detection performance on a publicly available sarcasm dataset. Finally, we introduce PodSarc, a large-scale sarcastic speech dataset created through this pipeline.
arXiv Detail & Related papers (2025-06-01T11:00:18Z)
Sentiment-enhanced Graph-based Sarcasm Explanation in Dialogue [67.09698638709065]
We propose a novel sEntiment-enhanceD Graph-based multimodal sarcasm Explanation framework, named EDGE. In particular, we first propose a lexicon-guided utterance sentiment inference module, where an utterance sentiment refinement strategy is devised. We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
arXiv Detail & Related papers (2024-02-06T03:14:46Z)
Sarcasm Detection in a Disaster Context [103.93691731605163]
We introduce HurricaneSARC, a dataset of 15,000 tweets annotated for intended sarcasm. Our best model is able to obtain as much as 0.70 F1 on our dataset.
arXiv Detail & Related papers (2023-08-16T05:58:12Z)
Researchers eye-view of sarcasm detection in social media textual content [0.0]
Enormous use of sarcastic text in all forms of communication in social media will have a physiological effect on target users. This paper discusses various sarcasm detection techniques and concludes with some approaches, related datasets with optimal features.
arXiv Detail & Related papers (2023-04-17T19:45:10Z)
Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm. Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation [62.89586083449108]
We study a new problem of cross-modal sarcasm generation (CMSG), i.e., generating a sarcastic description for a given image. CMSG is challenging as models need to satisfy the characteristics of sarcasm, as well as the correlation between different modalities. We propose an Extraction-Generation-Ranking based Modular method (EGRM) for cross-modal sarcasm generation.
arXiv Detail & Related papers (2022-11-20T14:38:24Z)
Computational Sarcasm Analysis on Social Media: A Systematic Review [0.23488056916440855]
Sarcasm can be defined as saying or writing the opposite of what one truly wants to express, usually to insult, irritate, or amuse someone. Because of the obscure nature of sarcasm in textual data, detecting it is difficult and of great interest to the sentiment analysis research community.
arXiv Detail & Related papers (2022-09-13T17:20:19Z)
Sarcasm Detection and Quantification in Arabic Tweets [7.173484352846755]
This paper intends to create a new humanly annotated Arabic corpus for sarcasm detection collected from tweets. The proposed approach tackles the problem as a regression problem instead of classification.
arXiv Detail & Related papers (2021-08-03T11:48:27Z)
Parallel Deep Learning-Driven Sarcasm Detection from Pop Culture Text and English Humor Literature [0.76146285961466]
We manually extract the sarcastic word distribution features of a benchmark pop culture sarcasm corpus. We generate input sequences formed of the weighted vectors from such words. Our proposed model for detecting sarcasm peaks at a training accuracy of 98.95% when trained with the discussed dataset.
arXiv Detail & Related papers (2021-06-10T14:01:07Z)
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context [55.898436183096614]
We present a novel data augmentation technique, CRA (Contextual Response Augmentation), which utilizes conversational context to generate meaningful samples for training. Specifically, our proposed model, trained with the proposed data augmentation technique, participated in the sarcasm detection task of FigLang2020, won, and achieved the best performance on both the Reddit and Twitter datasets.
arXiv Detail & Related papers (2020-06-11T09:00:11Z)
Sarcasm Detection using Context Separators in Online Discourse [3.655021726150369]
Sarcasm is an intricate form of speech, where meaning is conveyed implicitly. In this work, we use RoBERTa_large to detect sarcasm in two datasets. We also assert the importance of context in improving the performance of contextual word embedding models.
arXiv Detail & Related papers (2020-06-01T10:52:35Z)
$R^3$: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [51.70688120849654]
We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence. Our method employs a retrieve-and-edit framework to instantiate two major characteristics of sarcasm.
arXiv Detail & Related papers (2020-04-28T02:30:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.