BanglaSarc: A Dataset for Sarcasm Detection
- URL: http://arxiv.org/abs/2209.13461v1
- Date: Tue, 27 Sep 2022 15:28:21 GMT
- Title: BanglaSarc: A Dataset for Sarcasm Detection
- Authors: Tasnim Sakib Apon, Ramisa Anan, Elizabeth Antora Modhu, Arjun Suter,
Ifrit Jamal Sneha, MD. Golam Rabiul Alam
- Abstract summary: Sarcasm is a positive statement or remark with an underlying negative motivation that is extensively employed in today's social media platforms.
There has been significant improvement in sarcasm detection in English over the past several years; however, the situation regarding Bangla sarcasm detection remains unchanged.
This article proposes BanglaSarc, a dataset constructed specifically for sarcasm detection in Bangla textual data.
- Score: 0.3914676152740142
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bangla is one of the most widely spoken languages in the world, and
its use has been increasing on social media as well. Sarcasm is a positive
statement or remark with an underlying negative motivation that is extensively
employed in today's social media platforms. There has been significant
improvement in sarcasm detection in English over the past several years;
however, the situation regarding Bangla sarcasm detection remains unchanged. As
a result, it is still difficult to identify sarcasm in Bangla, and a lack of
high-quality data is a major contributing factor. This article proposes
BanglaSarc, a dataset constructed specifically for sarcasm detection in Bangla
textual data. The dataset consists of 5112 comments, statuses, and other
content collected from various online social platforms such as Facebook and
YouTube, along with a few online blogs. Because little categorized comment data
has been collected in Bengali, this dataset will aid the study of identifying
sarcasm, recognizing people's emotions, detecting various types of Bengali
expressions, and other domains. The dataset is publicly available at
https://www.kaggle.com/datasets/sakibapon/banglasarc.
Related papers
- KoCoSa: Korean Context-aware Sarcasm Detection Dataset [3.369750569233713]
Sarcasm is a way of verbal irony where someone says the opposite of what they mean, often to ridicule a person, situation, or idea.
In this paper, we introduce a new dataset for the Korean dialogue sarcasm detection task, KoCoSa.
The dataset consists of 12.8K daily Korean dialogues and the labels for this task on the last response.
arXiv Detail & Related papers (2024-02-22T10:17:57Z)
- Sarcasm Detection in a Disaster Context [103.93691731605163]
We introduce HurricaneSARC, a dataset of 15,000 tweets annotated for intended sarcasm.
Our best model is able to obtain as much as 0.70 F1 on our dataset.
arXiv Detail & Related papers (2023-08-16T05:58:12Z)
- Researchers eye-view of sarcasm detection in social media textual content [0.0]
Enormous use of sarcastic text in all forms of communication in social media will have a physiological effect on target users.
This paper discusses various sarcasm detection techniques and concludes with some approaches, related datasets with optimal features.
arXiv Detail & Related papers (2023-04-17T19:45:10Z)
- Interpretable Bangla Sarcasm Detection using BERT and Explainable AI [0.3914676152740142]
A BERT-based system can achieve 99.60% while the utilized traditional machine learning algorithms are only capable of achieving 89.93%.
This dataset consists of fresh records of sarcastic and non-sarcastic comments, the majority of which are acquired from Facebook and YouTube comment sections.
arXiv Detail & Related papers (2023-03-22T17:35:35Z)
- SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization [129.1927527781751]
We present SODA, the first publicly available, million-scale high-quality social dialogue dataset.
By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions.
Human evaluation shows that conversations in SODA are more consistent, specific, and (surprisingly) natural than those in prior human-authored datasets.
arXiv Detail & Related papers (2022-12-20T17:38:47Z)
- Sarcasm Detection Framework Using Emotion and Sentiment Features [62.997667081978825]
We propose a model which incorporates emotion and sentiment features to capture the incongruity intrinsic to sarcasm.
Our approach achieved state-of-the-art results on four datasets from social networking platforms and online media.
arXiv Detail & Related papers (2022-11-23T15:14:44Z)
- How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation [62.89586083449108]
We study a new problem of cross-modal sarcasm generation (CMSG), i.e., generating a sarcastic description for a given image.
CMSG is challenging as models need to satisfy the characteristics of sarcasm, as well as the correlation between different modalities.
We propose an Extraction-Generation-Ranking based Modular method (EGRM) for cross-modal sarcasm generation.
arXiv Detail & Related papers (2022-11-20T14:38:24Z)
- Sarcasm Detection and Quantification in Arabic Tweets [7.173484352846755]
This paper intends to create a new humanly annotated Arabic corpus for sarcasm detection collected from tweets.
The proposed approach tackles the problem as a regression problem instead of classification.
arXiv Detail & Related papers (2021-08-03T11:48:27Z)
- Bangla Text Dataset and Exploratory Analysis for Online Harassment Detection [0.0]
The data made accessible in this article has been gathered and labeled from people's comments on public Facebook posts by celebrities, government officials, and athletes.
The dataset is compiled with the aim of developing the ability of machines to differentiate whether a comment is a bully expression or not.
arXiv Detail & Related papers (2021-02-04T08:35:18Z)
- Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content.
The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory, and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z)
- $R^3$: Reverse, Retrieve, and Rank for Sarcasm Generation with Commonsense Knowledge [51.70688120849654]
We propose an unsupervised approach for sarcasm generation based on a non-sarcastic input sentence.
Our method employs a retrieve-and-edit framework to instantiate two major characteristics of sarcasm.
arXiv Detail & Related papers (2020-04-28T02:30:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.