ArmanEmo: A Persian Dataset for Text-based Emotion Detection
- URL: http://arxiv.org/abs/2207.11808v1
- Date: Sun, 24 Jul 2022 20:35:23 GMT
- Authors: Hossein Mirzaee (1), Javad Peymanfard (2), Hamid Habibzadeh Moshtaghin
(3), Hossein Zeinali (1) ((1) Amirkabir University of Technology, (2) Iran
University of Science and Technology, (3) Allameh Tabataba'i University)
- Abstract summary: ArmanEmo is a human-labeled dataset of more than 7000 Persian sentences labeled for seven categories.
Labels are based on Ekman's six basic emotions.
Our best model achieves a macro-averaged F1 score of 75.39 percent across our test dataset.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the recent proliferation of open textual data on social media platforms,
Emotion Detection (ED) from text has received increasing attention in recent
years. It has many applications, especially for businesses and online service
providers, where emotion detection techniques can help them make informed
commercial decisions by analyzing customers/users' feelings towards their
products and services. In this study, we introduce ArmanEmo, a human-labeled
emotion dataset of more than 7000 Persian sentences labeled for seven
categories. The dataset has been collected from different resources, including
Twitter, Instagram, and Digikala (an Iranian e-commerce company) comments.
Labels are based on Ekman's six basic emotions (Anger, Fear, Happiness, Hatred,
Sadness, Wonder) and another category (Other) to consider any other emotion not
included in Ekman's model. Along with the dataset, we have provided several
baseline models for emotion classification focusing on the state-of-the-art
transformer-based language models. Our best model achieves a macro-averaged F1
score of 75.39 percent across our test dataset. We also conduct
transfer learning experiments to compare our proposed dataset's generalization
against other Persian emotion datasets. Results of these experiments suggest
that our dataset has superior generalizability among the existing Persian
emotion datasets. ArmanEmo is publicly available for non-commercial use at
https://github.com/Arman-Rayan-Sharif/arman-text-emotion.
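The macro-averaged F1 reported above is the unweighted mean of per-class F1 scores, so rare emotion classes count as much as frequent ones. A minimal sketch of the metric over the paper's seven label set (toy labels for illustration, not ArmanEmo data):

```python
def macro_f1(y_true, y_pred, labels):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean."""
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)

labels = ["Anger", "Fear", "Happiness", "Hatred", "Sadness", "Wonder", "Other"]
y_true = ["Anger", "Fear", "Happiness", "Anger", "Other"]
y_pred = ["Anger", "Fear", "Sadness", "Anger", "Other"]
print(round(macro_f1(y_true, y_pred, labels), 4))  # 0.4286
```

Classes absent from the toy sample contribute an F1 of 0, which is why macro averaging over the full label set penalizes models that ignore under-represented emotions.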
Related papers
- Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset
Emotion and Intent Joint Understanding in Multimodal Conversation (MC-EIU) aims to decode the semantic information manifested in a multimodal conversational history.
MC-EIU is an enabling technology for many human-computer interfaces.
We propose an MC-EIU dataset, which features 7 emotion categories, 9 intent categories, 3 modalities, i.e., textual, acoustic, and visual content, and two languages, English and Mandarin.
arXiv Detail & Related papers (2024-07-03T01:56:00Z)
- Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion
We investigate how well human-annotated emotion triggers correlate with features deemed salient in their prediction of emotions.
Using EmoTrigger, we evaluate the ability of large language models to identify emotion triggers.
Our analysis reveals that emotion triggers are largely not considered salient features by emotion prediction models; instead, there is an intricate interplay between various features and the task of emotion detection.
arXiv Detail & Related papers (2023-11-16T06:20:13Z)
- WEARS: Wearable Emotion AI with Real-time Sensor data
We propose a system to predict user emotion using smartwatch sensors.
We design a framework to collect ground truth in real-time utilizing a mix of English and regional language-based videos.
We also did an ablation study to understand the impact of features including Heart Rate, Accelerometer, and Gyroscope sensor data on mood.
arXiv Detail & Related papers (2023-08-22T11:03:00Z)
- LEIA: Linguistic Embeddings for the Identification of Affect
We present LEIA, a model for emotion identification in text that has been trained on a dataset of more than 6 million posts.
LEIA is based on a word masking method that enhances the learning of emotion words during model pre-training.
Our results show that LEIA generalizes its classification of anger, happiness, and sadness beyond the domain it was trained on.
arXiv Detail & Related papers (2023-04-21T14:17:10Z)
- Persian Emotion Detection using ParsBERT and Imbalanced Data Handling Approaches
EmoPars and ArmanEmo are two new human-labeled emotion datasets for the Persian language.
We evaluate EmoPars and compare it with ArmanEmo.
Our model reaches a Macro-averaged F1-score of 0.81 and 0.76 on ArmanEmo and EmoPars, respectively.
arXiv Detail & Related papers (2022-11-15T10:22:49Z)
- MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild
We propose MAFW, a large-scale compound affective database with 10,045 video-audio clips in the wild.
Each clip is annotated with a compound emotional category and a couple of sentences that describe the subjects' affective behaviors in the clip.
For the compound emotion annotation, each clip is categorized into one or more of the 11 widely-used emotions, i.e., anger, disgust, fear, happiness, neutral, sadness, surprise, contempt, anxiety, helplessness, and disappointment.
arXiv Detail & Related papers (2022-08-01T13:34:33Z)
- It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection
ArtEmis was recently introduced as a large-scale dataset of emotional reactions to images along with language explanations.
We observed a significant emotional bias towards instance-rich emotions, making trained neural speakers less accurate in describing under-represented emotions.
We propose a contrastive data collection approach to balance ArtEmis with a new complementary dataset.
arXiv Detail & Related papers (2022-04-15T22:08:45Z)
- Chat-Capsule: A Hierarchical Capsule for Dialog-level Emotion Analysis
We propose a Context-based Hierarchical Attention Capsule(Chat-Capsule) model, which models both utterance-level and dialog-level emotions and their interrelations.
On a dialog dataset collected from customer support of an e-commerce platform, our model is also able to predict user satisfaction and emotion curve category.
arXiv Detail & Related papers (2022-03-23T08:04:30Z)
- Affective Image Content Analysis: Two Decades Review and New Perspectives
We will comprehensively review the development of affective image content analysis (AICA) in the recent two decades.
We will focus on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence.
We discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.
arXiv Detail & Related papers (2021-06-30T15:20:56Z)
- A Circular-Structured Representation for Visual Emotion Distribution Learning
We propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning.
To be specific, we first construct an Emotion Circle to unify any emotional state within it.
On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes.
arXiv Detail & Related papers (2021-06-23T14:53:27Z)
- GoEmotions: A Dataset of Fine-Grained Emotions
We introduce GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral.
Our BERT-based model achieves an average F1-score of 0.46 across our proposed taxonomy, leaving much room for improvement.
arXiv Detail & Related papers (2020-05-01T18:00:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.