Persian Emotion Detection using ParsBERT and Imbalanced Data Handling Approaches
- URL: http://arxiv.org/abs/2211.08029v2
- Date: Thu, 17 Nov 2022 12:13:11 GMT
- Title: Persian Emotion Detection using ParsBERT and Imbalanced Data Handling Approaches
- Authors: Amirhossein Abaskohi, Nazanin Sabri, Behnam Bahrak
- Abstract summary: EmoPars and ArmanEmo are two new human-labeled emotion datasets for the Persian language.
We evaluate EmoPars and compare it with ArmanEmo.
Our model reaches a macro-averaged F1-score of 0.81 on ArmanEmo and 0.76 on EmoPars.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Emotion recognition is a machine learning application that can be
performed on text, speech, or image data gathered from social media.
Detecting emotion is useful in several fields, including opinion mining.
With the spread of social media, platforms such as Twitter have become
data sources, and the informal language used on these platforms makes the
emotion detection task difficult. EmoPars and ArmanEmo are two new
human-labeled emotion datasets for the Persian language. These datasets,
especially EmoPars, suffer from class imbalance, with unequal numbers of
samples across classes. In this paper, we evaluate EmoPars and compare it with
ArmanEmo. Throughout this analysis, we use data augmentation techniques, data
re-sampling, and class weights with Transformer-based Pretrained Language
Models (PLMs) to handle the imbalance problem of these datasets. Moreover,
feature selection is used to enhance the models' performance by emphasizing the
text's specific features. In addition, we provide a new policy for selecting
data from EmoPars that keeps only the high-confidence samples; as a result, the
model does not see samples without a specific emotion during training.
Our model reaches a macro-averaged F1-score of 0.81 on ArmanEmo and 0.76 on
EmoPars, which are new state-of-the-art results on these benchmarks.
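As a rough illustration of two ideas named in the abstract, class weights and the macro-averaged F1-score, the following Python sketch uses only the standard library. It is not the authors' implementation: the abstract does not say which weighting scheme was used, so inverse-frequency weights (the same formula as scikit-learn's "balanced" heuristic) are an assumption here, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels: one majority class and two rare classes.
y_true = ["joy"] * 6 + ["sad"] * 3 + ["fear"] * 1
# A degenerate classifier that always predicts the majority class.
y_pred = ["joy"] * 10

weights = inverse_frequency_weights(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print({c: round(w, 2) for c, w in weights.items()})  # {'joy': 0.56, 'sad': 1.11, 'fear': 3.33}
print(accuracy)                  # 0.6
print(macro_f1(y_true, y_pred))  # 0.25
```

The majority-class predictor reaches 60% accuracy but only 0.25 macro-F1, which is why macro-averaging is a common choice for reporting results on imbalanced data. Weights like these are typically passed to a weighted loss so that errors on rare classes cost more.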
Related papers
- EmoBench: Evaluating the Emotional Intelligence of Large Language Models [73.60839120040887]
EmoBench is a benchmark that draws upon established psychological theories and proposes a comprehensive definition for machine Emotional Intelligence (EI).
EmoBench includes a set of 400 hand-crafted questions in English and Chinese, which are meticulously designed to require thorough reasoning and understanding.
Our findings reveal a considerable gap between the EI of existing Large Language Models and the average human, highlighting a promising direction for future research.
arXiv Detail & Related papers (2024-02-19T11:48:09Z)
- Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling [50.99252242917458]
Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting.
To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity.
Our model outperforms the baseline models in understanding and rendering emotions.
arXiv Detail & Related papers (2023-12-19T08:47:50Z)
- Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion [87.18073195745914]
We investigate how well human-annotated emotion triggers correlate with features deemed salient in their prediction of emotions.
Using EmoTrigger, we evaluate the ability of large language models to identify emotion triggers.
Our analysis reveals that emotion triggers are largely not considered salient features for emotion prediction models; instead, there is an intricate interplay between various features and the task of emotion detection.
arXiv Detail & Related papers (2023-11-16T06:20:13Z)
- Data Augmentation for Emotion Detection in Small Imbalanced Text Data [0.0]
One of the challenges is the shortage of available datasets that have been annotated with emotions.
We studied the impact of data augmentation techniques precisely when applied to small imbalanced datasets.
Our experimental results show that using the augmented data when training the classifier model leads to significant improvements.
arXiv Detail & Related papers (2023-10-25T21:29:36Z)
- Reevaluating Data Partitioning for Emotion Detection in EmoWOZ [0.0]
EmoWOZ is an extension of MultiWOZ that provides emotion labels for the dialogues.
MultiWOZ was partitioned initially for another purpose, resulting in a distributional shift when considering the new purpose of emotion recognition.
We propose a stratified sampling scheme based on emotion tags to address this issue, improve the dataset's distribution, and reduce dataset shift.
arXiv Detail & Related papers (2023-03-15T03:06:13Z)
- Emotion Detection From Tweets Using a BERT and SVM Ensemble Model [0.0]
We investigate the use of Support Vector Machine (SVM) and Bidirectional Encoder Representations from Transformers (BERT) for emotion recognition.
We propose a novel ensemble model by combining the BERT and SVM models.
Experiments show that the proposed model achieves a state-of-the-art accuracy of 0.91 on emotion recognition in tweets.
arXiv Detail & Related papers (2022-08-09T05:32:29Z)
- ArmanEmo: A Persian Dataset for Text-based Emotion Detection [0.0]
ArmanEmo is a human-labeled dataset of more than 7000 Persian sentences labeled for seven categories.
Labels are based on Ekman's six basic emotions.
Our best model achieves a macro-averaged F1 score of 75.39 percent across our test dataset.
arXiv Detail & Related papers (2022-07-24T20:35:23Z)
- DeepEmotex: Classifying Emotion in Text Messages using Deep Transfer Learning [0.0]
We propose DeepEmotex, an effective sequential transfer learning method to detect emotion in text.
We conduct an experimental study using both curated Twitter data sets and benchmark data sets.
DeepEmotex models achieve over 91% accuracy for multi-class emotion classification on the test dataset.
arXiv Detail & Related papers (2022-06-12T03:23:40Z)
- EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model [56.75775793011719]
We introduce and publicly release a Mandarin emotion speech dataset including 9,724 samples with audio files and their human-labeled emotion annotations.
Unlike those models which need additional reference audio as input, our model could predict emotion labels just from the input text and generate more expressive speech conditioned on the emotion embedding.
In the experiment phase, we first validate the effectiveness of our dataset by an emotion classification task. Then we train our model on the proposed dataset and conduct a series of subjective evaluations.
arXiv Detail & Related papers (2021-06-17T08:34:21Z)
- Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality [84.69595956853908]
We present Affect2MM, a learning method for time-series emotion prediction for multimedia content.
Our goal is to automatically capture the varying emotions depicted by characters in real-life human-centric situations and behaviors.
arXiv Detail & Related papers (2021-03-11T09:07:25Z)
- Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition [55.44502358463217]
We propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues.
Our model achieves state-of-the-art performance on most of the emotion categories.
Our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
arXiv Detail & Related papers (2020-09-21T06:10:39Z)