Negation Handling in Machine Learning-Based Sentiment Classification for
Colloquial Arabic
- URL: http://arxiv.org/abs/2107.11597v1
- Date: Sat, 24 Jul 2021 13:12:37 GMT
- Title: Negation Handling in Machine Learning-Based Sentiment Classification for
Colloquial Arabic
- Authors: Omar Al-Harbi
- Abstract summary: The role of negation in Arabic sentiment analysis has been explored only to a limited extent, especially for colloquial Arabic.
We propose a simple rule-based algorithm for handling the problem; the rules were crafted based on observing many cases of negation.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: One crucial aspect of sentiment analysis is negation handling, where the
occurrence of negation can flip the sentiment of a sentence and negatively
affect machine learning-based sentiment classification. The role of
negation in Arabic sentiment analysis has been explored only to a limited
extent, especially for colloquial Arabic. In this paper, the author addresses
the negation problem of machine learning-based sentiment classification for a
colloquial Arabic language. To this end, we propose a simple rule-based
algorithm for handling the problem; the rules were crafted based on observing
many cases of negation. Additionally, simple linguistic knowledge and a sentiment
lexicon are used for this purpose. The author also examines the impact of the
proposed algorithm on the performance of different machine learning algorithms.
The results given by the proposed algorithm are compared with three baseline
models. The experimental results show that there is a positive impact on the
classifiers' accuracy, precision, and recall when the proposed algorithm is used
compared to the baselines.
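The following is a minimal, illustrative sketch of the kind of rule-based negation handling the abstract describes. The negation particles, the toy sentiment lexicon, the fixed two-token scope, and the NOT_ token marking are assumptions made here for illustration; they are not the paper's actual rules or resources.

```python
# Hypothetical sketch of rule-based negation handling for colloquial Arabic.
# The particle list and lexicon below are illustrative placeholders only.

NEGATION_PARTICLES = {"لا", "ما", "مش", "مو", "لم", "لن", "ليس"}  # assumed list

SENTIMENT_LEXICON = {  # assumed toy lexicon: word -> polarity score
    "حلو": 1.0,    # "nice"
    "ممتاز": 1.0,  # "excellent"
    "سيء": -1.0,   # "bad"
}

def apply_negation_rules(tokens, scope=2):
    """Flip the polarity of lexicon words that fall within a fixed window
    after a negation particle, and mark them so a machine learning
    classifier can treat negated and non-negated tokens differently."""
    features = []
    remaining = 0
    for token in tokens:
        if token in NEGATION_PARTICLES:
            remaining = scope              # open a negation scope
            continue
        score = SENTIMENT_LEXICON.get(token, 0.0)
        if remaining > 0:
            score = -score                 # rule: flip polarity inside the scope
            token = "NOT_" + token         # expose negation to the classifier
            remaining -= 1
        features.append((token, score))
    return features

# Example: "الفيلم مش حلو" ("the movie is not nice")
print(apply_negation_rules(["الفيلم", "مش", "حلو"]))
# -> [('الفيلم', 0.0), ('NOT_حلو', -1.0)]
```

Prefixing negated tokens (here with NOT_) is one common way to surface negation to bag-of-words classifiers such as those evaluated in the paper; the paper's own rules are richer and were crafted from observed colloquial cases.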
Related papers
- On the Proper Treatment of Tokenization in Psycholinguistics [53.960910019072436]
The paper argues that token-level language models should be marginalized into character-level language models before they are used in psycholinguistic studies.
We find various focal areas whose surprisal is a better psychometric predictor than the surprisal of the region of interest itself.
arXiv Detail & Related papers (2024-10-03T17:18:03Z)
- Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets [0.27309692684728604]
We propose a novel approach that leverages ensemble learning and semi-supervised learning based on previously manually labeled data.
We conducted experiments on a benchmark dataset by classifying Arabic tweets into 5 distinct classes: non-hate, general hate, racial, religious, or sexism.
arXiv Detail & Related papers (2024-07-02T17:26:26Z)
- Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs)
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z)
- Analyzing Cognitive Plausibility of Subword Tokenization [9.510439539246846]
Subword tokenization has become the de-facto standard for tokenization.
We present a new evaluation paradigm that focuses on the cognitive plausibility of subword tokenization.
arXiv Detail & Related papers (2023-10-20T08:25:37Z)
- Arabic Sentiment Analysis with Noisy Deep Explainable Model [48.22321420680046]
This paper proposes an explainable sentiment classification framework for the Arabic language.
The proposed framework can explain specific predictions by training a local surrogate explainable model.
We carried out experiments on public benchmark Arabic SA datasets.
arXiv Detail & Related papers (2023-09-24T19:26:53Z)
- Understanding and Mitigating Spurious Correlations in Text Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language) to mitigate spurious correlations in text classification.
arXiv Detail & Related papers (2023-05-23T03:55:50Z)
- A Semantic Approach to Negation Detection and Word Disambiguation with Natural Language Processing [1.0499611180329804]
This study aims to demonstrate the methods for detecting negations in a sentence by uniquely evaluating the lexical structure of the text.
The proposed method examined all the unique features of the related expressions within a text to resolve the contextual usage of the sentence.
arXiv Detail & Related papers (2023-02-05T03:58:45Z)
- Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning [103.0064298630794]
In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.
We propose a progressive self-supervised attention learning approach for attentional ABSA models.
We integrate the proposed approach into three state-of-the-art neural ABSA models.
arXiv Detail & Related papers (2021-03-05T02:50:05Z)
- Effect of Word Embedding Variable Parameters on Arabic Sentiment Analysis Performance [0.0]
Social media platforms such as Twitter and Facebook have led to a growing number of comments that contain users' opinions.
This study will discuss three parameters (Window size, Dimension of vector and Negative Sample) for Arabic sentiment analysis (see the sketch after this list).
Four binary classifiers (Logistic Regression, Decision Tree, Support Vector Machine and Naive Bayes) are used to detect sentiment.
arXiv Detail & Related papers (2021-01-08T08:31:00Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
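For the word embedding entry above, here is a minimal, hypothetical sketch (not the cited study's actual setup) of where the three discussed parameters appear in a typical gensim word2vec configuration, with the resulting tweet vectors fed to one of the four listed classifiers; the toy corpus, labels, parameter values, and vector-averaging scheme are assumptions made for illustration.

```python
# Hypothetical illustration only: the corpus, labels, parameter values, and
# vector-averaging step are placeholders, not the cited study's settings.
import numpy as np
from gensim.models import Word2Vec                  # assumes gensim >= 4.0
from sklearn.linear_model import LogisticRegression

tokenized_tweets = [["الفيلم", "حلو"], ["الخدمة", "سيئة"]]  # toy corpus
labels = [1, 0]

w2v = Word2Vec(
    sentences=tokenized_tweets,
    vector_size=100,   # "Dimension of vector"
    window=5,          # "Window size"
    negative=5,        # "Negative Sample" count
    min_count=1,
)

def tweet_vector(tokens):
    # Average the word vectors of in-vocabulary tokens.
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

X = np.stack([tweet_vector(t) for t in tokenized_tweets])
clf = LogisticRegression().fit(X, labels)  # one of the four listed classifiers
```

Varying vector_size, window, and negative, as that study does, changes only the Word2Vec call above; the downstream classifier is left unchanged.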