Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with
Wider Topic Analysis
- URL: http://arxiv.org/abs/2403.01921v1
- Date: Mon, 4 Mar 2024 10:37:48 GMT
- Title: Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with
Wider Topic Analysis
- Authors: Latifah Almurqren, Ryan Hodgson, Alexandra Cristea
- Abstract summary: The in-depth study manually analyses 133 ASA papers published in the English language between 2002 and 2020.
The main findings show the different approaches used for ASA: machine learning, lexicon-based and hybrid approaches.
There is a need to develop ASA tools that can be used in industry, as well as in academia, for Arabic text SA.
- Score: 49.1574468325115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentiment analysis (SA) has been, and is still, a thriving research area.
However, the task of Arabic sentiment analysis (ASA) is still underrepresented
in the body of research. This study offers the first in-depth and in-breadth
analysis of existing ASA studies of textual content and identifies their common
themes, domains of application, methods, approaches, technologies and
algorithms used. The in-depth study manually analyses 133 ASA papers published
in the English language between 2002 and 2020 from four academic databases
(SAGE, IEEE, Springer, WILEY) and from Google Scholar. The in-breadth study
uses modern, automatic machine learning techniques, such as topic modelling and
temporal analysis, on Open Access resources, to reinforce themes and trends
identified by the prior study, on 2297 ASA publications between 2010 and 2020. The
main findings show the different approaches used for ASA: machine learning,
lexicon-based and hybrid approaches. Other findings include ASA 'winning'
algorithms (SVM, NB, hybrid methods). Deep learning methods, such as LSTM, can
provide higher accuracy, but for ASA sometimes the corpora are not large enough
to support them. Additionally, whilst there are some ASA corpora and lexicons,
more are required. Specifically, Arabic tweets corpora and datasets are
currently only moderately sized. Moreover, Arabic lexicons that have high
coverage contain only Modern Standard Arabic (MSA) words, and those with Arabic
dialects are quite small. Thus, new corpora need to be created. On the other
hand, ASA tools are severely lacking. There is a need to develop ASA tools
that can be used in industry, as well as in academia, for Arabic text SA.
Hence, our study offers insights into the challenges associated with ASA
research, such as the lack of Dialectal Arabic resources, Arabic tweet corpora
and datasets for SA, and provides suggestions for ways to move the field
forward.
Related papers
- ROAST: Review-level Opinion Aspect Sentiment Target Joint Detection for ABSA [50.90538760832107]
This research presents a novel task, Review-Level Opinion Aspect Sentiment Target (ROAST).
ROAST seeks to close the gap between sentence-level and text-level ABSA by identifying every ABSA constituent at the review level.
We extend the available datasets to enable ROAST, addressing the drawbacks noted in previous research.
arXiv Detail & Related papers (2024-05-30T17:29:15Z)
- Arabic Sentiment Analysis with Noisy Deep Explainable Model [48.22321420680046]
This paper proposes an explainable sentiment classification framework for the Arabic language.
The proposed framework can explain specific predictions by training a local surrogate explainable model.
We carried out experiments on public benchmark Arabic SA datasets.
arXiv Detail & Related papers (2023-09-24T19:26:53Z)
- GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training
Data Exploration [97.68234051078997]
We discuss how Pyserini can be integrated with the Hugging Face ecosystem of open-source AI libraries and artifacts.
We include a Jupyter Notebook-based walkthrough of the core interoperability features, available on GitHub.
We present GAIA Search - a search engine built following previously laid out principles, giving access to four popular large-scale text collections.
arXiv Detail & Related papers (2023-06-02T12:09:59Z)
- Exploring Sentiment Analysis Techniques in Natural Language Processing:
A Comprehensive Review [0.15229257192293202]
Sentiment analysis (SA) is the automated process of detecting and understanding the emotions conveyed through written text.
SA has gained significant popularity in the field of Natural Language Processing (NLP).
This study aims to enhance the efficiency and accuracy of SA processes, leading to smoother and error-free outcomes.
arXiv Detail & Related papers (2023-05-24T07:48:41Z)
- H-AES: Towards Automated Essay Scoring for Hindi [33.755800922763946]
We reproduce and compare state-of-the-art methods for Automated Essay Scoring (AES) in the Hindi domain.
We employ classical feature-based Machine Learning (ML) and advanced end-to-end models, including LSTM Networks and Fine-Tuned Transformer Architecture.
We train and evaluate our models using translated English essays and empirically measure their performance on our own small-scale, real-world Hindi corpus.
arXiv Detail & Related papers (2023-02-28T15:14:15Z)
- Survey of Aspect-based Sentiment Analysis Datasets [55.61047894397937]
Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews.
Numerous yet scattered corpora for ABSA make it difficult for researchers to identify corpora best suited for a specific ABSA subtask quickly.
This study aims to present a database of corpora that can be used to train and assess autonomous ABSA systems.
arXiv Detail & Related papers (2022-04-11T16:23:36Z)
- MACRONYM: A Large-Scale Dataset for Multilingual and Multi-Domain
Acronym Extraction [66.60031336330547]
Acronyms and their expanded forms are necessary for various NLP applications.
One limitation of existing AE research is that it is limited to the English language and certain domains.
The lack of annotated datasets in multiple languages and domains has been a major issue hindering research in this area.
arXiv Detail & Related papers (2022-02-19T23:08:38Z)
- Pre-trained Transformer-Based Approach for Arabic Question Answering : A
Comparative Study [0.5801044612920815]
We evaluate state-of-the-art pre-trained transformer models for Arabic QA using four reading comprehension datasets.
We fine-tuned and compared the performance of the AraBERTv2-base model, AraBERTv0.2-large model, and AraELECTRA model.
arXiv Detail & Related papers (2021-11-10T12:33:18Z)
- Sentiment Analysis in Poems in Misurata Sub-dialect -- A Sentiment
Detection in an Arabic Sub-dialect [0.0]
This study focuses on detecting sentiment in poems written in the Misurata Arabic sub-dialect spoken in Libya.
The tools used to detect sentiment in the dataset are Sklearn and the Mazajak sentiment tool.
arXiv Detail & Related papers (2021-09-15T10:42:39Z)
- Arabic aspect based sentiment analysis using BERT [0.0]
This article explores the modeling capabilities of contextual embeddings from pre-trained language models, such as BERT.
We are building a simple but effective BERT-based neural baseline to handle this task.
Our BERT architecture with a simple linear classification layer surpassed the state-of-the-art works, according to the experimental results.
arXiv Detail & Related papers (2021-07-28T11:34:00Z)