LLM Based Sentiment Classification From Bangladesh E-Commerce Reviews
- URL: http://arxiv.org/abs/2510.01276v1
- Date: Tue, 30 Sep 2025 16:46:09 GMT
- Title: LLM Based Sentiment Classification From Bangladesh E-Commerce Reviews
- Authors: Sumaiya Tabassum
- Abstract summary: The viability of using transformer-based BERT models for sentiment analysis of Bangladesh e-commerce reviews is investigated in this paper. A subset of 4,000 samples from the original dataset of Bangla and English customer reviews was used to fine-tune the models. The fine-tuned Llama-3.1-8B model outperformed the other fine-tuned models, with an overall accuracy, precision, recall, and F1 score of 95.5%, 93%, 88%, and 90%, respectively.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentiment analysis is an essential part of text analysis, a larger field that includes determining and evaluating the author's emotional state. This method is important because it makes it easier to comprehend consumers' feelings, viewpoints, and preferences holistically. The introduction of large language models (LLMs), such as Llama, has greatly increased the availability of cutting-edge model applications, including sentiment analysis. However, accurate sentiment analysis is hampered by the intricacy of written language and the diversity of languages used in reviews. This paper investigates the viability of using transformer-based BERT models and other LLMs for sentiment analysis of Bangladesh e-commerce reviews. A subset of 4,000 samples from the original dataset of Bangla and English customer reviews was used to fine-tune the models. The fine-tuned Llama-3.1-8B model outperformed the other fine-tuned models, including Phi-3.5-mini-instruct, Mistral-7B-v0.1, DistilBERT-multilingual, mBERT, and XLM-R-base, with an overall accuracy, precision, recall, and F1 score of 95.5%, 93%, 88%, and 90%, respectively. The study emphasizes how parameter-efficient fine-tuning (PEFT) methods such as LoRA can lower computational overhead, making the approach appropriate for resource-limited contexts. The results show how LLMs can
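The abstract's point about LoRA lowering computational overhead rests on a simple idea: rather than updating a full weight matrix W during fine-tuning, LoRA trains a low-rank update delta_W = (alpha / r) * B A, so only the small matrices A and B carry trainable parameters. The sketch below illustrates that forward pass in plain Python; the dimensions, scaling, and variable names are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the LoRA forward pass: h = W x + (alpha / r) * B (A x).
# W is the frozen base weight; A (r x d_in) and B (d_out x r) are the only
# trainable matrices, so the number of trainable parameters scales with the
# rank r rather than with d_in * d_out.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha=16, r=2):
    """Base projection plus the scaled low-rank correction."""
    base = matvec(W, x)                       # frozen path: W x
    low_rank = matvec(B, matvec(A, x))        # adapter path: B (A x)
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

if __name__ == "__main__":
    # Toy frozen 2x3 base weight and a rank-1 adapter (r = 1 for simplicity).
    W = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
    A = [[1.0, 1.0, 1.0]]    # r x d_in  = 1 x 3
    B = [[0.5], [0.25]]      # d_out x r = 2 x 1
    h = lora_forward(W, A, B, [1.0, 2.0, 3.0], alpha=1, r=1)
    print(h)  # base [1.0, 2.0] plus scaled correction [3.0, 1.5]
```

In practice this would be done through a library such as Hugging Face PEFT rather than by hand; the sketch only shows why the trainable-parameter count, and hence the fine-tuning memory footprint, stays small.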
Related papers
- RefineBench: Evaluating Refinement Capability of Language Models via Checklists [71.02281792867531]
We evaluate two refinement modes: guided refinement and self-refinement. In guided refinement, both proprietary LMs and large open-weight LMs can leverage targeted feedback to refine responses to near-perfect levels within five turns. These findings suggest that frontier LMs require breakthroughs to self-refine their incorrect responses.
arXiv Detail & Related papers (2025-11-27T07:20:52Z) - HausaMovieReview: A Benchmark Dataset for Sentiment Analysis in Low-Resource African Language [1.3465808629549525]
This paper introduces a novel benchmark dataset comprising 5,000 YouTube comments in Hausa and code-switched English. We use this dataset to conduct a comparative analysis of classical models and fine-tuned transformer models. Our results reveal a key finding: the Decision Tree classifier, with an accuracy and F1-score of 89.72% and 89.60% respectively, significantly outperformed the deep learning models.
arXiv Detail & Related papers (2025-09-17T22:57:21Z) - Reference Points in LLM Sentiment Analysis: The Role of Structured Context [0.0]
This study investigates how supplementary information affects sentiment analysis using large language models (LLMs). We show that structured prompting can enable smaller models to achieve competitive performance.
arXiv Detail & Related papers (2025-08-15T13:04:32Z) - ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models [75.05436691700572]
We introduce ExpliCa, a new dataset for evaluating Large Language Models (LLMs) in explicit causal reasoning. We tested seven commercial and open-source LLMs on ExpliCa through prompting and perplexity-based metrics. Surprisingly, models tend to confound temporal relations with causal ones, and their performance is also strongly influenced by the linguistic order of the events.
arXiv Detail & Related papers (2025-02-21T14:23:14Z) - Enhancing Sentiment Analysis in Bengali Texts: A Hybrid Approach Using Lexicon-Based Algorithm and Pretrained Language Model Bangla-BERT [1.5020330976600738]
We develop a novel approach that integrates rule-based algorithms with pre-trained language models. We propose a novel rule-based algorithm, Bangla Sentiment Polarity Score (BSPS), capable of generating sentiment scores and classifying reviews into nine distinct sentiment categories. Our analysis revealed that the BSPS + BanglaBERT hybrid approach outperformed the standalone BanglaBERT model, achieving higher accuracy, precision, and more nuanced classification across the nine sentiment categories.
arXiv Detail & Related papers (2024-11-29T09:57:11Z) - CLAIR-A: Leveraging Large Language Models to Judge Audio Captions [73.51087998971418]
Evaluating machine-generated audio captions is a complex task that requires considering diverse factors. We propose CLAIR-A, a simple and flexible method that leverages the zero-shot capabilities of large language models. In our evaluations, CLAIR-A better predicts human judgements of quality compared to traditional metrics.
arXiv Detail & Related papers (2024-09-19T17:59:52Z) - DataComp-LM: In search of the next generation of training sets for language models [200.5293181577585]
DataComp for Language Models (DCLM) is a testbed for controlled dataset experiments with the goal of improving language models. We provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with data curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters.
arXiv Detail & Related papers (2024-06-17T17:42:57Z) - Zero- and Few-Shot Prompting with LLMs: A Comparative Study with Fine-tuned Models for Bangla Sentiment Analysis [6.471458199049549]
In this study, we present a sizeable manually annotated dataset encompassing 33,606 Bangla news tweets and Facebook comments.
We also investigate zero- and few-shot in-context learning with several language models, including Flan-T5, GPT-4, and Bloomz.
Our findings suggest that monolingual transformer-based models consistently outperform other models, even in zero and few-shot scenarios.
arXiv Detail & Related papers (2023-08-21T15:19:10Z) - BanglaBook: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews [1.869097450593631]
We present a large-scale dataset of Bangla book reviews consisting of 158,065 samples classified into three broad categories: positive, negative, and neutral.
We employ a range of machine learning models to establish baselines including SVM, LSTM, and Bangla-BERT.
Our findings demonstrate a substantial performance advantage of pre-trained models over models that rely on manually crafted features.
arXiv Detail & Related papers (2023-05-11T06:27:38Z) - Training Language Models with Language Feedback at Scale [50.70091340506957]
We introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback.
ILF consists of three steps that are applied iteratively: first, conditioning the language model on the input, an initial LM output, and feedback to generate refinements.
We show theoretically that ILF can be viewed as Bayesian Inference, similar to Reinforcement Learning from Human Feedback.
arXiv Detail & Related papers (2023-03-28T17:04:15Z) - Holistic Evaluation of Language Models [183.94891340168175]
Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood.
We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models.
arXiv Detail & Related papers (2022-11-16T18:51:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.