Related papers: Dynamic Sentiment Analysis with Local Large Language Models using Majority Voting: A Study on Factors Affecting Restaurant Evaluation

URL: http://arxiv.org/abs/2407.13069v1
Date: Thu, 18 Jul 2024 00:28:04 GMT
Title: Dynamic Sentiment Analysis with Local Large Language Models using Majority Voting: A Study on Factors Affecting Restaurant Evaluation
Authors: Junichiro Niimi
Abstract summary: This study introduces a majority voting mechanism to a sentiment analysis model using local language models. Through a series of three analyses of online reviews on restaurant evaluations, we demonstrate that majority voting with multiple attempts produces more robust results than using a large model with a single attempt.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: User-generated content (UGC) on online platforms allows marketing researchers to understand consumer preferences for products and services. With the advance of large language models (LLMs), some studies have used these models for annotation and sentiment analysis. However, the relationship between accuracy and the hyper-parameters of LLMs has yet to be thoroughly examined. In addition, the variability and reproducibility of results across LLM trials have rarely been considered in the existing literature. Since human annotation uses majority voting to resolve disagreements among annotators, this study introduces a majority voting mechanism to a sentiment analysis model using local LLMs. Through a series of three analyses of online reviews on restaurant evaluations, we demonstrate that majority voting with multiple attempts using a medium-sized model produces more robust results than using a large model with a single attempt. Furthermore, we conduct an additional analysis to investigate the effect of each aspect on the overall evaluation.
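The majority voting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the default of five attempts, the string labels, and the tie-breaking rule (the label seen first wins) are all assumptions, and the `classify` callable is a hypothetical stand-in for a call to a local LLM.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label; ties go to the label seen first (an assumption)."""
    if not labels:
        raise ValueError("need at least one label to vote on")
    # Counter.most_common orders equal counts by first occurrence.
    return Counter(labels).most_common(1)[0][0]


def classify_with_voting(
    review: str,
    classify: Callable[[str], Hashable],
    n_attempts: int = 5,  # hypothetical default, not taken from the paper
) -> Hashable:
    """Call a possibly non-deterministic classifier n_attempts times and vote."""
    votes = [classify(review) for _ in range(n_attempts)]
    return majority_vote(votes)


if __name__ == "__main__":
    # Hypothetical stand-in for a local LLM: replays a fixed sequence of noisy outputs.
    canned = iter(["positive", "negative", "positive", "positive", "neutral"])
    print(classify_with_voting("The ramen was excellent.", lambda _: next(canned)))
```

Passing the classifier in as a callable keeps the voting logic independent of any particular model or inference library, so the same sketch applies whether the votes come from repeated sampling of one model or from several different models.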

Related papers

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy [9.285203198113917]
We introduce CLEAR, an interactive, open-source package for LLM-based error analysis. CLEAR first generates per-instance textual feedback, then creates a set of system-level error issues, and quantifies the prevalence of each identified issue. Our package also provides users with an interactive dashboard that allows for a comprehensive error analysis through aggregate visualizations.
arXiv Detail & Related papers (2025-07-24T13:15:21Z)
A Simple Ensemble Strategy for LLM Inference: Towards More Stable Text Classification [0.0]
This study introduces a straightforward ensemble strategy to sentiment analysis using large language models (LLMs). As a result, we demonstrate that an ensemble of multiple inferences using medium-sized LLMs produces more robust and accurate results than using a large model with a single attempt, reducing RMSE by 18.6%.
arXiv Detail & Related papers (2025-04-26T10:10:26Z)
Enhancing LLM Evaluations: The Garbling Trick [0.0]
We propose a method to transform existing large language model (LLM) evaluations into a series of progressively more difficult tasks. These enhanced evaluations emphasize reasoning capabilities and can reveal relative performance differences that are not apparent in the original assessments. Our results offer insights into the comparative abilities of these models, particularly highlighting the differences between base LLMs and more recent "reasoning" models.
arXiv Detail & Related papers (2024-11-03T11:39:50Z)
Diverging Preferences: When do Annotators Disagree and do Models Know? [92.24651142187989]
We develop a taxonomy of disagreement sources spanning 10 categories across four high-level classes. We find that the majority of disagreements are in opposition with standard reward modeling approaches. We develop methods for identifying diverging preferences to mitigate their influence on evaluation and training.
arXiv Detail & Related papers (2024-10-18T17:32:22Z)
Utilizing Large Language Models for Event Deconstruction to Enhance Multimodal Aspect-Based Sentiment Analysis [2.1329326061804816]
This paper introduces Large Language Models (LLMs) for event decomposition and proposes a reinforcement learning framework for Multimodal Aspect-based Sentiment Analysis (MABSA-RL). Experimental results show that MABSA-RL outperforms existing advanced methods on two benchmark datasets.
arXiv Detail & Related papers (2024-10-18T03:40:45Z)
Adversarial Multi-Agent Evaluation of Large Language Models through Iterative Debates [0.0]
We propose a framework that interprets large language models (LLMs) as advocates within an ensemble of interacting agents. This approach offers a more dynamic and comprehensive evaluation process compared to traditional human-based assessments or automated metrics.
arXiv Detail & Related papers (2024-10-07T00:22:07Z)
Examining Independence in Ensemble Sentiment Analysis: A Study on the Limits of Large Language Models Using the Condorcet Jury Theorem [0.0]
This paper explores the application of the Condorcet Jury theorem to the domain of sentiment analysis. Our empirical study tests this theoretical framework by implementing a majority vote mechanism across different models. Contrary to expectations, the results reveal only marginal improvements in performance when incorporating larger models.
arXiv Detail & Related papers (2024-08-26T14:04:00Z)
DnA-Eval: Enhancing Large Language Model Evaluation through Decomposition and Aggregation [75.81096662788254]
Large Language Models (LLMs) are scalable and economical evaluators. The question of how reliable these evaluators are has emerged as a crucial research question. We propose Decompose and Aggregate, which breaks down the evaluation process into different stages based on pedagogical practices.
arXiv Detail & Related papers (2024-05-24T08:12:30Z)
The Effectiveness of LLMs as Annotators: A Comparative Overview and Empirical Analysis of Direct Representation [5.249002650134171]
Large Language Models (LLMs) have emerged as powerful support tools across various natural language tasks and a range of application domains. This paper provides a comparative overview of twelve studies investigating the potential of LLMs in labelling data. While the models demonstrate promising cost and time-saving benefits, there exist considerable limitations, such as representativeness, bias, sensitivity to prompt variations and English language preference.
arXiv Detail & Related papers (2024-05-02T14:00:22Z)
Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks. In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention. We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types. These benchmarks allow us to isolate the ability of LLMs to accurately predict changes resulting from their ability to memorize facts or find other shortcuts.
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics [51.17512229589]
PoLLMgraph is a model-based white-box detection and forecasting approach for large language models. We show that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.
arXiv Detail & Related papers (2024-04-06T20:02:20Z)
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models [31.426274932333264]
We present LLM Comparator, a novel visual analytics tool for interactively analyzing results from automatic side-by-side evaluation. The tool supports interactive workflows for users to understand when and why a model performs better or worse than a baseline model.
arXiv Detail & Related papers (2024-02-16T09:14:49Z)
On Diversified Preferences of Large Language Model Alignment [51.26149027399505]
This paper presents the first quantitative analysis of the experimental scaling law for reward models with varying sizes. Our analysis reveals that the impact of diversified human preferences depends on both model size and data size. Larger models with sufficient capacity mitigate the negative effects of diverse preferences, while smaller models struggle to accommodate them.
arXiv Detail & Related papers (2023-12-12T16:17:15Z)
BLESS: Benchmarking Large Language Models on Sentence Simplification [55.461555829492866]
We present BLESS, a performance benchmark of the most recent state-of-the-art large language models (LLMs) on the task of text simplification (TS). We assess a total of 44 models, differing in size, architecture, pre-training methods, and accessibility, on three test sets from different domains (Wikipedia, news, and medical) under a few-shot setting. Our evaluation indicates that the best LLMs, despite not being trained on TS, perform comparably with state-of-the-art TS baselines.
arXiv Detail & Related papers (2023-10-24T12:18:17Z)
Sentiment Analysis in the Era of Large Language Models: A Reality Check [69.97942065617664]
This paper investigates the capabilities of large language models (LLMs) in performing various sentiment analysis tasks. We evaluate performance across 13 tasks on 26 datasets and compare the results against small language models (SLMs) trained on domain-specific datasets.
arXiv Detail & Related papers (2023-05-24T10:45:25Z)
Sentiment Analysis Based on Deep Learning: A Comparative Study [69.09570726777817]
The study of public opinion can provide us with valuable information. The efficiency and accuracy of sentiment analysis is being hindered by the challenges encountered in natural language processing. This paper reviews the latest studies that have employed deep learning to solve sentiment analysis problems.
arXiv Detail & Related papers (2020-06-05T16:28:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.