Advancing Sentiment Analysis: A Novel LSTM Framework with Multi-head Attention
- URL: http://arxiv.org/abs/2503.08079v1
- Date: Tue, 11 Mar 2025 06:21:49 GMT
- Title: Advancing Sentiment Analysis: A Novel LSTM Framework with Multi-head Attention
- Authors: Jingyuan Yi, Peiyang Yu, Tianyi Huang, Xiaochuan Xu
- Abstract summary: This work proposes an LSTM-based sentiment classification model with a multi-head attention mechanism and TF-IDF optimization. Experimental results on public data sets demonstrate that the new method achieves substantial improvements in the most critical metrics like accuracy, recall, and F1-score.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes an LSTM-based sentiment classification model with a multi-head attention mechanism and TF-IDF optimization. Through the integration of TF-IDF feature extraction and multi-head attention, the model significantly improves text sentiment analysis performance. Experimental results on public data sets demonstrate that the new method achieves substantial improvements in the most critical metrics, such as accuracy, recall, and F1-score, compared to baseline models. Specifically, the model achieves an accuracy of 80.28% on the test set, an improvement of about 12% over standard LSTM models. Ablation experiments also support the necessity and effectiveness of all modules, with multi-head attention contributing the most to the performance improvement. This research provides a practical approach to sentiment analysis that can be applied to public opinion monitoring, product recommendation, and similar tasks.
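The abstract describes the architecture only at a high level. The PyTorch sketch below is a minimal illustration (not the authors' released code) of the overall idea: token embeddings, a single LSTM layer, multi-head self-attention over the hidden states, mean pooling, and a sentiment classification head. All layer sizes, the pooling choice, and the way TF-IDF weights are injected (scaling token embeddings) are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of an LSTM sentiment classifier with multi-head attention.
# Hyperparameters and the TF-IDF integration strategy are assumptions.
import torch
import torch.nn as nn

class LSTMAttentionClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128,
                 num_heads=4, num_classes=2, dropout=0.3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Multi-head self-attention over the LSTM hidden states.
        self.attention = nn.MultiheadAttention(hidden_dim, num_heads,
                                               batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids, tfidf_weights=None):
        # token_ids: (batch, seq_len); tfidf_weights: (batch, seq_len)
        x = self.embedding(token_ids)
        if tfidf_weights is not None:
            # One plausible way to integrate TF-IDF features: scale each
            # token embedding by its TF-IDF weight (an assumption; the paper
            # only states that TF-IDF extraction is integrated).
            x = x * tfidf_weights.unsqueeze(-1)
        h, _ = self.lstm(x)                    # (batch, seq_len, hidden_dim)
        attn_out, _ = self.attention(h, h, h)  # self-attention over time steps
        pooled = attn_out.mean(dim=1)          # average pooling over time
        return self.classifier(self.dropout(pooled))

# Usage example with dummy inputs.
model = LSTMAttentionClassifier(vocab_size=20000)
dummy_ids = torch.randint(1, 20000, (8, 50))
logits = model(dummy_ids)                      # (8, 2) sentiment logits
```

In this sketch, self-attention lets every position attend to every other hidden state, so the pooled representation is not dominated by the final LSTM state, which is one common motivation for adding attention to recurrent sentiment models.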
Related papers
- A Deep Learning Framework for Sequence Mining with Bidirectional LSTM and Multi-Scale Attention [11.999319439383918]
This paper addresses the challenges of mining latent patterns and modeling contextual dependencies in complex sequence data.
A sequence pattern mining algorithm is proposed by integrating Bidirectional Long Short-Term Memory (BiLSTM) with a multi-scale attention mechanism.
BiLSTM captures both forward and backward dependencies in sequences, enhancing the model's ability to perceive global contextual structures.
arXiv Detail & Related papers (2025-04-21T16:53:02Z) - Revitalizing Saturated Benchmarks: A Weighted Metric Approach for Differentiating Large Language Model Performance [3.666887868385651]
Existing benchmarks are saturated and struggle to separate model performances due to factors like data contamination. This paper introduces the Enhanced Model Differentiation Metric, a novel weighted metric that revitalizes benchmarks by enhancing model separation.
arXiv Detail & Related papers (2025-03-07T16:25:09Z) - FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models [79.41859481668618]
Large Language Models (LLMs) have significantly advanced the fact-checking studies.
Existing automated fact-checking evaluation methods rely on static datasets and classification metrics.
We introduce FACT-AUDIT, an agent-driven framework that adaptively and dynamically assesses LLMs' fact-checking capabilities.
arXiv Detail & Related papers (2025-02-25T07:44:22Z) - CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory [38.32540433374892]
Large Language Models (LLMs) have achieved significant advancements, but the increasing complexity of tasks and higher performance demands highlight the need for continuous improvement. Some approaches utilize synthetic data generated by advanced LLMs based on evaluation results to train models. In this paper, we introduce the Cognitive Diagnostic Synthesis (CDS) method, which incorporates a diagnostic process inspired by Cognitive Diagnosis Theory (CDT) to refine evaluation results and characterize model profiles at the knowledge component level.
arXiv Detail & Related papers (2025-01-13T20:13:59Z) - EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation [58.546205554954454]
We propose Enhancing Alignment in MLLMs via Critical Observation (EACO). EACO aligns MLLMs by self-generated preference data using only 5k images economically. EACO reduces the overall hallucinations by 65.6% on HallusionBench and improves the reasoning ability by 21.8% on MME-Cognition.
arXiv Detail & Related papers (2024-12-06T09:59:47Z) - Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning [71.2981957820888]
We propose a novel Star-Agents framework, which automates the enhancement of data quality across datasets.
The framework initially generates diverse instruction data with multiple LLM agents through a bespoke sampling method.
The generated data undergo a rigorous evaluation using a dual-model method that assesses both difficulty and quality.
arXiv Detail & Related papers (2024-11-21T02:30:53Z) - MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z) - Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z) - CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration [59.48235003469116]
We show that data augmentation consistently enhances OOD performance.
We also show that CF augmented models which are easier to calibrate also exhibit much lower entropy when assigning importance.
arXiv Detail & Related papers (2023-09-14T16:16:40Z) - DiversiGATE: A Comprehensive Framework for Reliable Large Language Models [2.616506436169964]
We introduce DiversiGATE, a unified framework that consolidates diverse methodologies for LLM verification.
We propose a novel 'SelfLearner' model that conforms to the DiversiGATE framework and refines its performance over time.
Our results demonstrate that our approach outperforms traditional LLMs, achieving a considerable improvement on the GSM8K benchmark, from 54.8% to 61.8%.
arXiv Detail & Related papers (2023-06-22T22:29:40Z) - The DONUT Approach to Ensemble Combination Forecasting [0.0]
This paper presents an ensemble forecasting method that shows strong results on the M4 Competition dataset.
Our assumption reductions, consisting mainly of auto-generated features and a more diverse model pool, significantly outperform the statistical-feature-based ensemble method FFORMA.
We also present a formal ex-post-facto analysis of optimal combination and selection for ensembles, quantifying differences through linear optimization on the M4 dataset.
arXiv Detail & Related papers (2022-01-02T22:19:26Z)