Text Sentiment Analysis and Classification Based on Bidirectional Gated Recurrent Units (GRUs) Model
- URL: http://arxiv.org/abs/2404.17123v2
- Date: Wed, 12 Jun 2024 14:12:17 GMT
- Title: Text Sentiment Analysis and Classification Based on Bidirectional Gated Recurrent Units (GRUs) Model
- Authors: Wei Xu, Jianlong Chen, Zhicheng Ding, Jinyin Wang
- Abstract summary: This paper explores the importance of text sentiment analysis and classification in the field of natural language processing.
It proposes a new approach to sentiment analysis and classification based on the bidirectional gated recurrent units (GRUs) model.
- Score: 6.096738978232722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the importance of text sentiment analysis and classification in natural language processing and proposes a new approach based on the bidirectional gated recurrent units (GRUs) model. The study first analyses word clouds of the text across its six sentiment labels, then carries out data preprocessing, including removal of special symbols, punctuation marks, numbers, stop words, and non-alphabetic parts. The data set is then split into training and test sets; over the course of training, validation accuracy rises from 85% to 93%, an increase of 8 percentage points, while validation loss falls from 0.7 to 0.1 and stabilises, indicating that the model's predictions converge toward the true labels and that it can effectively classify text emotions. The confusion matrix shows that on the test set the model reaches an accuracy of 94.8%, a precision of 95.9%, a recall of 99.1%, and an F1 score of 97.4%, indicating good generalisation ability and classification performance. Overall, the study demonstrates an effective method for text sentiment analysis and classification with satisfactory results.
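As a plausibility check on the reported metrics: with precision P = 0.959 and recall R = 0.991, F1 = 2PR/(P + R) = 2(0.959)(0.991)/(0.959 + 0.991) ≈ 0.975, consistent with the stated 97.4% up to rounding. Below is a minimal, hypothetical sketch of the pipeline the abstract describes (text cleaning followed by a bidirectional GRU classifier), written against a Keras-style API; the vocabulary size, sequence length, and layer widths are illustrative assumptions, not values reported by the authors.

```python
# Hypothetical sketch of the described pipeline: text cleaning followed by a
# bidirectional GRU classifier. Vocabulary size, sequence length, and layer
# widths are illustrative assumptions, not values reported by the authors.
import re

import tensorflow as tf
from tensorflow.keras import layers

STOP_WORDS = {"the", "a", "an", "and", "or", "is", "are", "to", "of"}  # toy list

def preprocess(text: str) -> str:
    """Drop special symbols, punctuation, numbers, and non-alphabetic parts,
    then remove stop words, mirroring the steps listed in the abstract."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())  # keep letters and spaces only
    return " ".join(t for t in text.split() if t not in STOP_WORDS)

VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 20_000, 100, 6  # six sentiment labels

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),                    # padded token-id sequences
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Bidirectional(layers.GRU(64)),              # reads text in both directions
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),   # one probability per label
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",  # integer sentiment labels
              metrics=["accuracy"])
```

The Bidirectional wrapper runs one GRU over the sequence left-to-right and a second right-to-left and concatenates their outputs, so each text is encoded with both preceding and following context.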
Related papers
- Optimizing Transformer based on high-performance optimizer for predicting employment sentiment in American social media content [9.49688045612671]
This article improves the Transformer model using a swarm intelligence optimization algorithm, aiming to predict the sentiment of employment-related text on American social media.
During the training process, the accuracy of the model gradually increased from 49.27% to 82.83%, while the loss value decreased from 0.67 to 0.35.
The improved model not only raises the accuracy of sentiment recognition in employment-related social media texts, but also has important practical significance.
arXiv Detail & Related papers (2024-10-09T03:14:05Z) - Phrasing for UX: Enhancing Information Engagement through Computational Linguistics and Creative Analytics [0.0]
This study explores the relationship between textual features and Information Engagement (IE) on digital platforms.
It highlights the impact of computational linguistics and analytics on user interaction.
The READ model is introduced to quantify key predictors like representativeness, ease of use, affect, and distribution.
arXiv Detail & Related papers (2024-08-23T00:33:47Z) - AI-Generated Text Detection and Classification Based on BERT Deep Learning Algorithm [10.5960023194262]
This study develops an efficient AI-generated text detection model based on the BERT algorithm.
The accuracy increases steadily from the initial 94.78% to 99.72%, while the loss value decreases from 0.261 to 0.021 and converges gradually.
In terms of loss, the average loss on the training set is 0.0565, while the average loss on the test set is slightly higher at 0.0917.
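For context, a BERT-based detector of this kind is typically a standard sequence-classification fine-tune. A hedged sketch using the Hugging Face transformers API follows; the checkpoint name and label meanings are assumptions, not details taken from the paper.

```python
# Hedged sketch of BERT-based AI-text detection as a standard sequence-
# classification fine-tune; the checkpoint name and label meanings are
# assumptions, not details taken from the paper.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # assume 0 = human-written, 1 = AI-generated

inputs = tokenizer("Sample passage to classify.", truncation=True,
                   padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
pred = logits.argmax(dim=-1).item()     # predicted class index
```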
arXiv Detail & Related papers (2024-05-26T04:26:07Z) - Text Quality-Based Pruning for Efficient Training of Language Models [66.66259229732121]
We propose a novel method for numerically evaluating text quality in large unlabelled NLP datasets.
This text quality metric provides a framework to identify and eliminate low-quality text instances.
Experimental results over multiple models and datasets demonstrate the efficacy of this approach.
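The summary does not specify the quality metric itself, so the sketch below uses language-model perplexity as a stand-in scoring proxy to illustrate the prune-by-threshold idea; the model choice and cutoff are assumptions.

```python
# Illustrative sketch of quality-based pruning. The paper's actual quality
# metric is not given in this summary, so GPT-2 perplexity stands in as a
# scoring proxy; the threshold is an arbitrary assumption.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    """Language-model perplexity as a rough quality score (lower is better)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss  # mean token NLL
    return torch.exp(loss).item()

corpus = ["A well-formed English sentence about language models.",
          "asdf qwerty zzz 123 123 123"]
THRESHOLD = 200.0                                   # assumed cutoff
kept = [t for t in corpus if perplexity(t) < THRESHOLD]
```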
arXiv Detail & Related papers (2024-04-26T18:01:25Z) - A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check [53.152011258252315]
We show that judicious use of phonetic and graphic information is effective for Chinese Spelling Check.
Models are sensitive to the error distribution of the test set, which exposes their shortcomings.
The commonly used benchmark, SIGHAN, cannot reliably evaluate models' performance.
arXiv Detail & Related papers (2023-07-25T17:02:38Z) - Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
Under this elaborated robustness metric, a model is judged robust only if its performance is consistently accurate across every member of each clique; a minimal sketch of this criterion follows.
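```python
# Minimal sketch of the clique-level criterion: a model counts as robust on a
# knowledge-invariant clique only if it is correct on every member.
# The data layout here is an illustrative assumption.
from typing import Dict, List

def clique_robust_accuracy(correct: Dict[str, List[bool]]) -> float:
    """Fraction of cliques where the model is right on ALL members, i.e.
    performance is consistent across expressions of the same knowledge."""
    robust = sum(all(flags) for flags in correct.values())
    return robust / len(correct)

flags = {"fact_1": [True, True, True],    # correct on every paraphrase
         "fact_2": [True, False, True]}   # one slip breaks the clique
print(clique_robust_accuracy(flags))      # 0.5
```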
arXiv Detail & Related papers (2023-05-23T12:05:09Z) - Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixup; a sketch of this interpolation follows.
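The interpolation amounts to new_soft = λ · model_output + (1 − λ) · one_hot(label). A short PyTorch sketch, with the mixing weight λ as an assumed hyperparameter:

```python
# Sketch of the instance-specific label smoothing described above: soft targets
# linearly interpolate the model's output distribution with the one-hot label.
# The mixing weight lam is an assumed hyperparameter.
import torch
import torch.nn.functional as F

def soft_targets(logits: torch.Tensor, labels: torch.Tensor,
                 lam: float = 0.1) -> torch.Tensor:
    """new_soft = lam * model_output + (1 - lam) * one_hot(label)."""
    probs = F.softmax(logits, dim=-1)                     # model's output
    one_hot = F.one_hot(labels, logits.size(-1)).float()  # hard labels
    return lam * probs + (1.0 - lam) * one_hot            # soft labels for mixup

logits = torch.randn(4, 3)             # batch of 4 samples, 3 classes
labels = torch.tensor([0, 2, 1, 0])
targets = soft_targets(logits, labels)
```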
arXiv Detail & Related papers (2023-05-22T23:43:23Z) - Presence of informal language, such as emoticons, hashtags, and slang, impact the performance of sentiment analysis models on social media text? [0.0]
This study investigated the influence of informal language, such as emoticons and slang, on the performance of sentiment analysis models applied to social media text.
A CNN model was developed and trained on three datasets: a sarcasm dataset, a sentiment dataset, and an emoticon dataset.
The results revealed that the model achieved an accuracy of 96.47% on the sarcasm dataset, with the lowest accuracy for class 1.
The amalgamation of the sarcasm and sentiment datasets improved the model's accuracy to 95.1%, and the addition of the emoticon dataset raised it slightly further to 95.37%.
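A 1-D convolutional text classifier of the kind described is sketched below; the filter count, kernel size, and sequence length are illustrative assumptions rather than the study's settings.

```python
# Hypothetical sketch of a 1-D convolutional text classifier like the one
# described; filter count, kernel size, and sequence length are illustrative
# assumptions rather than the study's settings.
import tensorflow as tf
from tensorflow.keras import layers

cnn = tf.keras.Sequential([
    layers.Input(shape=(100,)),                  # padded token-id sequences
    layers.Embedding(20_000, 128),
    layers.Conv1D(128, 5, activation="relu"),    # n-gram-like feature detectors
    layers.GlobalMaxPooling1D(),                 # strongest response per filter
    layers.Dense(64, activation="relu"),
    layers.Dense(2, activation="softmax"),       # e.g. sarcastic vs. not
])
cnn.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
```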
arXiv Detail & Related papers (2023-01-28T22:21:51Z) - To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can enable models to achieve better performance and generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z) - A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures.
Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z) - Detecting of a Patient's Condition From Clinical Narratives Using Natural Language Representation [0.3149883354098941]
This paper proposes a joint clinical natural language representation learning and supervised classification framework.
The framework jointly discovers distributional syntactic and latent semantic features (representation learning) from contextual clinical narrative inputs.
The proposed framework yields an overall classification performance with accuracy, recall, and precision of 89%, 88%, and 89%, respectively.
arXiv Detail & Related papers (2021-04-08T17:16:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.