Convolutional Neural Networks for Sentiment Analysis on Weibo Data: A
Natural Language Processing Approach
- URL: http://arxiv.org/abs/2307.06540v1
- Date: Thu, 13 Jul 2023 03:02:56 GMT
- Title: Convolutional Neural Networks for Sentiment Analysis on Weibo Data: A Natural Language Processing Approach
- Authors: Yufei Xie and Rodolfo C. Raga Jr
- Abstract summary: This study addresses the complex task of sentiment analysis on a dataset of 119,988 original tweets from Weibo using a Convolutional Neural Network (CNN).
A CNN-based model was utilized, leveraging word embeddings for feature extraction, and trained to perform sentiment classification.
The model achieved a macro-average F1-score of approximately 0.73 on the test set, showing balanced performance across positive, neutral, and negative sentiments.
- Score: 0.228438857884398
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This study addressed the complex task of sentiment analysis on a dataset of
119,988 original tweets from Weibo using a Convolutional Neural Network (CNN),
offering a new approach to Natural Language Processing (NLP). The data, sourced
from Baidu's PaddlePaddle AI platform, were meticulously preprocessed,
tokenized, and categorized based on sentiment labels. A CNN-based model was
utilized, leveraging word embeddings for feature extraction, and trained to
perform sentiment classification. The model achieved a macro-average F1-score
of approximately 0.73 on the test set, showing balanced performance across
positive, neutral, and negative sentiments. The findings underscore the
effectiveness of CNNs for sentiment analysis tasks, with implications for
practical applications in social media analysis, market research, and policy
studies. The complete experimental content and code have been made publicly
available on the Kaggle data platform for further research and development.
Future work may involve exploring different architectures, such as Recurrent
Neural Networks (RNN) or transformers, or using more complex pre-trained models
like BERT, to further improve the model's ability to understand linguistic
nuances and context.
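The macro-average F1-score reported in the abstract is the unweighted mean of the per-class F1 scores, so each of the three sentiment classes counts equally regardless of how many examples it has. The following is a minimal sketch of that computation in plain Python, not the authors' evaluation code. The function name `macro_f1`, the label strings, and the six toy labels are illustrative assumptions and are not taken from the paper or its dataset.

```python
from collections import Counter

# Hypothetical label names; the paper only states that the classes are
# positive, neutral, and negative.
LABELS = ("negative", "neutral", "positive")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Return (macro-average F1, per-class F1 dict) for the given labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    per_class = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class[label] = 2 * tp[label] / denom if denom else 0.0
    # Macro average: plain mean over classes, ignoring class frequency
    return sum(per_class.values()) / len(labels), per_class


# Toy data, invented purely for illustration
y_true = ["positive", "positive", "neutral", "neutral", "negative", "negative"]
y_pred = ["positive", "neutral", "neutral", "neutral", "negative", "positive"]

score, per_class = macro_f1(y_true, y_pred)
print(round(score, 4), per_class)
```

Because the mean is unweighted, a model that performs poorly on a rare class is penalized as much as one that performs poorly on a common class, which is why a macro-average is a reasonable indicator of balanced performance across classes.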
Related papers
- Improving Neuron-level Interpretability with White-box Language Models [11.898535906016907]
We introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE).
Our comprehensive experiments showcase significant improvements (up to 103% relative improvement) in neuron-level interpretability.
CRATE's increased interpretability comes from its enhanced ability to consistently and distinctively activate on relevant tokens.
arXiv Detail & Related papers (2024-10-21T19:12:33Z)
- A Sentiment Analysis of Medical Text Based on Deep Learning [1.8130068086063336]
This paper focuses on the medical domain, using bidirectional encoder representations from transformers (BERT) as the basic pre-trained model.
Experiments and analyses were conducted on the METS-CoV dataset to explore the training performance after integrating different deep learning networks.
CNN models outperform other networks when trained on smaller medical text datasets in combination with pre-trained models like BERT.
arXiv Detail & Related papers (2024-04-16T12:20:49Z)
- Initial Study into Application of Feature Density and Linguistically-backed Embedding to Improve Machine Learning-based Cyberbullying Detection [54.83707803301847]
The research was conducted on a Formspring dataset provided in a Kaggle competition on automatic cyberbullying detection.
The study confirmed the effectiveness of Neural Networks in cyberbullying detection and the correlation between classifier performance and Feature Density.
arXiv Detail & Related papers (2022-06-04T03:17:15Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods.
We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments.
The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics, with neural networks, in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- A Novel Deep Learning Method for Textual Sentiment Analysis [3.0711362702464675]
This paper proposes a convolutional neural network integrated with a hierarchical attention layer to extract informative words.
The proposed model has higher classification accuracy and can extract informative words.
Applying incremental transfer learning can significantly enhance the classification performance.
arXiv Detail & Related papers (2021-02-23T12:11:36Z)
- Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z)
- SHAP values for Explaining CNN-based Text Classification Models [10.881494765759829]
This paper develops a methodology to compute SHAP values for local explainability of CNN-based text classification models.
The approach is also extended to compute global scores to assess the importance of features.
arXiv Detail & Related papers (2020-08-26T21:28:41Z)
- An Evaluation of Recent Neural Sequence Tagging Models in Turkish Named Entity Recognition [5.161531917413708]
We propose a transformer-based network with a conditional random field layer that leads to the state-of-the-art result.
Our study contributes to the literature that quantifies the impact of transfer learning on processing morphologically rich languages.
arXiv Detail & Related papers (2020-05-14T06:54:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.