BAKSA at SemEval-2020 Task 9: Bolstering CNN with Self-Attention for
Sentiment Analysis of Code Mixed Text
- URL: http://arxiv.org/abs/2007.10819v1
- Date: Tue, 21 Jul 2020 14:05:51 GMT
- Authors: Ayush Kumar, Harsh Agarwal, Keshav Bansal, Ashutosh Modi
- Abstract summary: We present an ensemble architecture of convolutional neural net (CNN) and self-attention based LSTM for sentiment analysis of code-mixed tweets.
We achieved F1 scores of 0.707 and 0.725 on Hindi-English (Hinglish) and Spanish-English (Spanglish) datasets, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentiment Analysis of code-mixed text has diversified applications in opinion
mining ranging from tagging user reviews to identifying social or political
sentiments of a sub-population. In this paper, we present an ensemble
architecture of convolutional neural net (CNN) and self-attention based LSTM
for sentiment analysis of code-mixed tweets. While the CNN component helps in
the classification of positive and negative tweets, the self-attention based
LSTM helps in the classification of neutral tweets because of its ability to
identify the correct sentiment among multiple sentiment-bearing units. We achieved
F1 scores of 0.707 (ranked 5th) and 0.725 (ranked 13th) on Hindi-English
(Hinglish) and Spanish-English (Spanglish) datasets, respectively. The
submissions for Hinglish and Spanglish tasks were made under the usernames
ayushk and harsh_6 respectively.
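The abstract does not say how the two components' outputs are combined. As a minimal sketch only, assuming a simple soft-voting scheme (a weighted average of per-class probabilities), the ensemble step could look like the following. The function name `soft_vote`, the `cnn_weight` parameter, the label order, and the example probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumed label order; the task uses three sentiment classes.
LABELS = ["negative", "neutral", "positive"]


def soft_vote(
    cnn_probs: List[float],
    lstm_probs: List[float],
    cnn_weight: float = 0.5,
) -> Tuple[str, List[float]]:
    """Blend two per-class probability vectors and pick the top label.

    cnn_probs / lstm_probs stand in for the softmax outputs of the CNN and
    the self-attention LSTM. cnn_weight is a hypothetical mixing weight.
    """
    if len(cnn_probs) != len(LABELS) or len(lstm_probs) != len(LABELS):
        raise ValueError("expected one probability per label")
    if not 0.0 <= cnn_weight <= 1.0:
        raise ValueError("cnn_weight must lie in [0, 1]")

    # Weighted average of the two distributions, class by class.
    blended = [
        cnn_weight * c + (1.0 - cnn_weight) * l
        for c, l in zip(cnn_probs, lstm_probs)
    ]
    # Index of the class with the highest blended probability.
    best = max(range(len(LABELS)), key=lambda i: blended[i])
    return LABELS[best], blended


if __name__ == "__main__":
    # Made-up scores: the CNN leans negative, the LSTM leans neutral.
    label, blended = soft_vote([0.6, 0.3, 0.1], [0.1, 0.7, 0.2])
    print(label, blended)
```

With equal weights, a confident neutral prediction from one model can override a weaker polar prediction from the other. This is one way the complementary strengths described in the abstract might be exploited, though the authors' actual combination rule may differ.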
Related papers
- You Shall Know a Tool by the Traces it Leaves: The Predictability of Sentiment Analysis Tools [74.98850427240464]
We show that sentiment analysis tools disagree on the same dataset.
We show that the sentiment tool used for sentiment annotation can even be predicted from its outcome.
arXiv Detail & Related papers (2024-10-18T17:27:38Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis [64.70116276295609]
SentiWSP is a Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks.
SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks.
arXiv Detail & Related papers (2022-10-18T12:25:29Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation [75.82110684355979]
We introduce a two-stream model that translates images in input space using an object-aware transformer.
We then leverage the translation to construct an auxiliary sentence that provides multimodal information to a language model.
We achieve state-of-the-art performance on two multimodal Twitter datasets.
arXiv Detail & Related papers (2021-08-03T18:02:38Z)
- JUNLP@Dravidian-CodeMix-FIRE2020: Sentiment Classification of Code-Mixed Tweets using Bi-Directional RNN and Language Tags [14.588109573710431]
This paper uses bi-directional LSTMs along with language tagging to facilitate sentiment tagging of code-mixed Tamil texts extracted from social media.
The presented algorithm garnered precision, recall, and F1 scores of 0.59, 0.66, and 0.58 respectively.
arXiv Detail & Related papers (2020-10-20T08:10:29Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis [8.8561720398658]
This paper describes the participation of the LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet.
We propose a Recurrent Convolutional Neural Network that combines a recurrent neural network and a convolutional network to better capture the semantics of the text.
arXiv Detail & Related papers (2020-08-30T13:52:24Z)
- C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering [0.9646922337783134]
This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix.
We are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
arXiv Detail & Related papers (2020-08-09T00:46:26Z)
- IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines [6.866104126509981]
We develop a system to predict the sentiment of a given code-mixed tweet.
Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.
arXiv Detail & Related papers (2020-07-24T18:48:37Z)
- NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis For Code-Mixed Social Media Text Using an Ensemble Model [1.1265248232450553]
This work proposes a system named NITS-Hinglish-SentiMix to viably complete the sentiment analysis of code-mixed Hinglish text.
The proposed framework has recorded an F-Score of 0.617 on the test data.
arXiv Detail & Related papers (2020-07-23T15:45:12Z)