IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social
Media Text using Deep Neural Networks and Linear Baselines
- URL: http://arxiv.org/abs/2007.12733v1
- Date: Fri, 24 Jul 2020 18:48:37 GMT
- Title: IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social
Media Text using Deep Neural Networks and Linear Baselines
- Authors: Soroush Javdan, Taha Shangipour ataei and Behrouz Minaei-Bidgoli
- Abstract summary: We develop a system to predict the sentiment of a given code-mixed tweet.
Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentiment Analysis is a well-studied field of Natural Language Processing.
However, the rapid growth of social media and noisy content within them poses
significant challenges in addressing this problem with well-established methods
and tools. One of these challenges is code-mixing, which means using different
languages to convey thoughts in social media texts. Our group, under the name
IUST (username: TAHA), participated in the SemEval-2020 shared Task 9 on
Sentiment Analysis for Code-Mixed Social Media Text and attempted to develop a
system that predicts the sentiment of a given code-mixed tweet. We applied
various preprocessing techniques and proposed methods ranging from NBSVM to
more complex deep neural network models. Our best
performing method obtains an F1 score of 0.751 for the Spanish-English sub-task
and 0.706 over the Hindi-English sub-task.
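The abstract names NBSVM as the linear baseline. A minimal sketch of that technique (Naive Bayes log-count-ratio features fed to a linear classifier) is shown below; the toy tweets are hypothetical, and scikit-learn's LogisticRegression stands in for the SVM for simplicity:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy code-mixed tweets (hypothetical examples, not from the shared task data).
tweets = ["me encanta this movie", "yeh film bahut boring thi",
          "what a pelicula horrible", "this gaana is amazing"]
labels = np.array([1, 0, 0, 1])  # 1 = positive, 0 = negative

# Binarized bag-of-n-grams counts, as in the original NBSVM formulation.
vec = CountVectorizer(ngram_range=(1, 2), binary=True)
X = vec.fit_transform(tweets).toarray()

# Naive Bayes log-count ratio: r = log((p / ||p||_1) / (q / ||q||_1)).
alpha = 1.0  # additive smoothing
p = alpha + X[labels == 1].sum(axis=0)
q = alpha + X[labels == 0].sum(axis=0)
r = np.log((p / p.sum()) / (q / q.sum()))

# Scale features by r, then fit a linear classifier on the scaled features.
clf = LogisticRegression().fit(X * r, labels)
preds = clf.predict(X * r)
```

In the full NBSVM formulation the final model interpolates between the Naive Bayes and linear-classifier weights; the feature-scaling variant above is a common simplification.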
Related papers
- A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus [71.77214818319054]
Natural language inference is a proxy for natural language understanding.
There is no publicly available NLI corpus for the Romanian language.
We introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs.
arXiv Detail & Related papers (2024-05-20T08:41:15Z)
- A Vector Quantized Approach for Text to Speech Synthesis on Real-World
Spontaneous Speech [94.64927912924087]
We train TTS systems using real-world speech from YouTube and podcasts.
A recent text-to-speech architecture is designed for multiple-code generation and monotonic alignment.
We show that this architecture outperforms existing TTS systems in several objective and subjective measures.
arXiv Detail & Related papers (2023-02-08T17:34:32Z)
- Transformer-based Model for Word Level Language Identification in
Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer-based model for word-level language identification in code-mixed Kannada-English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
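The gap between the weighted F1 (0.84) and macro F1 (0.61) quoted above typically reflects class imbalance: weighted F1 averages per-class F1 scores by class support, while macro F1 averages them equally. A toy illustration with scikit-learn (labels invented for illustration, not from the CoLI-Kenglish data):

```python
from sklearn.metrics import f1_score

# Imbalanced toy labels: class 0 dominates, class 2 is rare and never predicted.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 0]

# Per-class F1: class 0 -> 0.75, class 1 -> 0.80, class 2 -> 0.00
macro = f1_score(y_true, y_pred, average="macro")        # unweighted mean: ~0.517
weighted = f1_score(y_true, y_pred, average="weighted")  # support-weighted: ~0.657
```

Because the majority classes score well, the weighted average exceeds the macro average, mirroring the pattern in the result above.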
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching
language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social
Media Texts using Transformers and Multi-Task Learning [1.7196613099537055]
We describe the systems developed by our team for SemEval-2020 Task 9.
We aim to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
Our approach achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850.
For the Spanish-English task, we obtained an average F1-score of 0.7064 ranking our team 17th out of 29 participants.
arXiv Detail & Related papers (2020-09-06T17:19:18Z)
- LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network
for Code-mixed Sentiment Analysis [8.8561720398658]
This paper describes the participation of LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet.
We propose a Recurrent Convolutional Neural Network that combines a recurrent neural network and a convolutional network to better capture the semantics of the text.
arXiv Detail & Related papers (2020-08-30T13:52:24Z)
- C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed
Social Media Text using Feature Engineering [0.9646922337783134]
This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix.
We obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
arXiv Detail & Related papers (2020-08-09T00:46:26Z)
- JUNLP@SemEval-2020 Task 9: Sentiment Analysis of Hindi-English code mixed
data using Grid Search Cross Validation [3.5169472410785367]
We focus on developing a plausible solution for code-mixed sentiment analysis.
This work was done as participation in the SemEval-2020 Sentimix Task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)
- BAKSA at SemEval-2020 Task 9: Bolstering CNN with Self-Attention for
Sentiment Analysis of Code Mixed Text [4.456122555367167]
We present an ensemble architecture of convolutional neural net (CNN) and self-attention based LSTM for sentiment analysis of code-mixed tweets.
We achieved F1 scores of 0.707 and 0.725 on Hindi-English (Hinglish) and Spanish-English (Spanglish) datasets, respectively.
arXiv Detail & Related papers (2020-07-21T14:05:51Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system that combines multi-task learning with BERT-based models.
Our model achieves 91.51% F1 score in English Sub-task A, which is comparable to the first place.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text
Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of introducing transfer learning techniques for NLP by a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.