NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis For
Code-Mixed Social Media Text Using an Ensemble Model
- URL: http://arxiv.org/abs/2007.12081v2
- Date: Fri, 4 Sep 2020 17:55:18 GMT
- Authors: Subhra Jyoti Baroi, Nivedita Singh, Ringki Das, Thoudam Doren Singh
- Abstract summary: This work proposes a system named NITS-Hinglish-SentiMix to perform sentiment analysis of code-mixed Hinglish text.
The proposed framework recorded an F-score of 0.617 on the test data.
- Score: 1.1265248232450553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentiment Analysis is the process of deciphering what a sentence emotes and
classifying it as either positive, negative, or neutral. In recent times,
India has seen a huge influx in the number of active social media users and
this has led to a plethora of unstructured text data. Because much of the Indian
population is fluent in both Hindi and English, users often produce
code-mixed Hinglish social media text, i.e., Hindi expressions
written in the Roman script alongside English words. The
ability to adequately comprehend the meaning of these texts is therefore necessary.
Our team, rns2020, participated in Task 9 at SemEval-2020, aiming to design a
system for sentiment analysis of code-mixed social media text.
This work proposes a system named NITS-Hinglish-SentiMix to perform
sentiment analysis of such code-mixed Hinglish text. The proposed framework
recorded an F-score of 0.617 on the test data.
Related papers
- CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving [61.73180469072787]
We focus on the problem of spoken translation (ST) of code-switched speech in Indian languages to English text.
We present a new end-to-end model architecture COSTA that scaffolds on pretrained automatic speech recognition (ASR) and machine translation (MT) modules.
COSTA significantly outperforms many competitive cascaded and end-to-end multimodal baselines by up to 3.5 BLEU points.
arXiv Detail & Related papers (2024-06-16T16:10:51Z)
- SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF) [61.49972925493912]
SemEval-2024 Task 10 is a shared task centred on identifying emotions in code-mixed dialogues.
This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues.
A total of 84 participants engaged in this task, with the most adept systems attaining F1-scores of 0.70, 0.79, and 0.76 for the respective subtasks.
arXiv Detail & Related papers (2024-02-29T08:20:06Z)
- Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer-based model for word-level language identification in code-mixed Kannada-English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
- TextHide: Tackling Data Privacy in Language Understanding Tasks [54.11691303032022]
TextHide mitigates privacy risks without slowing down training or reducing accuracy.
It requires all participants to add a simple encryption step to prevent an eavesdropping attacker from recovering private text data.
We evaluate TextHide on the GLUE benchmark, and our experiments show that TextHide can effectively defend against attacks on shared gradients or representations.
arXiv Detail & Related papers (2020-10-12T22:22:15Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network for Code-mixed Sentiment Analysis [8.8561720398658]
This paper describes the participation of the LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet.
We propose a Recurrent Convolutional Neural Network that combines a recurrent neural network and a convolutional network to better capture the semantics of the text.
arXiv Detail & Related papers (2020-08-30T13:52:24Z)
- C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering [0.9646922337783134]
This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix.
We obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
arXiv Detail & Related papers (2020-08-09T00:46:26Z)
- SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media [50.29389719723529]
We present the main findings and compare the results of SemEval-2020 Task 10, Emphasis Selection for Written Text in Visual Media.
The goal of this shared task is to design automatic methods for emphasis selection.
The analysis of systems submitted to the task indicates that BERT and RoBERTa were the most common choices of pre-trained model.
arXiv Detail & Related papers (2020-08-07T17:24:53Z)
- ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text [1.4926515182392508]
We present the Generative Morphemes with Attention (GenMA) model, a sentiment analysis system contributed to SemEval-2020 Task 9 SentiMix.
The system aims to predict the sentiments of the given English-Hindi code-mixed tweets without using word-level language tags.
arXiv Detail & Related papers (2020-07-27T23:58:54Z)
- IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines [6.866104126509981]
We develop a system to predict the sentiment of a given code-mixed tweet.
Our best-performing method obtains an F1 score of 0.751 on the Spanish-English sub-task and 0.706 on the Hindi-English sub-task.
arXiv Detail & Related papers (2020-07-24T18:48:37Z)
- JUNLP@SemEval-2020 Task 9: Sentiment Analysis of Hindi-English code mixed data using Grid Search Cross Validation [3.5169472410785367]
We focus on working out a plausible solution for code-mixed sentiment analysis.
This work was done as part of our participation in the SemEval-2020 SentiMix task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)