C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed
Social Media Text using Feature Engineering
- URL: http://arxiv.org/abs/2008.13549v1
- Date: Sun, 9 Aug 2020 00:46:26 GMT
- Authors: Laksh Advani and Clement Lu and Suraj Maharjan
- Abstract summary: This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix.
We are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
- Score: 0.9646922337783134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In today's interconnected and multilingual world, code-mixing of languages on
social media is a common occurrence. While many Natural Language Processing
(NLP) tasks like sentiment analysis are mature and well designed for
monolingual text, techniques to apply these tasks to code-mixed text still
warrant exploration. This paper describes our feature engineering approach to
sentiment analysis in code-mixed social media text for SemEval-2020 Task 9:
SentiMix. We tackle this problem by leveraging a set of hand-engineered
lexical, sentiment, and metadata features to design a classifier that can
disambiguate between "positive", "negative" and "neutral" sentiment. With this
model, we are able to obtain a weighted F1 score of 0.65 for the "Hinglish"
task and 0.63 for the "Spanglish" task.
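The weighted F1 score reported in the abstract is the per-class F1 averaged with weights proportional to each class's share of the true labels. The following is a minimal sketch of that metric in plain Python. The function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the paper or from the task's official scorer.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores.

    Hypothetical helper for illustration; not the task's official scorer.
    """
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += support[label] / total * f1
    return score


# Toy labels, made up for this example only.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
print(weighted_f1(gold, pred))
```

Because the weights follow class frequency, a classifier that does well on the majority class can score higher on weighted F1 than on macro F1, which weights all classes equally.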
Related papers
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multi concepts for multilingual semantic matching to liberate the model from the reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF) [61.49972925493912]
SemEval-2024 Task 10 is a shared task centred on identifying emotions in code-mixed dialogues.
This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues.
A total of 84 participants engaged in this task, with the most adept systems attaining F1-scores of 0.70, 0.79, and 0.76 for the respective subtasks.
arXiv Detail & Related papers (2024-02-29T08:20:06Z)
- SESCORE2: Learning Text Generation Evaluation via Synthesizing Realistic Mistakes [93.19166902594168]
We propose SESCORE2, a self-supervised approach for training a model-based metric for text generation evaluation.
The key concept is to synthesize realistic model mistakes by perturbing sentences retrieved from a corpus.
We evaluate SESCORE2 and previous methods on four text generation tasks across three languages.
arXiv Detail & Related papers (2022-12-19T09:02:16Z)
- Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer-based model for word-level language identification in code-mixed Kannada-English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
- JUNLP@Dravidian-CodeMix-FIRE2020: Sentiment Classification of Code-Mixed Tweets using Bi-Directional RNN and Language Tags [14.588109573710431]
This paper uses bi-directional LSTMs along with language tagging to facilitate sentiment tagging of code-mixed Tamil texts extracted from social media.
The presented algorithm garnered precision, recall, and F1 scores of 0.59, 0.66, and 0.58 respectively.
arXiv Detail & Related papers (2020-10-20T08:10:29Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media [50.29389719723529]
We present the main findings and compare the results of SemEval-2020 Task 10, Emphasis Selection for Written Text in Visual Media.
The goal of this shared task is to design automatic methods for emphasis selection.
The analysis of systems submitted to the task indicates that BERT and RoBERTa were the most common choices of pre-trained models.
arXiv Detail & Related papers (2020-08-07T17:24:53Z)
- IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines [6.866104126509981]
We develop a system to predict the sentiment of a given code-mixed tweet.
Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.
arXiv Detail & Related papers (2020-07-24T18:48:37Z)
- JUNLP@SemEval-2020 Task 9:Sentiment Analysis of Hindi-English code mixed data using Grid Search Cross Validation [3.5169472410785367]
We focus on working out a plausible solution to the domain of Code-Mixed Sentiment Analysis.
This work was carried out as part of our participation in the SemEval-2020 SentiMix task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)
- NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis For Code-Mixed Social Media Text Using an Ensemble Model [1.1265248232450553]
This work proposes a system named NITS-Hinglish-SentiMix to viably complete the sentiment analysis of code-mixed Hinglish text.
The proposed framework has recorded an F-Score of 0.617 on the test data.
arXiv Detail & Related papers (2020-07-23T15:45:12Z)
- HCMS at SemEval-2020 Task 9: A Neural Approach to Sentiment Analysis for Code-Mixed Texts [0.0]
We describe our submission to the Sentimix Hindi-English task involving sentiment classification of code-mixed texts.
With an F1 score of 67.1%, we demonstrate that simple convolution and attention may well produce reasonable results.
arXiv Detail & Related papers (2020-07-23T15:39:53Z)