Related papers: C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering

C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering

URL: http://arxiv.org/abs/2008.13549v1
Date: Sun, 9 Aug 2020 00:46:26 GMT
Title: C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering
Authors: Laksh Advani and Clement Lu and Suraj Maharjan
Abstract summary: This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" tasks.
Score: 0.9646922337783134
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In today's interconnected and multilingual world, code-mixing of languages on social media is a common occurrence. While many Natural Language Processing (NLP) tasks like sentiment analysis are mature and well designed for monolingual text, techniques to apply these tasks to code-mixed text still warrant exploration. This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We tackle this problem by leveraging a set of hand-engineered lexical, sentiment, and metadata features to design a classifier that can disambiguate between "positive", "negative" and "neutral" sentiment. With this model, we are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" tasks

Related papers

A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multi concepts for multilingual semantic matching to liberate the model from the reliance on NER models. We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
SemEval 2024 -- Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF) [61.49972925493912]
SemEval-2024 Task 10 is a shared task centred on identifying emotions in code-mixed dialogues. This task comprises three distinct subtasks - emotion recognition in conversation for code-mixed dialogues, emotion flip reasoning for code-mixed dialogues, and emotion flip reasoning for English dialogues. A total of 84 participants engaged in this task, with the most adept systems attaining F1-scores of 0.70, 0.79, and 0.76 for the respective subtasks.
arXiv Detail & Related papers (2024-02-29T08:20:06Z)
SESCORE2: Learning Text Generation Evaluation via Synthesizing Realistic Mistakes [93.19166902594168]
We propose SESCORE2, a self-supervised approach for training a model-based metric for text generation evaluation. Key concept is to synthesize realistic model mistakes by perturbing sentences retrieved from a corpus. We evaluate SESCORE2 and previous methods on four text generation tasks across three languages.
arXiv Detail & Related papers (2022-12-19T09:02:16Z)
Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer based model for word-level language identification in code-mixed Kannada English texts. The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
JUNLP@Dravidian-CodeMix-FIRE2020: Sentiment Classification of Code-Mixed Tweets using Bi-Directional RNN and Language Tags [14.588109573710431]
This paper uses bi-directional LSTMs along with language tagging to facilitate sentiment tagging of code-mixed Tamil texts extracted from social media. The presented algorithm garnered precision, recall, and F1 scores of 0.59, 0.66, and 0.58 respectively.
arXiv Detail & Related papers (2020-10-20T08:10:29Z)
NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message. We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media [50.29389719723529]
We present the main findings and compare the results of SemEval-2020 Task 10, Emphasis Selection for Written Text in Visual Media. The goal of this shared task is to design automatic methods for emphasis selection. The analysis of systems submitted to the task indicates that BERT and RoBERTa were the most common choice of pre-trained models used.
arXiv Detail & Related papers (2020-08-07T17:24:53Z)
IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines [6.866104126509981]
We develop a system to predict the sentiment of a given code-mixed tweet. Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.
arXiv Detail & Related papers (2020-07-24T18:48:37Z)
JUNLP@SemEval-2020 Task 9:Sentiment Analysis of Hindi-English code mixed data using Grid Search Cross Validation [3.5169472410785367]
We focus on working out a plausible solution to the domain of Code-Mixed Sentiment Analysis. This work was done as participation in the SemEval-2020 Sentimix Task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)
NITS-Hinglish-SentiMix at SemEval-2020 Task 9: Sentiment Analysis For Code-Mixed Social Media Text Using an Ensemble Model [1.1265248232450553]
This work proposes a system named NITS-Hinglish-SentiMix to viably complete the sentiment analysis of code-mixed Hinglish text. The proposed framework has recorded an F-Score of 0.617 on the test data.
arXiv Detail & Related papers (2020-07-23T15:45:12Z)
HCMS at SemEval-2020 Task 9: A Neural Approach to Sentiment Analysis for Code-Mixed Texts [0.0]
We describe our submission to the Sentimix Hindi-English task involving sentiment classification of code-mixed texts. With an F1 score of 67.1%, we demonstrate that simple convolution and attention may well produce reasonable results.
arXiv Detail & Related papers (2020-07-23T15:39:53Z)
Enabling Language Models to Fill in the Blanks [81.59381915581892]
We present a simple approach for text infilling, the task of predicting missing spans of text at any position in a document. We train (or fine-tune) off-the-shelf language models on sequences containing the concatenation of artificially-masked text and the text which was masked. We show that this approach, which we call infilling by language modeling, can enable LMs to infill entire sentences effectively on three different domains: short stories, scientific abstracts, and lyrics.
arXiv Detail & Related papers (2020-05-11T18:00:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.