WESSA at SemEval-2020 Task 9: Code-Mixed Sentiment Analysis using
Transformers
- URL: http://arxiv.org/abs/2009.09879v1
- Date: Mon, 21 Sep 2020 13:59:24 GMT
- Title: WESSA at SemEval-2020 Task 9: Code-Mixed Sentiment Analysis using
Transformers
- Authors: Ahmed Sultan (WideBot), Mahmoud Salim (WideBot), Amina Gaber
(WideBot), Islam El Hosary (WideBot)
- Abstract summary: We describe our system submitted for SemEval 2020 Task 9, Sentiment Analysis for Code-Mixed Social Media Text.
Our best performing system is a Transfer Learning-based model that fine-tunes "XLM-RoBERTa".
For later submissions, our system manages to achieve a 75.9% average F1-Score on the test set using CodaLab username "ahmed0sultan".
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we describe our system submitted for SemEval 2020 Task 9,
Sentiment Analysis for Code-Mixed Social Media Text alongside other
experiments. Our best performing system is a Transfer Learning-based model that
fine-tunes "XLM-RoBERTa", a transformer-based multilingual masked language
model, on monolingual English and Spanish data and Spanish-English code-mixed
data. Our system outperforms the official task baseline by achieving a 70.1%
average F1-Score on the official leaderboard using the test set. For later
submissions, our system manages to achieve a 75.9% average F1-Score on the test
set using CodaLab username "ahmed0sultan".
Related papers
- Strategies for improving low resource speech to text translation relying
on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech to text translation (ST)
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z) - BJTU-WeChat's Systems for the WMT22 Chat Translation Task [66.81525961469494]
This paper introduces the joint submission of the Beijing Jiaotong University and WeChat AI to the WMT'22 chat translation task for English-German.
Building on the Transformer, we apply several effective variants.
Our systems achieve 0.810 and 0.946 COMET scores.
arXiv Detail & Related papers (2022-11-28T02:35:04Z) - Transformer-based Model for Word Level Language Identification in
Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer-based model for word-level language identification in code-mixed Kannada-English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
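Word-level labeling with a subword-based Transformer requires aligning one label per word to several subword pieces. A minimal sketch of the common alignment trick, using a hypothetical fixed-length splitter as a stand-in for a real BPE/SentencePiece tokenizer (the paper's actual preprocessing may differ):

```python
def toy_subwords(word, max_len=4):
    """Hypothetical subword splitter: fixed-length chunks stand in for a
    real learned tokenizer; continuation pieces get a '##' prefix."""
    pieces = [word[i:i + max_len] for i in range(0, len(word), max_len)]
    return [pieces[0]] + ["##" + p for p in pieces[1:]]

def align_labels(words, labels):
    """Expand word-level language labels to subword level: the first piece
    keeps the word's label, continuation pieces get -100 (the index many
    frameworks ignore in the loss)."""
    sub_tokens, sub_labels = [], []
    for word, label in zip(words, labels):
        pieces = toy_subwords(word)
        sub_tokens.extend(pieces)
        sub_labels.extend([label] + [-100] * (len(pieces) - 1))
    return sub_tokens, sub_labels
```

At evaluation time the word-level prediction is then read off the first subword of each word, so the metric stays comparable to word-level annotation.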
arXiv Detail & Related papers (2022-11-26T02:39:19Z) - Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation
System for the WMT22 Translation Task [49.916963624249355]
This paper describes Tencent AI Lab - Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task.
We participate in the general translation task on English$\Leftrightarrow$Livonian.
Our system is based on M2M100 with novel techniques that adapt it to the target language pair.
arXiv Detail & Related papers (2022-10-17T04:34:09Z) - Palomino-Ochoa at SemEval-2020 Task 9: Robust System based on
Transformer for Code-Mixed Sentiment Classification [1.6244541005112747]
We present a transfer learning system to perform a mixed Spanish-English sentiment classification task.
Our proposal uses the state-of-the-art language model BERT and embeds it within a ULMFiT transfer learning pipeline.
arXiv Detail & Related papers (2020-11-18T18:25:58Z) - NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching
language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
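A standard convolutional text classifier reduces a tweet to a fixed-size vector with convolution filters followed by max-over-time pooling. A minimal single-filter sketch in pure Python, as an illustration of the general technique rather than the authors' exact architecture:

```python
def conv1d_max_pool(embeddings, kernel, bias=0.0):
    """Single-filter 1D convolution over a token-embedding sequence,
    followed by ReLU and max-over-time pooling (the core of a text CNN).

    embeddings: list of equal-length vectors, one per token.
    kernel: list of `width` vectors matching the embedding size.
    """
    width = len(kernel)
    activations = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        score = bias + sum(
            w * x
            for k_vec, e_vec in zip(kernel, window)
            for w, x in zip(k_vec, e_vec)
        )
        activations.append(max(0.0, score))  # ReLU
    return max(activations)  # max-over-time pooling
```

A real model applies many such filters of several widths, concatenates the pooled values, and feeds them to a softmax layer over the sentiment classes.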
arXiv Detail & Related papers (2020-09-07T19:57:09Z) - UPB at SemEval-2020 Task 9: Identifying Sentiment in Code-Mixed Social
Media Texts using Transformers and Multi-Task Learning [1.7196613099537055]
We describe the systems developed by our team for SemEval-2020 Task 9.
We aim to cover two well-known code-mixed languages: Hindi-English and Spanish-English.
Our approach achieves promising performance on the Hindi-English task, with an average F1-score of 0.6850.
For the Spanish-English task, we obtained an average F1-score of 0.7064 ranking our team 17th out of 29 participants.
arXiv Detail & Related papers (2020-09-06T17:19:18Z) - LIMSI_UPV at SemEval-2020 Task 9: Recurrent Convolutional Neural Network
for Code-mixed Sentiment Analysis [8.8561720398658]
This paper describes the participation of LIMSI UPV team in SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text.
The proposed approach competed in the SentiMix Hindi-English subtask, which addresses the problem of predicting the sentiment of a given Hindi-English code-mixed tweet.
We propose a Recurrent Convolutional Neural Network that combines a recurrent neural network and a convolutional network to better capture the semantics of the text.
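The recurrent-convolutional combination can be sketched in the style of Lai et al.'s RCNN: recurrences build left and right context vectors for each token, and max-over-time pooling over the concatenated representations yields a fixed-size sentence vector. A toy pure-Python version, assuming a simple decayed-sum recurrence in place of the trained RNN the paper presumably uses:

```python
def rcnn_features(embeddings, alpha=0.5):
    """Toy RCNN-style encoder: represent token i as
    [left context; embedding; right context], where the contexts come from
    simple forward/backward recurrences (decay factor `alpha` is an
    illustrative stand-in for learned RNN weights), then max-pool over time."""
    n = len(embeddings)
    dim = len(embeddings[0])
    left = [[0.0] * dim for _ in range(n)]
    right = [[0.0] * dim for _ in range(n)]
    for i in range(1, n):  # forward recurrence over preceding tokens
        left[i] = [alpha * l + e for l, e in zip(left[i - 1], embeddings[i - 1])]
    for i in range(n - 2, -1, -1):  # backward recurrence over following tokens
        right[i] = [alpha * r + e for r, e in zip(right[i + 1], embeddings[i + 1])]
    concat = [left[i] + embeddings[i] + right[i] for i in range(n)]
    # Max-over-time pooling: element-wise max across token positions.
    return [max(col) for col in zip(*concat)]
```

The pooled vector would then feed a softmax classifier over the sentiment labels.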
arXiv Detail & Related papers (2020-08-30T13:52:24Z) - Decision Tree J48 at SemEval-2020 Task 9: Sentiment Analysis for
Code-Mixed Social Media Text (Hinglish) [3.007778295477907]
This system uses Weka to provide the classifier for the classification of tweets.
Python is used to load the data from the provided files and clean it.
The system performance was assessed using the official competition evaluation metric F1-score.
arXiv Detail & Related papers (2020-08-26T06:30:43Z) - Voice@SRIB at SemEval-2020 Task 9 and 12: Stacked Ensembling method for
Sentiment and Offensiveness detection in Social Media [2.9008108937701333]
We train embeddings and ensembling methods for the SentiMix and OffensEval tasks.
We evaluate our models on macro F1-score, precision, accuracy, and recall on the datasets.
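Stacked ensembling combines the outputs of several base models with a second-stage combiner. A minimal soft-voting sketch in pure Python, using a weighted average as a stand-in for whatever trained meta-model the authors actually fit:

```python
def stack_predict(prob_outputs, weights=None):
    """Soft-voting stage of a stacked ensemble: combine the per-class
    probability vectors of several base models by a weighted average
    (a simple stand-in for a trained meta-classifier), then take argmax.

    prob_outputs: one probability vector per base model, all the same length.
    """
    n_models = len(prob_outputs)
    weights = weights or [1.0 / n_models] * n_models  # uniform by default
    n_classes = len(prob_outputs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, prob_outputs))
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: combined[c])
```

In a full stacking setup the weights (or a richer meta-model) are learned on held-out predictions of the base models, not on their training folds, to avoid leakage.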
arXiv Detail & Related papers (2020-07-20T11:54:43Z) - Enhanced Universal Dependency Parsing with Second-Order Inference and
Mixture of Training Data [48.8386313914471]
This paper presents the system used in our submission to the IWPT 2020 Shared Task.
For the low-resource Tamil corpus, we specially mixed the training data of Tamil with other languages and significantly improved the performance of Tamil.
arXiv Detail & Related papers (2020-06-02T06:42:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.