HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for
Sentiment Analysis of Code-Mixed Text
- URL: http://arxiv.org/abs/2010.04935v1
- Date: Sat, 10 Oct 2020 08:02:15 GMT
- Title: HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for
Sentiment Analysis of Code-Mixed Text
- Authors: Jun Kong, Jin Wang and Xuejie Zhang
- Abstract summary: This paper presents a system that uses a bilingual vector gating mechanism for bilingual resources to complete the task.
We achieved fifth place in Spanglish and 19th place in Hinglish.
- Score: 10.057804086733576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is fairly common to use code-mixing on a social media platform to express
opinions and emotions in multilingual societies. The purpose of this task is to
detect the sentiment of code-mixed social media text. Code-mixed text poses a
great challenge for the traditional NLP system, which currently uses
monolingual resources to deal with the problem of multilingual mixing. This
task has been solved in the past using lexicon lookup in respective sentiment
dictionaries and using a long short-term memory (LSTM) neural network for
monolingual resources. In this paper, we (my codalab username is kongjun)
present a system that uses a bilingual vector gating mechanism for bilingual
resources to complete the task. The model consists of two main parts: the
vector gating mechanism, which combines the character and word levels, and the
attention mechanism, which extracts the important emotional parts of the text.
The results show that the proposed system outperforms the baseline algorithm.
We achieved fifth place in Spanglish and 19th place in Hinglish. The code for
this paper is available at: https://github.com/JunKong5/Semveal2020-task9
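The abstract names two components, a vector gate that combines character-level and word-level representations and an attention step that picks out the emotionally important parts of the text, but it gives no equations. The sketch below shows one common way such components are written; it is an illustration under assumptions and not the authors' implementation. The gate form g = sigmoid(W [word; char] + b), the dot-product attention scoring, and the names gate_vectors, attention_pool, weights, bias and query are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_vectors(word_vec, char_vec, weights, bias):
    """Blend a word-level and a character-level vector of equal length.

    Assumed gate (not taken from the paper):
        g   = sigmoid(W [word; char] + b)
        out = g * word + (1 - g) * char   (element-wise)
    """
    concat = word_vec + char_vec  # list concatenation, length 2 * dim
    out = []
    for i, (w, c) in enumerate(zip(word_vec, char_vec)):
        z = sum(w_ij * x for w_ij, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(z)
        out.append(g * w + (1.0 - g) * c)
    return out


def attention_pool(token_vecs, query):
    """Softmax-weighted sum of token vectors, scored by dot product with query."""
    scores = [sum(q * t for q, t in zip(query, vec)) for vec in token_vecs]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(token_vecs[0])
    pooled = [sum(a * vec[d] for a, vec in zip(attn, token_vecs)) for d in range(dim)]
    return pooled, attn


# Toy usage: random parameters stand in for trained ones.
random.seed(0)
dim, n_tokens = 4, 3
W = [[random.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(dim)]
b = [0.0] * dim
tokens = []
for _ in range(n_tokens):
    word = [random.uniform(-1, 1) for _ in range(dim)]
    char = [random.uniform(-1, 1) for _ in range(dim)]
    tokens.append(gate_vectors(word, char, W, b))
sentence_vec, attn_weights = attention_pool(tokens, [1.0] * dim)
print(attn_weights)
```

With an element-wise gate, each dimension of a token can lean on the word-level vector or on the character-level vector independently, which is one plausible reason to prefer gating over plain concatenation for noisy code-mixed tokens.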
Related papers
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multi concepts for multilingual semantic matching to liberate the model from the reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios.
Previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers.
We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
arXiv Detail & Related papers (2023-05-05T16:59:26Z)
- Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer based model for word-level language identification in code-mixed Kannada English texts.
The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
- Evaluating Input Representation for Language Identification in Hindi-English Code Mixed Text [4.4904382374090765]
Code-mixed text comprises text written in more than one language.
People naturally tend to combine local language with global languages like English.
In this work, we focus on language identification in code-mixed sentences for Hindi-English mixed text.
arXiv Detail & Related papers (2020-11-23T08:08:09Z)
- gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data [7.538482310185133]
We have developed a system for SemEval 2020: Task 9 on Sentiment Analysis for Code-Mixed Social Media Text.
Our system first generates two types of embeddings for the social media text.
arXiv Detail & Related papers (2020-10-09T07:07:04Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text [1.4926515182392508]
We present the Generative Morphemes with Attention (GenMA) Model sentiment analysis system contributed to SemEval 2020 Task 9 SentiMix.
The system aims to predict the sentiments of the given English-Hindi code-mixed tweets without using word-level language tags.
arXiv Detail & Related papers (2020-07-27T23:58:54Z)
- JUNLP@SemEval-2020 Task 9:Sentiment Analysis of Hindi-English code mixed data using Grid Search Cross Validation [3.5169472410785367]
We focus on working out a plausible solution to the domain of Code-Mixed Sentiment Analysis.
This work was done as participation in the SemEval-2020 Sentimix Task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)
- IIT Gandhinagar at SemEval-2020 Task 9: Code-Mixed Sentiment Classification Using Candidate Sentence Generation and Selection [1.2301855531996841]
Code-mixing adds to the challenge of analyzing the sentiment of the text due to the non-standard writing style.
We present a candidate sentence generation and selection based approach on top of the Bi-LSTM based neural classifier.
The proposed approach shows an improvement in the system performance as compared to the Bi-LSTM based neural classifier.
arXiv Detail & Related papers (2020-06-25T14:59:47Z)
- A Multi-Perspective Architecture for Semantic Code Search [58.73778219645548]
We propose a novel multi-perspective cross-lingual neural framework for code-text matching.
Our experiments on the CoNaLa dataset show that our proposed model yields better performance than previous approaches.
arXiv Detail & Related papers (2020-05-06T04:46:11Z)