Related papers: HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for Sentiment Analysis of Code-Mixed Text

URL: http://arxiv.org/abs/2010.04935v1
Date: Sat, 10 Oct 2020 08:02:15 GMT
Title: HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for Sentiment Analysis of Code-Mixed Text
Authors: Jun Kong, Jin Wang and Xuejie Zhang
Abstract summary: This paper presents a system that uses a bilingual vector gating mechanism for bilingual resources to complete the task. We achieved fifth place in Spanglish and 19th place in Hinglish.
Score: 10.057804086733576
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: It is fairly common to use code-mixing on a social media platform to express opinions and emotions in multilingual societies. The purpose of this task is to detect the sentiment of code-mixed social media text. Code-mixed text poses a great challenge for the traditional NLP system, which currently uses monolingual resources to deal with the problem of multilingual mixing. This task has been solved in the past using lexicon lookup in respective sentiment dictionaries and using a long short-term memory (LSTM) neural network for monolingual resources. In this paper, we (my codalab username is kongjun) present a system that uses a bilingual vector gating mechanism for bilingual resources to complete the task. The model consists of two main parts: the vector gating mechanism, which combines the character and word levels, and the attention mechanism, which extracts the important emotional parts of the text. The results show that the proposed system outperforms the baseline algorithm. We achieved fifth place in Spanglish and 19th place in Hinglish. The code for this paper is available at: https://github.com/JunKong5/Semveal2020-task9
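The two parts named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the gate is taken to be an element-wise sigmoid over the concatenated word-level and character-level vectors, and the attention is taken to be dot-product scoring against a single query vector followed by a softmax. The names gate_vectors, attention_pool, weights, bias and query are hypothetical and are not taken from the authors' repository.

```python
import math
import random


def sigmoid(x):
    """Logistic function, used here as the gate activation (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_vectors(word_vec, char_vec, weights, bias):
    """Mix a word-level and a character-level vector with an element-wise gate.

    word_vec and char_vec both have length d; weights is a d x 2d nested list
    and bias has length d. For each dimension i the gate g_i in (0, 1) decides
    how much of the word-level value is kept versus the character-level value.
    """
    joint = word_vec + char_vec  # list concatenation: [word; char]
    mixed = []
    for row, b, w, c in zip(weights, bias, word_vec, char_vec):
        g = sigmoid(sum(r * x for r, x in zip(row, joint)) + b)
        mixed.append(g * w + (1.0 - g) * c)
    return mixed


def attention_pool(token_vecs, query):
    """Weight token vectors by softmax(dot(query, token)) and sum them.

    Returns the pooled vector and the attention weights, which sum to 1.
    """
    scores = [sum(q * h for q, h in zip(query, vec)) for vec in token_vecs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(token_vecs[0])
    pooled = [sum(a * vec[i] for a, vec in zip(alphas, token_vecs))
              for i in range(dim)]
    return pooled, alphas


# Toy usage with random parameters (illustrative values only, not trained).
random.seed(0)
d = 4
weights = [[random.uniform(-1, 1) for _ in range(2 * d)] for _ in range(d)]
bias = [0.0] * d
tokens = []
for _ in range(3):  # three tokens of a hypothetical code-mixed sentence
    word_vec = [random.uniform(-1, 1) for _ in range(d)]
    char_vec = [random.uniform(-1, 1) for _ in range(d)]
    tokens.append(gate_vectors(word_vec, char_vec, weights, bias))
query = [random.uniform(-1, 1) for _ in range(d)]
sentence_vec, alphas = attention_pool(tokens, query)
print(sentence_vec, alphas)
```

In a trained model the gate parameters and the query would be learned, and the pooled vector would feed a sentiment classifier; the sketch only shows how the two components could fit together.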

Related papers

A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multi concepts for multilingual semantic matching to liberate the model from the reliance on NER models. We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition [94.90258603217008]
The MultiCoNER II shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios. Previous top systems in MultiCoNER I incorporate either knowledge bases or gazetteers. We propose a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER.
arXiv Detail & Related papers (2023-05-05T16:59:26Z)
Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts [55.41644538483948]
We propose the use of a Transformer based model for word-level language identification in code-mixed Kannada English texts. The proposed model on the CoLI-Kenglish dataset achieves a weighted F1-score of 0.84 and a macro F1-score of 0.61.
arXiv Detail & Related papers (2022-11-26T02:39:19Z)
Evaluating Input Representation for Language Identification in Hindi-English Code Mixed Text [4.4904382374090765]
Code-mixed text comprises text written in more than one language. People naturally tend to combine local language with global languages like English. In this work, we focus on language identification in code-mixed sentences for Hindi-English mixed text.
arXiv Detail & Related papers (2020-11-23T08:08:09Z)
VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages. It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language. The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data [7.538482310185133]
We have developed a system for SemEval 2020: Task 9 on Sentiment Analysis for Code-Mixed Social Media Text. Our system first generates two types of embeddings for the social media text.
arXiv Detail & Related papers (2020-10-09T07:07:04Z)
FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning. During inference, the model makes predictions based on the text input in the target language and its translation in the source language. To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message. We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text [1.4926515182392508]
We present the Generative Morphemes with Attention (GenMA) Model sentiment analysis system contributed to SemEval 2020 Task 9 SentiMix. The system aims to predict the sentiments of the given English-Hindi code-mixed tweets without using word-level language tags.
arXiv Detail & Related papers (2020-07-27T23:58:54Z)
JUNLP@SemEval-2020 Task 9: Sentiment Analysis of Hindi-English code mixed data using Grid Search Cross Validation [3.5169472410785367]
We focus on working out a plausible solution to the domain of Code-Mixed Sentiment Analysis. This work was done as participation in the SemEval-2020 Sentimix Task.
arXiv Detail & Related papers (2020-07-24T15:06:48Z)
IIT Gandhinagar at SemEval-2020 Task 9: Code-Mixed Sentiment Classification Using Candidate Sentence Generation and Selection [1.2301855531996841]
Code-mixing adds to the challenge of analyzing the sentiment of the text due to the non-standard writing style. We present a candidate sentence generation and selection based approach on top of the Bi-LSTM based neural classifier. The proposed approach shows an improvement in the system performance as compared to the Bi-LSTM based neural classifier.
arXiv Detail & Related papers (2020-06-25T14:59:47Z)
A Multi-Perspective Architecture for Semantic Code Search [58.73778219645548]
We propose a novel multi-perspective cross-lingual neural framework for code-text matching. Our experiments on the CoNaLa dataset show that our proposed model yields better performance than previous approaches.
arXiv Detail & Related papers (2020-05-06T04:46:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.