gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data
- URL: http://arxiv.org/abs/2010.04395v1
- Date: Fri, 9 Oct 2020 07:07:04 GMT
- Title: gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data
- Authors: Sunil Gundapu, Radhika Mamidi
- Abstract summary: We have developed a system for SemEval 2020: Task 9 on Sentiment Analysis for Code-Mixed Social Media Text.
Our system first generates two types of embeddings for the social media text.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The phenomenon of mixing the vocabulary and syntax of multiple languages
within the same utterance is called Code-Mixing. This is more evident in
multilingual societies. In this paper, we have developed a system for SemEval
2020: Task 9 on Sentiment Analysis for Code-Mixed Social Media Text. Our system
first generates two types of embeddings for the social media text:
character-level embeddings, which encode character-level information and handle
out-of-vocabulary entries, and FastText word embeddings, which capture
morphology and semantics. These two embeddings are passed to an LSTM network,
and the resulting system outperformed the baseline model.
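The dual-embedding design in the abstract can be sketched minimally as follows. This is an illustrative toy, not the authors' implementation: the vocabularies, dimensions, and random matrices are hypothetical stand-ins (real FastText vectors would be loaded from pretrained files), and mean-pooled character vectors stand in for whatever character encoder the system actually uses. It shows the key idea: an out-of-vocabulary word falls back to an unknown-word vector but still receives a character-level representation before entering the LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies; the real ones come from the SentiMix data.
word_vocab = {"<unk>": 0, "movie": 1, "bahut": 2, "accha": 3}
char_vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz<>")}

WORD_DIM, CHAR_DIM, HIDDEN = 8, 4, 6

# Random stand-ins for pretrained FastText vectors and learned char embeddings.
word_emb = rng.normal(size=(len(word_vocab), WORD_DIM))
char_emb = rng.normal(size=(len(char_vocab), CHAR_DIM))

def embed_token(token):
    """Concatenate a FastText-style word vector with a mean-pooled
    character-level vector. OOV words fall back to <unk> for the word
    part but keep a char representation (the OOV motivation above)."""
    w = word_emb[word_vocab.get(token, word_vocab["<unk>"])]
    chars = [char_emb[char_vocab[c]] for c in token if c in char_vocab]
    c = np.mean(chars, axis=0) if chars else np.zeros(CHAR_DIM)
    return np.concatenate([w, c])          # shape: (WORD_DIM + CHAR_DIM,)

# One hand-rolled LSTM recurrence over the concatenated embeddings.
IN = WORD_DIM + CHAR_DIM
W = rng.normal(size=(4 * HIDDEN, IN + HIDDEN)) * 0.1
b = np.zeros(4 * HIDDEN)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def lstm_forward(tokens):
    h, cell = np.zeros(HIDDEN), np.zeros(HIDDEN)
    for tok in tokens:
        z = W @ np.concatenate([embed_token(tok), h]) + b
        i, f, g, o = np.split(z, 4)        # input, forget, candidate, output
        cell = sigmoid(f) * cell + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(cell)
    return h                               # final state -> sentiment classifier

h = lstm_forward(["movie", "bahut", "accha", "tha"])  # "tha" is OOV here
print(h.shape)                                        # (6,)
```

In a full system the final hidden state would feed a softmax layer over the three SentiMix sentiment classes (positive, negative, neutral).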
Related papers
- Tomato, Tomahto, Tomate: Measuring the Role of Shared Semantics among Subwords in Multilingual Language Models [88.07940818022468]
We take an initial step toward measuring the role of shared semantics among subwords in encoder-only multilingual language models (mLMs).
We form "semantic tokens" by merging semantically similar subwords and their embeddings.
Inspections of the grouped subwords show that they exhibit a wide range of semantic similarities.
arXiv Detail & Related papers (2024-11-07T08:38:32Z)
- Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining [56.26550923909137]
Gloss-Free Sign Language Translation (SLT) is a challenging task due to its cross-domain nature.
We propose a novel Gloss-Free SLT based on Visual-Language Pretraining (GFSLT-)
Our approach involves two stages: (i) integrating Contrastive Language-Image Pre-training with masked self-supervised learning to create pre-tasks that bridge the semantic gap between visual and textual representations and restore masked sentences, and (ii) constructing an end-to-end architecture with an encoder-decoder-like structure that inherits the parameters of the pre-trained Visual and Text Decoder from
arXiv Detail & Related papers (2023-07-27T10:59:18Z)
- Grammar-Based Grounded Lexicon Learning [68.59500589319023]
G2L2 is a lexicalist approach toward learning a compositional and grounded meaning representation of language.
At the core of G2L2 is a collection of lexicon entries, which map each word to a syntactic type and a neuro-symbolic semantic program.
G2L2 can generalize from small amounts of data to novel compositions of words.
arXiv Detail & Related papers (2022-02-17T18:19:53Z)
- HPCC-YNU at SemEval-2020 Task 9: A Bilingual Vector Gating Mechanism for Sentiment Analysis of Code-Mixed Text [10.057804086733576]
This paper presents a system that uses a bilingual vector gating mechanism to combine bilingual resources for the task.
We achieved fifth place in Spanglish and 19th place in Hinglish.
arXiv Detail & Related papers (2020-10-10T08:02:15Z)
- SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding [61.02342238771685]
Spoken language understanding requires a model to analyze input acoustic signal to understand its linguistic content and make predictions.
Various pre-training methods have been proposed to learn rich representations from large-scale unannotated speech and text.
We propose a novel semi-supervised learning framework, SPLAT, to jointly pre-train the speech and language modules.
arXiv Detail & Related papers (2020-10-05T19:29:49Z)
- NLP-CIC at SemEval-2020 Task 9: Analysing sentiment in code-switching language using a simple deep-learning classifier [63.137661897716555]
Code-switching is a phenomenon in which two or more languages are used in the same message.
We use a standard convolutional neural network model to predict the sentiment of tweets in a blend of Spanish and English languages.
arXiv Detail & Related papers (2020-09-07T19:57:09Z)
- C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering [0.9646922337783134]
This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix.
We obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
arXiv Detail & Related papers (2020-08-09T00:46:26Z)
- ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text [1.4926515182392508]
We present the Generative Morphemes with Attention (GenMA) sentiment analysis system, our contribution to SemEval 2020 Task 9: SentiMix.
The system aims to predict the sentiments of the given English-Hindi code-mixed tweets without using word-level language tags.
arXiv Detail & Related papers (2020-07-27T23:58:54Z)
- IUST at SemEval-2020 Task 9: Sentiment Analysis for Code-Mixed Social Media Text using Deep Neural Networks and Linear Baselines [6.866104126509981]
We develop a system to predict the sentiment of a given code-mixed tweet.
Our best performing method obtains an F1 score of 0.751 for the Spanish-English sub-task and 0.706 over the Hindi-English sub-task.
arXiv Detail & Related papers (2020-07-24T18:48:37Z)
- IIT Gandhinagar at SemEval-2020 Task 9: Code-Mixed Sentiment Classification Using Candidate Sentence Generation and Selection [1.2301855531996841]
Code-mixing adds to the challenge of analyzing the sentiment of the text due to the non-standard writing style.
We present a candidate-sentence generation and selection approach built on top of a Bi-LSTM-based neural classifier.
The proposed approach improves system performance compared to the Bi-LSTM classifier alone.
arXiv Detail & Related papers (2020-06-25T14:59:47Z)
- Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)