Related papers: emojiSpace: Spatial Representation of Emojis

URL: http://arxiv.org/abs/2209.09871v1
Date: Mon, 12 Sep 2022 13:57:31 GMT
Title: emojiSpace: Spatial Representation of Emojis
Authors: Moeen Mostafavi, Mahsa Pahlavikhah Varnosfaderani, Fateme Nikseresht, Seyed Ahmad Mansouri
Abstract summary: In this study, we create emojiSpace, which is a combined word-emoji embedding using the word2vec model from the Gensim library in Python. We trained emojiSpace on a corpus of more than 4 billion tweets and evaluated it by implementing sentiment analysis on a Twitter dataset containing more than 67 million tweets.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the absence of nonverbal cues during messaging communication, users express part of their emotions using emojis. Thus, having emojis in the vocabulary of text messaging language models can significantly improve many natural language processing (NLP) applications such as online communication analysis. On the other hand, word embedding models are usually trained on a very large corpus of text, such as the Wikipedia or Google News datasets, that includes very few samples with emojis. In this study, we create emojiSpace, which is a combined word-emoji embedding using the word2vec model from the Gensim library in Python. We trained emojiSpace on a corpus of more than 4 billion tweets and evaluated it by implementing sentiment analysis on a Twitter dataset containing more than 67 million tweets as an extrinsic task. For this task, we compared the performance of two classifiers: random forest (RF) and linear support vector machine (SVM). For evaluation, we compared emojiSpace's performance with that of two other pre-trained embeddings and demonstrated that emojiSpace outperforms both.
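
Illustration (not from the paper): a minimal sketch of how a combined word-emoji embedding can feed a sentiment classifier. It assumes that emojis are looked up in the same table as words and that a tweet is represented by the mean of its token vectors. The vectors in TOY_EMBEDDING are invented for the example, the helper names are hypothetical, and a nearest-centroid rule stands in for the RF and linear SVM classifiers named in the abstract.

```python
# Toy sketch: tweet features from a shared word-emoji embedding table.
# All vectors below are invented for illustration; they are NOT emojiSpace vectors.
from statistics import fmean

TOY_EMBEDDING = {
    "love": [0.9, 0.1],
    "great": [0.8, 0.2],
    "hate": [-0.9, 0.1],
    "awful": [-0.8, 0.3],
    "😍": [1.0, 0.0],
    "😡": [-1.0, 0.0],
}


def tweet_vector(tokens, embedding):
    """Average the vectors of known tokens; emojis are looked up like words."""
    known = [embedding[t] for t in tokens if t in embedding]
    if not known:
        # No known token: fall back to a zero vector of the right size.
        return [0.0] * len(next(iter(embedding.values())))
    return [fmean(dim) for dim in zip(*known)]


def nearest_centroid(vec, centroids):
    """Stand-in classifier (not RF or SVM): pick the label with the closest centroid."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(vec, centroids[label]))


# Hypothetical labelled tweets, already tokenized.
train = [
    (["love", "this", "😍"], "positive"),
    (["great", "day"], "positive"),
    (["hate", "this", "😡"], "negative"),
    (["awful", "service"], "negative"),
]

# Build one centroid per sentiment label from the averaged tweet vectors.
by_label = {}
for tokens, label in train:
    by_label.setdefault(label, []).append(tweet_vector(tokens, TOY_EMBEDDING))
centroids = {
    label: [fmean(dim) for dim in zip(*vecs)] for label, vecs in by_label.items()
}

# The only known token in this tweet is an emoji, so the emoji alone drives the result.
prediction = nearest_centroid(tweet_vector(["so", "😡"], TOY_EMBEDDING), centroids)
print(prediction)
```

In a word-only embedding the last tweet would map to the zero vector, because neither token would be in the vocabulary. That gap is what a combined word-emoji vocabulary is meant to close.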

Related papers

The Prosody of Emojis [73.70220975424597]
This study examines how emojis influence prosodic realisation in speech and how listeners interpret prosodic cues to recover emoji meanings. Unlike previous work, we directly link prosody and emoji by analysing actual human speech data, collected through structured but open-ended production and perception tasks. Results show that speakers adapt their prosody based on emoji cues, listeners can often identify the intended emoji from prosodic variation alone, and greater semantic differences between emojis correspond to increased prosodic divergence.
arXiv Detail & Related papers (2025-08-01T11:24:12Z)
Semantics Preserving Emoji Recommendation with Large Language Models [47.94761630160614]
Existing emoji recommendation methods are primarily evaluated based on their ability to match the exact emoji a user chooses in the original text. We propose a new semantics preserving evaluation framework for emoji recommendation, which measures a model's ability to recommend emojis that maintain the semantic consistency with the user's text.
arXiv Detail & Related papers (2024-09-16T22:27:46Z)
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text [59.57676466961787]
We propose a novel dynamic vector quantization (DVA-VAE) model that can adjust the encoding length based on the information density in sign language. Experiments conducted on the PHOENIX14T dataset demonstrate the effectiveness of our proposed method. We propose a new large German sign language dataset, PHOENIX-News, which contains 486 hours of sign language videos, audio, and transcription texts.
arXiv Detail & Related papers (2024-06-11T10:06:53Z)
EmojiLM: Modeling the New Emoji Language [44.23076273155259]
We develop a text-emoji parallel corpus, Text2Emoji, from a large language model. Based on the parallel corpus, we distill a sequence-to-sequence model, EmojiLM, which is specialized in the text-emoji bidirectional translation. Our proposed model outperforms strong baselines and the parallel corpus benefits emoji-related downstream tasks.
arXiv Detail & Related papers (2023-11-03T07:06:51Z)
Emoji Prediction in Tweets using BERT [0.0]
We propose a transformer-based approach for emoji prediction using BERT, a widely-used pre-trained language model. We fine-tuned BERT on a large corpus of text (tweets) containing both text and emojis to predict the most appropriate emoji for a given text. Our experimental results demonstrate that our approach outperforms several state-of-the-art models in predicting emojis with an accuracy of over 75 percent.
arXiv Detail & Related papers (2023-07-05T06:38:52Z)
Emojich -- zero-shot emoji generation using Russian language: a technical report [52.77024349608834]
"Emojich" is a text-to-image neural network that generates emojis using captions in Russian language as a condition. We aim to keep the generalization ability of a pretrained big model ruDALL-E Malevich (XL) 1.3B parameters at the fine-tuning stage.
arXiv Detail & Related papers (2021-12-04T23:37:32Z)
Emoji-aware Co-attention Network with EmoGraph2vec Model for Sentiment Analysis [9.447106020795292]
We propose a method to learn emoji representations called EmoGraph2vec and design an emoji-aware co-attention network. Our model designs a co-attention mechanism to incorporate the text and emojis, and integrates a squeeze-and-excitation block into a convolutional neural network. Experimental results show that the proposed model can outperform several baselines for sentiment analysis on benchmark datasets.
arXiv Detail & Related papers (2021-10-27T08:01:10Z)
Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information. Their inherent characteristics, such as an informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks. This study assesses how well existing language models distinguish the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z)
Semantic Journeys: Quantifying Change in Emoji Meaning from 2012-2018 [66.28665205489845]
We offer the first longitudinal study of how emoji semantics changes over time, applying techniques from computational linguistics to six years of Twitter data. We identify five patterns in emoji semantic development and find evidence that the less abstract an emoji is, the more likely it is to undergo semantic change. To aid future work on emoji and semantics, we make our data publicly available along with a web-based interface that anyone can use to explore semantic change in emoji.
arXiv Detail & Related papers (2021-05-03T13:35:10Z)
A `Sourceful' Twist: Emoji Prediction Based on Sentiment, Hashtags and Application Source [1.6818451361240172]
We showcase the importance of using Twitter features to help the model understand the sentiment involved and hence to predict the most suitable emoji for the text. Our data analysis and neural network model performance evaluations show that using hashtags and application sources as features encodes different information and is effective in emoji prediction.
arXiv Detail & Related papers (2021-03-14T03:05:04Z)
Emoji Prediction: Extensions and Benchmarking [30.642840676899734]
The emoji prediction task aims at predicting the proper set of emojis associated with a piece of text. We extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification. We propose novel models for multi-class and multi-label emoji prediction based on Transformer networks.
arXiv Detail & Related papers (2020-07-14T22:41:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.