FeelsGoodMan: Inferring Semantics of Twitch Neologisms
- URL: http://arxiv.org/abs/2108.08411v1
- Date: Wed, 18 Aug 2021 23:46:46 GMT
- Authors: Pavel Dolin, Luc d'Hauthuille, Andrea Vattani
- Abstract summary: There are 8.06 million emotes, over 400k of which were used in the week studied.
There is virtually no information on the meaning or sentiment of emotes, and with a constant influx of new emotes and drift in their frequencies, it becomes impossible to maintain an updated manually-labeled dataset.
We introduce a simple but powerful unsupervised framework based on word embeddings and k-NN to enrich existing models with out-of-vocabulary knowledge.
This framework allows us to auto-generate a pseudo-dictionary of emotes, and we show that we can nearly match the supervised benchmark above even when injecting such emote knowledge into sentiment classifiers trained on extraneous datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Twitch chats pose a unique problem in natural language understanding due to a
large presence of neologisms, specifically emotes. There are a total of 8.06
million emotes, over 400k of which were used in the week studied. There is
virtually no information on the meaning or sentiment of emotes, and with a
constant influx of new emotes and drift in their frequencies, it becomes
impossible to maintain an updated manually-labeled dataset. Our paper makes a
twofold contribution. First, we establish a new baseline for sentiment analysis
on Twitch data, outperforming the previous supervised benchmark by 7.9
percentage points. Second, we introduce a simple but powerful unsupervised framework based on
word embeddings and k-NN to enrich existing models with out-of-vocabulary
knowledge. This framework allows us to auto-generate a pseudo-dictionary of
emotes and we show that we can nearly match the supervised benchmark above even
when injecting such emote knowledge into sentiment classifiers trained on
extraneous datasets such as movie reviews or Twitter.
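The abstract's core mechanism, building a pseudo-dictionary by finding an out-of-vocabulary emote's nearest labeled neighbors in embedding space and voting on sentiment, can be sketched as below. This is a minimal illustration: the toy vectors, the seed lexicon, and the emote entry are all hypothetical placeholders, whereas the paper's actual embeddings are learned from Twitch chat logs.

```python
import numpy as np

# Toy embedding table standing in for vectors trained on chat logs
# (hypothetical values; real embeddings would come from e.g. word2vec).
embeddings = {
    "great":    np.array([0.9, 0.1, 0.0]),
    "love":     np.array([0.8, 0.2, 0.1]),
    "awful":    np.array([-0.9, 0.1, 0.0]),
    "hate":     np.array([-0.8, 0.2, 0.1]),
    "PogChamp": np.array([0.85, 0.15, 0.05]),  # unlabeled emote
}
# Seed sentiment lexicon: +1 positive, -1 negative (hypothetical).
labeled = {"great": 1, "love": 1, "awful": -1, "hate": -1}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def knn_sentiment(emote, k=3):
    """Assign a pseudo-sentiment to an OOV emote by majority vote
    over its k nearest labeled neighbors in embedding space."""
    v = embeddings[emote]
    neighbors = sorted(
        labeled, key=lambda w: cosine(embeddings[w], v), reverse=True
    )[:k]
    score = sum(labeled[w] for w in neighbors)
    return 1 if score > 0 else -1 if score < 0 else 0

# Auto-generate a pseudo-dictionary entry for the unlabeled emote.
pseudo_dictionary = {"PogChamp": knn_sentiment("PogChamp")}
print(pseudo_dictionary)  # {'PogChamp': 1}
```

A pseudo-dictionary built this way can then be injected into any sentiment classifier's vocabulary, which is how the paper approaches transfer from extraneous training corpora.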
Related papers
- Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification (2025-04-23)
  Most datasets for sentiment analysis lack the context in which an opinion was expressed, often crucial for emotion understanding, and are mainly limited to a few emotion categories. We design an LLM-based data synthesis pipeline and leverage a large model, Mistral-7b, to generate training examples for more accessible, lightweight BERT-type encoder models. We show that Emo Pillars models are highly adaptive to new domains when tuned to specific tasks such as GoEmotions, ISEAR, IEMOCAP, and EmoContext, reaching SOTA performance on the first three.
- EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting (2025-04-17)
  EmoVoice is a novel emotion-controllable TTS model that exploits large language models (LLMs) to enable fine-grained, freestyle natural-language emotion control. EmoVoice-DB is a high-quality 40-hour English emotion dataset featuring expressive speech and fine-grained emotion labels with natural language descriptions.
- Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech (2024-07-17)
  EmoCtrl-TTS is an emotion-controllable zero-shot TTS system that can generate highly emotional speech with NVs for any speaker. To achieve high-quality emotional speech generation, EmoCtrl-TTS is trained on more than 27,000 hours of expressive data curated via pseudo-labeling.
- Enhancing Emotion Prediction in News Headlines: Insights from ChatGPT and Seq2Seq Models for Free-Text Generation (2024-07-14)
  We use people's free-text explanations of how they feel after reading a news headline. For emotion classification, the free-text explanations correlate strongly with the dominant emotion elicited by the headlines. Under McNemar's significance test, methods that incorporate GPT-generated free-text explanations show significant improvement.
- ChatAnything: Facetime Chat with LLM-Enhanced Personas (2023-11-12)
  We propose the mixture of voices (MoV) and the mixture of diffusers (MoD) for diverse voice and appearance generation. For MoV, we utilize text-to-speech (TTS) algorithms with a variety of pre-defined tones. For MoD, we combine recent popular text-to-image generation techniques and talking-head algorithms to streamline the process of generating talking objects.
- Emotional Speech-Driven Animation with Content-Emotion Disentanglement (2023-06-15)
  We propose EMOTE, which generates 3D talking-head avatars that maintain lip-sync from speech while enabling explicit control over the expression of emotion. EMOTE produces speech-driven facial animations with better lip-sync than state-of-the-art methods trained on the same data.
- Unsupervised Extractive Summarization of Emotion Triggers (2023-06-02)
  We develop new unsupervised learning models that jointly detect emotions and summarize their triggers. Our best approach, Emotion-Aware Pagerank, incorporates emotion information from external sources combined with a language understanding module.
- Controlling Personality Style in Dialogue with Zero-Shot Prompt-Based Learning (2023-02-08)
  We explore the performance of prompt-based learning for simultaneously controlling the personality and the semantic accuracy of an NLG system for task-oriented dialogue. We generate semantically and stylistically controlled text for 5 different Big-5 personality types, and also test whether NLG personality demonstrations can be used with meaning representations for the video-game domain.
- MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild (2022-08-01)
  We propose MAFW, a large-scale compound affective database with 10,045 in-the-wild video-audio clips. Each clip is annotated with a compound emotional category and a couple of sentences describing the subjects' affective behaviors in the clip. For the compound emotion annotation, each clip is categorized into one or more of 11 widely used emotions: anger, disgust, fear, happiness, neutral, sadness, surprise, contempt, anxiety, helplessness, and disappointment.
- Learning Unseen Emotions from Gestures via Semantically-Conditioned Zero-Shot Perception with Adversarial Autoencoders (2020-09-18)
  We introduce an adversarial, autoencoder-based representation learning method that correlates 3D motion-captured gesture sequences with vectorized representations of natural-language perceived-emotion terms. We train our method using a combination of gestures annotated with known emotion terms and gestures without any emotion annotations.
- Moment-to-moment Engagement Prediction through the Eyes of the Observer: PUBG Streaming on Twitch (2020-08-17)
  We build prediction models of viewer engagement based on data collected from the popular battle-royale game PlayerUnknown's Battlegrounds. In particular, we collect viewers' chat logs and in-game telemetry data from several hundred matches of five popular streamers. Our key findings show that engagement models trained solely on 40 gameplay features can reach accuracies of up to 80% on average and 84% at best.
- BAKSA at SemEval-2020 Task 9: Bolstering CNN with Self-Attention for Sentiment Analysis of Code-Mixed Text (2020-07-21)
  We present an ensemble architecture of a convolutional neural network (CNN) and a self-attention-based LSTM for sentiment analysis of code-mixed tweets. We achieved F1 scores of 0.707 and 0.725 on Hindi-English (Hinglish) and Spanish-English (Spanglish) datasets, respectively.
- Detecting Perceived Emotions in Hurricane Disasters (2020-04-29)
  We introduce HurricaneEmo, an emotion dataset of 15,000 English tweets spanning three hurricanes: Harvey, Irma, and Maria. We present a comprehensive study of fine-grained emotions and propose classification tasks to discriminate between coarse-grained emotion groups.
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.