Emotion Detection with Transformers: A Comparative Study
- URL: http://arxiv.org/abs/2403.15454v4
- Date: Sat, 27 Jul 2024 17:41:20 GMT
- Title: Emotion Detection with Transformers: A Comparative Study
- Authors: Mahdi Rezapour
- Abstract summary: We train and evaluate several pre-trained transformer models on the Emotion dataset using different variants of transformers.
Our analysis reveals that commonly applied techniques like removing punctuation and stop words can hinder model performance.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this study, we explore the application of transformer-based models for emotion classification on text data. We train and evaluate several pre-trained transformer models on the Emotion dataset using different variants of transformers. The paper also analyzes some factors that influence the performance of the model, such as the fine-tuning of the transformer layer, the trainability of the layer, and the preprocessing of the text data. Our analysis reveals that commonly applied techniques like removing punctuation and stop words can hinder model performance. This might be because the strength of transformers lies in understanding contextual relationships within text. Elements like punctuation and stop words can still convey sentiment or emphasis, and removing them might disrupt this context.
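The abstract does not specify the exact models or training setup, but two of the factors it names, the trainability of the transformer layers and the decision to leave punctuation and stop words in place, are easy to illustrate. The following is a minimal sketch under assumed choices: the Hugging Face `transformers` and `datasets` libraries, the `dair-ai/emotion` dataset identifier, and `bert-base-uncased` as an example checkpoint; none of these are confirmed by the paper.

```python
# Hypothetical fine-tuning sketch (not the paper's exact setup).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("dair-ai/emotion")  # six emotion labels
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Text is passed to the tokenizer unmodified: punctuation and stop words
    # are kept, since the study found that removing them can hurt performance.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=6)

FREEZE_ENCODER = False  # set True to train only the classification head
if FREEZE_ENCODER:
    for param in model.base_model.parameters():
        param.requires_grad = False

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="emotion-ft", num_train_epochs=2,
                           per_device_train_batch_size=32),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
print(trainer.evaluate())
```

Toggling `FREEZE_ENCODER` contrasts a frozen encoder (only the classification head is trained) with full fine-tuning, which is one way to probe the layer-trainability effect the abstract mentions.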
Related papers
- Differential Transformer [99.5117269150629]
Transformer tends to overallocate attention to irrelevant context.
We introduce Diff Transformer, which amplifies attention to relevant context while canceling noise.
It offers notable advantages in practical applications, such as long-context modeling, key information retrieval, hallucination mitigation, in-context learning, and reduction of activation outliers.
arXiv Detail & Related papers (2024-10-07T17:57:38Z) - iTransformer: Inverted Transformers Are Effective for Time Series Forecasting [62.40166958002558]
We propose iTransformer, which simply applies the attention and feed-forward network on the inverted dimensions.
The iTransformer model achieves state-of-the-art on challenging real-world datasets.
arXiv Detail & Related papers (2023-10-10T13:44:09Z) - A Length-Extrapolatable Transformer [98.54835576985664]
We focus on length extrapolation, i.e., training on short texts while evaluating longer sequences.
We introduce a relative position embedding to explicitly maximize attention resolution.
We evaluate different Transformer variants with language modeling.
arXiv Detail & Related papers (2022-12-20T18:56:20Z) - Transformers learn in-context by gradient descent [58.24152335931036]
Training Transformers on auto-regressive objectives is closely related to gradient-based meta-learning formulations.
We show how trained Transformers become mesa-optimizers, i.e., they learn models by gradient descent in their forward pass.
arXiv Detail & Related papers (2022-12-15T09:21:21Z) - Characterizing Intrinsic Compositionality in Transformers with Tree Projections [72.45375959893218]
Neural models like transformers can route information arbitrarily between different parts of their input.
We show that transformers for three different tasks become more treelike over the course of training.
These trees are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
arXiv Detail & Related papers (2022-11-02T17:10:07Z) - Improving Attention-Based Interpretability of Text Classification Transformers [7.027858121801477]
We study the effectiveness of attention-based interpretability techniques for transformers in text classification.
We show that, with proper setup, attention may be used in such tasks with results comparable to state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-22T09:19:22Z) - Incorporating Residual and Normalization Layers into Analysis of Masked Language Models [29.828669678974983]
We extend the scope of the analysis of Transformers from solely the attention patterns to the whole attention block.
Our analysis of Transformer-based masked language models shows that the token-to-token interaction performed via attention has less impact on the intermediate representations than previously assumed.
arXiv Detail & Related papers (2021-09-15T08:32:20Z) - Transformer-F: A Transformer network with effective methods for learning universal sentence representation [8.225067988604351]
The Transformer model is widely used in natural language processing for sentence representation.
In this paper, two approaches are introduced to improve the performance of Transformers.
arXiv Detail & Related papers (2021-07-02T03:20:11Z) - Assessing Phrasal Representation and Composition in Transformers [13.460125148455143]
Deep transformer models have pushed performance on NLP tasks to new limits.
We present systematic analysis of phrasal representations in state-of-the-art pre-trained transformers.
We find that phrase representation in these models relies heavily on word content, with little evidence of nuanced composition.
arXiv Detail & Related papers (2020-10-08T04:59:39Z) - Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z) - Hierarchical Transformer Network for Utterance-level Emotion Recognition [0.0]
We address some challenges in utterance-level emotion recognition (ULER).
Unlike the traditional text classification problem, this task is supported by a limited number of datasets.
We use a pretrained language model, bidirectional encoder representations from transformers (BERT), as the lower-level transformer.
In addition, we add speaker embeddings to the model for the first time, which enables our model to capture the interaction between speakers (a rough sketch of this idea follows the list below).
arXiv Detail & Related papers (2020-02-18T13:44:49Z)
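The last entry above describes a hierarchical architecture: a pre-trained BERT encodes each utterance, a higher-level transformer models the dialogue, and speaker embeddings are added so the model can capture interactions between speakers. The sketch below is a rough, hypothetical reconstruction of that idea in PyTorch with the Hugging Face `transformers` library; the module names, layer counts, and the numbers of labels and speakers are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a hierarchical utterance-level emotion model.
import torch.nn as nn
from transformers import AutoModel

class HierarchicalEmotionModel(nn.Module):
    def __init__(self, num_labels=7, num_speakers=2, hidden=768):
        super().__init__()
        # Lower level: pre-trained BERT encodes each utterance independently.
        self.utterance_encoder = AutoModel.from_pretrained("bert-base-uncased")
        self.speaker_embedding = nn.Embedding(num_speakers, hidden)
        # Upper level: a small transformer over the sequence of utterances.
        upper_layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8,
                                                 batch_first=True)
        self.dialogue_encoder = nn.TransformerEncoder(upper_layer, num_layers=2)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask, speaker_ids):
        # input_ids, attention_mask: (batch, num_utterances, seq_len)
        # speaker_ids: (batch, num_utterances)
        b, u, s = input_ids.shape
        out = self.utterance_encoder(
            input_ids.view(b * u, s),
            attention_mask=attention_mask.view(b * u, s))
        utt = out.last_hidden_state[:, 0].view(b, u, -1)  # [CLS] per utterance
        utt = utt + self.speaker_embedding(speaker_ids)   # inject speaker identity
        ctx = self.dialogue_encoder(utt)                  # dialogue-level context
        return self.classifier(ctx)                       # per-utterance logits
```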