Emotion Classification In-Context in Spanish
- URL: http://arxiv.org/abs/2505.20571v1
- Date: Mon, 26 May 2025 23:09:41 GMT
- Title: Emotion Classification In-Context in Spanish
- Authors: Bipul Thapa, Gabriel Cofre,
- Abstract summary: We classify customer feedback in Spanish into three emotion categories--positive, neutral, and negative--using advanced NLP and ML techniques.<n>Traditional methods translate feedback from widely spoken languages to less common ones, resulting in a loss of semantic integrity.<n>We propose a hybrid approach that combines TF-IDF with BERT embeddings, effectively transforming Spanish text into rich numerical representations.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Classifying customer feedback into distinct emotion categories is essential for understanding sentiment and improving customer experience. In this paper, we classify customer feedback in Spanish into three emotion categories--positive, neutral, and negative--using advanced NLP and ML techniques. Traditional methods translate feedback from widely spoken languages to less common ones, resulting in a loss of semantic integrity and contextual nuances inherent to the original language. To address this limitation, we propose a hybrid approach that combines TF-IDF with BERT embeddings, effectively transforming Spanish text into rich numerical representations that preserve the semantic depth of the original language by using a Custom Stacking Ensemble (CSE) approach. To evaluate emotion classification, we utilize a range of models, including Logistic Regression, KNN, Bagging classifier with LGBM, and AdaBoost. The CSE model combines these classifiers as base models and uses a one-vs-all Logistic Regression as the meta-model. Our experimental results demonstrate that CSE significantly outperforms the individual and BERT model, achieving a test accuracy of 93.3% on the native Spanish dataset--higher than the accuracy obtained from the translated version. These findings underscore the challenges of emotion classification in Spanish and highlight the advantages of combining vectorization techniques like TF-IDF with BERT for improved accuracy. Our results provide valuable insights for businesses seeking to leverage emotion classification to enhance customer feedback analysis and service improvements.
Related papers
- Enhancing Multilingual Sentiment Analysis with Explainability for Sinhala, English, and Code-Mixed Content [0.0]
Existing models struggle with low-resource languages like Sinhala and lack interpretability for practical use.<n>This research develops a hybrid aspect-based sentiment analysis framework that enhances multilingual capabilities with explainable outputs.
arXiv Detail & Related papers (2025-04-18T08:21:12Z) - Enhancing Multilingual ASR for Unseen Languages via Language Embedding Modeling [50.62091603179394]
Whisper, one of the most advanced ASR models, handles 99 languages effectively.<n>However, Whisper struggles with unseen languages, those not included in its pre-training.<n>We propose methods that exploit these relationships to enhance ASR performance on unseen languages.
arXiv Detail & Related papers (2024-12-21T04:05:43Z) - Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function [4.149971421068989]
This study uses stacked embeddings, meta-learning, and a hybrid loss function to enhance multi-label emotion classification for the Arabic language.
To further improve performance, a hybrid loss function is introduced, incorporating class weighting, label correlation, and contrastive learning.
Experiments validate the proposed model's performance across key metrics such as Precision, Recall, F1-Score, Jaccard Accuracy, and Hamming Loss.
arXiv Detail & Related papers (2024-10-04T23:37:21Z) - A User-Centered Evaluation of Spanish Text Simplification [6.046875672600245]
We present an evaluation of text simplification (TS) in Spanish for a production system.
We compare the most prevalent Spanish-specific readability scores with neural networks, and show that the latter are consistently better at predicting user preferences regarding TS.
We release the corpora in our evaluation to the broader community with the hopes of pushing forward the state-of-the-art in Spanish natural language processing.
arXiv Detail & Related papers (2023-08-15T03:49:59Z) - T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text
Classification [50.675552118811]
Cross-lingual text classification is typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
arXiv Detail & Related papers (2023-06-08T07:33:22Z) - REDAffectiveLM: Leveraging Affect Enriched Embedding and
Transformer-based Neural Language Model for Readers' Emotion Detection [3.6678641723285446]
We propose a novel approach for Readers' Emotion Detection from short-text documents using a deep learning model called REDAffectiveLM.
We leverage context-specific and affect enriched representations by using a transformer-based pre-trained language model in tandem with affect enriched Bi-LSTM+Attention.
arXiv Detail & Related papers (2023-01-21T19:28:25Z) - Beyond Contrastive Learning: A Variational Generative Model for
Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z) - CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual
Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z) - Linear Transformations for Cross-lingual Sentiment Analysis [5.161531917413708]
We perform zero-shot cross-lingual classification using five linear transformations combined with LSTM and CNN based classifiers.
We show that the pre-trained embeddings from the target domain are crucial to improving the cross-lingual classification results.
arXiv Detail & Related papers (2022-09-15T12:27:16Z) - A New Generation of Perspective API: Efficient Multilingual
Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z) - TextFlint: Unified Multilingual Robustness Evaluation Toolkit for
Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint)
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.