Multi-Label Sentiment Analysis on 100 Languages with Dynamic Weighting
for Label Imbalance
- URL: http://arxiv.org/abs/2008.11573v1
- Date: Wed, 26 Aug 2020 14:16:02 GMT
- Title: Multi-Label Sentiment Analysis on 100 Languages with Dynamic Weighting
for Label Imbalance
- Authors: Selim F. Yilmaz, E. Batuhan Kaynak, Aykut Koç, Hamdi
Dibeklioğlu and Suleyman S. Kozat
- Abstract summary: Cross-lingual sentiment analysis has attracted significant attention due to its applications in various areas including market research, politics and social sciences.
We introduce a sentiment analysis framework in a multi-label setting, as this setting obeys Plutchik's wheel of emotions.
We show that our method obtains the state-of-the-art performance in 7 of 9 metrics in 3 different languages.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate cross-lingual sentiment analysis, which has attracted
significant attention due to its applications in various areas including market
research, politics and social sciences. In particular, we introduce a sentiment
analysis framework in a multi-label setting, as it obeys Plutchik's wheel of
emotions. We introduce a novel dynamic weighting method that balances the
contribution from each class during training, unlike previous static weighting
methods that assign non-changing weights based on their class frequency.
Moreover, we adapt the focal loss that favors harder instances from
single-label object recognition literature to our multi-label setting.
Furthermore, we derive a method to choose optimal class-specific thresholds
that maximize the macro-F1 score in linear time complexity. Through an
extensive set of experiments, we show that our method obtains the
state-of-the-art performance in 7 of 9 metrics in 3 different languages using a
single model compared to the common baselines and the best-performing methods
in the SemEval competition. We publicly share our code for our model, which can
perform sentiment analysis in 100 languages, to facilitate further research.
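
Two of the ideas in the abstract can be illustrated with a short sketch: a focal loss applied to each label as an independent binary task, and per-class threshold selection for macro-F1. This is not the authors' released code. The function names, the `gamma` default, and the sort-then-sweep threshold search are assumptions made for illustration, and the dynamic weighting scheme is not sketched because the abstract does not specify it. Because macro-F1 is the mean of the per-class F1 scores, each class threshold can be chosen independently of the others.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    """Mean focal loss over all (sample, label) pairs.

    Each label is treated as its own binary task (an assumed adaptation).
    probs and labels are lists of rows, one row per sample.
    """
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            # p_t is the probability assigned to the true outcome.
            p_t = p if y == 1 else 1.0 - p
            # (1 - p_t)^gamma down-weights pairs that are already easy.
            total += -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))
            count += 1
    return total / count


def best_threshold(scores, targets):
    """Return (threshold, f1) maximizing F1 for one class.

    A sample is predicted positive when its score >= threshold.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(targets)
    best_t, best_f1 = float("inf"), 0.0  # inf means "predict nothing"
    tp = fp = 0
    for rank, i in enumerate(order):
        if targets[i] == 1:
            tp += 1
        else:
            fp += 1
        # Tied scores cannot be separated by a threshold, so only
        # evaluate once the whole tie group has been counted.
        if rank + 1 < len(order) and scores[order[rank + 1]] == scores[i]:
            continue
        fn = n_pos - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = scores[i], f1
    return best_t, best_f1


def per_class_thresholds(probs, labels):
    """Pick one threshold per class, each maximizing that class's F1."""
    n_classes = len(probs[0])
    return [
        best_threshold([row[c] for row in probs], [row[c] for row in labels])[0]
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    probs = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
    labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
    print(per_class_thresholds(probs, labels))
    print(binary_focal_loss(probs, labels))
```

With `gamma=0` the loss reduces to ordinary binary cross-entropy, which is a convenient sanity check. The threshold search here sorts the scores first, so it runs in O(n log n) per class followed by a single linear sweep; the linear-time procedure claimed in the abstract may differ.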
Related papers
- LC-Protonets: Multi-label Few-shot learning for world music audio tagging [65.72891334156706]
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification.
LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items.
Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music.
arXiv Detail & Related papers (2024-09-17T15:13:07Z)
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
- AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis [8.067010122141985]
We present an extremely weakly supervised multi-label Aspect Category Sentiment Analysis framework.
We only rely on a single word per class as an initial indicative information.
We propose an automatic word selection technique to choose these seed categories and sentiment words.
arXiv Detail & Related papers (2022-11-07T19:44:42Z)
- Sentiment Classification of Code-Switched Text using Pre-trained Multilingual Embeddings and Segmentation [1.290382979353427]
We propose a multi-step natural language processing algorithm for code-switched sentiment analysis.
The proposed algorithm can be expanded for sentiment analysis of multiple languages with limited human expertise.
arXiv Detail & Related papers (2022-10-29T01:52:25Z)
- PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification [64.39761523935613]
We propose a percentile-based threshold adjusting scheme to dynamically alter the score thresholds of positive and negative pseudo-labels for each class during the training.
We achieve strong performance on Pascal VOC2007 and MS-COCO datasets when compared to recent SSL methods.
arXiv Detail & Related papers (2022-08-30T01:27:48Z)
- MetaAudio: A Few-Shot Audio Classification Benchmark [2.294014185517203]
This work aims to alleviate this reliance on image-based benchmarks by offering the first comprehensive, public and fully reproducible audio based alternative.
We compare the few-shot classification performance of a variety of techniques on seven audio datasets.
Our experimentation shows gradient-based meta-learning methods such as MAML and Meta-Curvature consistently outperform both metric and baseline methods.
arXiv Detail & Related papers (2022-04-05T11:33:44Z)
- Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training [58.72619374790418]
MultiUAT dynamically adjusts the training data usage based on the model's uncertainty.
We analyze the cross-domain transfer and show the deficiency of static and similarity based methods.
arXiv Detail & Related papers (2021-09-06T08:30:33Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
- A Sample Selection Approach for Universal Domain Adaptation [94.80212602202518]
We study the problem of unsupervised domain adaption in the universal scenario.
Only some of the classes are shared between the source and target domains.
We present a scoring scheme that is effective in identifying the samples of the shared classes.
arXiv Detail & Related papers (2020-01-14T22:28:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.