Continuous sentiment scores for literary and multilingual contexts
- URL: http://arxiv.org/abs/2508.14620v1
- Date: Wed, 20 Aug 2025 11:18:13 GMT
- Title: Continuous sentiment scores for literary and multilingual contexts
- Authors: Laurits Lyngbaek, Pascale Feldkamp, Yuri Bizzoni, Kristoffer Nielbo, Kenneth Enevoldsen
- Abstract summary: We introduce a novel continuous sentiment scoring method based on concept vector projection, trained on multilingual literary data. Our approach outperforms existing tools on English and Danish texts, producing sentiment scores whose distribution closely matches human ratings.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Sentiment analysis is widely used to quantify sentiment in text, but its application to literary texts poses unique challenges due to figurative language, stylistic ambiguity, and sentiment evocation strategies. Traditional dictionary-based tools often underperform, especially for low-resource languages, and transformer models, while promising, typically output coarse categorical labels that limit fine-grained analysis. We introduce a novel continuous sentiment scoring method based on concept vector projection, trained on multilingual literary data, which more effectively captures nuanced sentiment expressions across genres, languages, and historical periods. Our approach outperforms existing tools on English and Danish texts, producing sentiment scores whose distribution closely matches human ratings, enabling more accurate analysis and sentiment arc modeling in literature.
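The abstract describes scoring sentiment as a continuous value via concept vector projection. The paper does not give implementation details here, but the general technique can be sketched as follows: build a unit "sentiment axis" from the centroids of embeddings for clearly positive and clearly negative seed sentences, then score any text by the scalar projection of its embedding onto that axis. All data below is toy and illustrative, not the authors' actual model.

```python
import numpy as np

def concept_axis(pos_embeddings, neg_embeddings):
    """Sentiment axis: unit vector pointing from the negative to the positive centroid."""
    axis = pos_embeddings.mean(axis=0) - neg_embeddings.mean(axis=0)
    return axis / np.linalg.norm(axis)

def sentiment_score(embedding, axis):
    """Continuous sentiment score: scalar projection of an embedding onto the axis."""
    return float(embedding @ axis)

# Toy 2-D "embeddings" standing in for sentence encodings of seed examples:
pos_seeds = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])
neg_seeds = -pos_seeds
axis = concept_axis(pos_seeds, neg_seeds)

print(round(sentiment_score(np.array([1.0, 1.0]), axis), 3))    # 1.414
print(round(sentiment_score(np.array([-1.0, -1.0]), axis), 3))  # -1.414
```

Because the score is a real number rather than a class label, it supports the fine-grained distributional comparisons with human ratings and the sentiment arc modeling that the abstract mentions.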
Related papers
- Ensembling Multilingual Transformers for Robust Sentiment Analysis of Tweets [0.0]
We present a transformer ensemble model and a large language model (LLM) approach that leverages sentiment analysis across languages. Sentiment was assessed for sentences using an ensemble of pre-trained sentiment analysis models: bert-base-multilingual-uncased-sentiment and XLM-R. Our experimental results indicate that sentiment analysis performance exceeded 86% using the proposed method.
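The ensembling step this entry describes can be sketched as simple probability averaging across the member classifiers; the probability values and the three-way label set below are illustrative stand-ins, not the paper's actual configuration.

```python
import numpy as np

LABELS = ["negative", "neutral", "positive"]  # hypothetical label set

def ensemble_predict(prob_rows):
    """Average per-model class-probability vectors, then take the argmax label."""
    mean_probs = np.mean(prob_rows, axis=0)
    return LABELS[int(np.argmax(mean_probs))]

# Hypothetical softmax outputs from two pretrained models for one sentence:
model_a = [0.10, 0.30, 0.60]
model_b = [0.05, 0.55, 0.40]
print(ensemble_predict([model_a, model_b]))  # positive (mean: 0.075, 0.425, 0.500)
```

Averaging probabilities (rather than hard votes) lets a confident model outweigh an uncertain one, which is one common reason ensembles of multilingual transformers are robust.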
arXiv Detail & Related papers (2025-09-28T21:34:48Z) - Multilingual Sentiment Analysis of Summarized Texts: A Cross-Language Study of Text Shortening Effects [42.90274643419224]
Summarization significantly impacts sentiment analysis across languages with diverse morphologies. This study examines extractive and abstractive summarization effects on sentiment classification in English, German, French, Spanish, Italian, Finnish, Hungarian, and Arabic.
arXiv Detail & Related papers (2025-03-31T22:16:04Z) - BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER, a collection of multi-labeled, emotion-annotated datasets in 28 different languages. We highlight the challenges related to the data collection and annotation processes. We show that the BRIGHTER datasets represent a meaningful step towards addressing the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z) - MASIVE: Open-Ended Affective State Identification in English and Spanish [10.41502827362741]
In this work, we broaden our scope to a practically unbounded set of *affective states*, which includes any terms that humans use to describe their experiences of feeling.
We collect and publish MASIVE, a dataset of Reddit posts in English and Spanish containing over 1,000 unique affective states each.
On this task, we find that smaller finetuned multilingual models outperform much larger LLMs, even on region-specific Spanish affective states.
arXiv Detail & Related papers (2024-07-16T21:43:47Z) - Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey [66.166184609616]
ChatGPT has opened up immense potential for applying large language models (LLMs) to text-centric multimodal tasks.
It is still unclear how existing LLMs can adapt better to text-centric multimodal sentiment analysis tasks.
arXiv Detail & Related papers (2024-06-12T10:36:27Z) - SenteCon: Leveraging Lexicons to Learn Human-Interpretable Language Representations [51.08119762844217]
SenteCon is a method for introducing human interpretability in deep language representations.
We show that SenteCon provides high-level interpretability at little to no cost to predictive performance on downstream tasks.
arXiv Detail & Related papers (2023-05-24T05:06:28Z) - Testing the Ability of Language Models to Interpret Figurative Language [69.59943454934799]
Figurative and metaphorical language are commonplace in discourse.
It remains an open question to what extent modern language models can interpret nonliteral phrases.
We introduce Fig-QA, a Winograd-style nonliteral language understanding task.
arXiv Detail & Related papers (2022-04-26T23:42:22Z) - Sentiment analysis in tweets: an assessment study from classical to modern text representation models [59.107260266206445]
Short texts published on Twitter have earned significant attention as a rich source of information.
Their inherent characteristics, such as their informal and noisy linguistic style, remain challenging for many natural language processing (NLP) tasks. This study presents an assessment of existing language models in distinguishing the sentiment expressed in tweets, using a rich collection of 22 datasets.
arXiv Detail & Related papers (2021-05-29T21:05:28Z) - AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z) - Sentiment Analysis with Contextual Embeddings and Self-Attention [3.0079490585515343]
In natural language, the intended meaning of a word or phrase is often implicit and depends on the context.
We propose a simple yet effective method for sentiment analysis using contextual embeddings and a self-attention mechanism.
The experimental results for three languages, including morphologically rich Polish and German, show that our model is comparable to or even outperforms state-of-the-art models.
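The mechanism this entry names, self-attention over contextual embeddings, can be illustrated as attention pooling: a query vector scores each token embedding, and the softmax-weighted sum forms a sentence representation for the sentiment classifier. The dimensions and random parameters below are toy stand-ins, not the paper's architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(token_embeddings, query):
    """Weight each token by its dot product with a query vector, then sum."""
    weights = softmax(token_embeddings @ query)
    return weights @ token_embeddings

rng = np.random.default_rng(0)
tokens = rng.normal(size=(7, 16))  # 7 tokens, 16-dim contextual embeddings (toy)
query = rng.normal(size=16)        # learned query vector, here random
pooled = attention_pool(tokens, query)
print(pooled.shape)  # (16,)
```

The attention weights make the pooling interpretable: inspecting them shows which tokens the classifier treated as sentiment-bearing, regardless of the language's morphology.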
arXiv Detail & Related papers (2020-03-12T02:19:51Z) - A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings [10.871587311621974]
This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings.
Existing word vectors are projected to a common semantic space using linear transformations and averaging.
The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities.
arXiv Detail & Related papers (2020-01-17T15:42:29Z)
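The meta-embedding recipe in the entry above, projecting existing word vectors into a common space with linear transformations and then averaging, can be sketched as below. The projection matrices would normally be learned; here they are random placeholders, and all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d_src1, d_src2, d_common = 6, 4, 5  # toy source and common-space dimensions

# Stand-ins for learned linear maps from each source space to the common space:
W1 = rng.normal(size=(d_common, d_src1))
W2 = rng.normal(size=(d_common, d_src2))

def meta_embedding(v1, v2):
    """Project each source vector into the common space, then average."""
    return 0.5 * (W1 @ v1 + W2 @ v2)

v1 = rng.normal(size=d_src1)  # a word's vector in source space 1
v2 = rng.normal(size=d_src2)  # the same word's vector in source space 2
print(meta_embedding(v1, v2).shape)  # (5,)
```

Because the averaging happens in one shared space, the same construction works whether the source spaces are two monolingual embedding sets or embeddings from different languages, which is what enables the cross-lingual transfer the abstract reports.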
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.