Lex2Sent: A bagging approach to unsupervised sentiment analysis
- URL: http://arxiv.org/abs/2209.13023v1
- Date: Mon, 26 Sep 2022 20:49:18 GMT
- Title: Lex2Sent: A bagging approach to unsupervised sentiment analysis
- Authors: Kai-Robin Lange, Jonas Rieger, Carsten Jentsch
- Abstract summary: The model proposed in this paper, called Lex2Sent, is an unsupervised sentiment analysis method that improves on the classification rates of sentiment lexicon methods.
For three benchmark datasets considered in this paper, the proposed Lex2Sent outperforms every evaluated lexicon.
- Score: 0.42970700836450487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised sentiment analysis is traditionally performed by counting those
words in a text that are stored in a sentiment lexicon and then assigning a
label depending on the proportion of positive and negative words registered.
While these "counting" methods are considered to be beneficial as they rate a
text deterministically, their classification rates decrease when the analyzed
texts are short or the vocabulary differs from what the lexicon considers
default. The model proposed in this paper, called Lex2Sent, is an unsupervised
sentiment analysis method that improves on the classification rates of sentiment
lexicon methods. For this purpose, a Doc2Vec model is trained to determine the
distances between document embeddings and the embeddings of the positive and
negative part of a sentiment lexicon. These distances are then evaluated for
multiple executions of Doc2Vec on resampled documents and are averaged to
perform the classification task. For three benchmark datasets considered in
this paper, the proposed Lex2Sent outperforms every evaluated lexicon,
including state-of-the-art lexica like VADER or the Opinion Lexicon in terms of
classification rate.
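The two ideas in the abstract, the counting baseline and the averaging of distance differences over resampled documents, can be illustrated with a small sketch. This is not the authors' implementation: it uses a toy lexicon and a bag-of-words count vector as a stand-in for a trained Doc2Vec model, and every name in it (`count_label`, `toy_embed`, `bagged_label`) is hypothetical. It also assumes that resampling means drawing a document's tokens with replacement, which the abstract does not specify.

```python
import math
import random
from collections import Counter

# Toy lexicon halves (assumption); real lexica such as VADER are far larger.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}
VOCAB = sorted(POSITIVE | NEGATIVE)


def count_label(tokens):
    """Counting baseline: compare the number of positive and negative hits."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos == neg:
        return "neutral"
    return "positive" if pos > neg else "negative"


def toy_embed(tokens):
    """Stand-in for a Doc2Vec embedding: a count vector over VOCAB."""
    counts = Counter(tokens)
    return [counts[w] for w in VOCAB]


def cosine_distance(u, v):
    """Cosine distance; defined as 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def bagged_label(tokens, runs=50, seed=0):
    """Average (distance to negative half - distance to positive half)
    over several resampled copies of the document, then classify by sign."""
    if not tokens:
        return "neutral"
    rng = random.Random(seed)
    pos_vec = toy_embed(POSITIVE)
    neg_vec = toy_embed(NEGATIVE)
    diffs = []
    for _ in range(runs):
        # Resample the document's tokens with replacement (assumption).
        sample = rng.choices(tokens, k=len(tokens))
        doc_vec = toy_embed(sample)
        diffs.append(cosine_distance(doc_vec, neg_vec)
                     - cosine_distance(doc_vec, pos_vec))
    mean_diff = sum(diffs) / len(diffs)
    if mean_diff == 0:
        return "neutral"
    return "positive" if mean_diff > 0 else "negative"
```

In this sketch a document is labeled positive when, on average, it lies closer to the positive half of the lexicon than to the negative half. With a real Doc2Vec model the embeddings would also place words outside the lexicon near related sentiment words, which the count-vector stand-in cannot do.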
Related papers
- A Comparison of Lexicon-Based and ML-Based Sentiment Analysis: Are There Outlier Words? [14.816706893177997]
In this paper we compute sentiment for more than 150,000 English language texts drawn from 4 domains.
We model differences in sentiment scores between approaches for documents in each domain using a regression.
Our findings are that the importance of a word depends on the domain and there are no standout lexical entries which systematically cause differences in sentiment scores.
arXiv Detail & Related papers (2023-11-10T18:21:50Z)
- Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis [64.70116276295609]
SentiWSP is a Sentiment-aware pre-trained language model with combined Word-level and Sentence-level Pre-training tasks.
SentiWSP achieves new state-of-the-art performance on various sentence-level and aspect-level sentiment classification benchmarks.
arXiv Detail & Related papers (2022-10-18T12:25:29Z)
- LEXpander: applying colexification networks to automated lexicon expansion [0.16804697591495946]
We present LEXpander, a method for lexicon expansion that leverages novel data on colexification.
We find that LEXpander outperforms existing approaches in terms of both precision and the trade-off between precision and recall of generated word lists.
arXiv Detail & Related papers (2022-05-31T14:55:29Z)
- Out-of-Category Document Identification Using Target-Category Names as Weak Supervision [64.671654559798]
Out-of-category detection aims to distinguish documents according to their semantic relevance to the inlier (or target) categories.
We present an out-of-category detection framework, which effectively measures how confidently each document belongs to one of the target categories.
arXiv Detail & Related papers (2021-11-24T21:01:25Z)
- Fine-Grained Opinion Summarization with Minimal Supervision [48.43506393052212]
FineSum aims to profile a target by extracting opinions from multiple documents.
FineSum automatically identifies opinion phrases from the raw corpus, classifies them into different aspects and sentiments, and constructs multiple fine-grained opinion clusters under each aspect/sentiment.
Both automatic evaluation on the benchmark and quantitative human evaluation validate the effectiveness of our approach.
arXiv Detail & Related papers (2021-10-17T15:16:34Z)
- LexSubCon: Integrating Knowledge from Lexical Resources into Contextual Embeddings for Lexical Substitution [76.615287796753]
We introduce LexSubCon, an end-to-end lexical substitution framework based on contextual embedding models.
This is achieved by combining contextual information with knowledge from structured lexical resources.
Our experiments show that LexSubCon outperforms previous state-of-the-art methods on LS07 and CoInCo benchmark datasets.
arXiv Detail & Related papers (2021-07-11T21:25:56Z)
- Improving Document-Level Sentiment Classification Using Importance of Sentences [3.007949058551534]
We propose a document-level sentence classification model based on deep neural networks.
We conduct experiments using sentiment datasets from four different domains: movie reviews, hotel reviews, restaurant reviews, and music reviews.
The experimental results show that the importance of sentences should be considered in a document-level sentiment classification task.
arXiv Detail & Related papers (2021-03-09T01:29:08Z)
- Enhanced word embeddings using multi-semantic representation through lexical chains [1.8199326045904998]
We propose two novel algorithms, called Flexible Lexical Chain II and Fixed Lexical Chain II.
These algorithms combine the semantic relations derived from lexical chains, prior knowledge from lexical databases, and the robustness of the distributional hypothesis in word embeddings as building blocks forming a single system.
Our results show that integrating lexical chains with word embedding representations sustains state-of-the-art results, even against more complex systems.
arXiv Detail & Related papers (2021-01-22T09:43:33Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Variational Approach to Unsupervised Sentiment Analysis [8.87759101018566]
We propose a variational approach to unsupervised sentiment analysis.
We use target-opinion word pairs as a supervision signal.
We apply our method to sentiment analysis on customer reviews and clinical narratives.
arXiv Detail & Related papers (2020-08-21T09:52:35Z)
- A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
arXiv Detail & Related papers (2020-05-12T21:26:04Z)