Does BERT look at sentiment lexicon?
- URL: http://arxiv.org/abs/2111.10100v1
- Date: Fri, 19 Nov 2021 08:50:48 GMT
- Title: Does BERT look at sentiment lexicon?
- Authors: Elena Razova, Sergey Vychegzhanin, Evgeny Kotelnikov
- Abstract summary: We study the attention weights matrices of the Russian-language RuBERT model.
We fine-tune RuBERT on sentiment text corpora and compare the distributions of attention weights for sentiment and neutral lexicons.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The main approaches to sentiment analysis are rule-based methods and machine
learning, in particular, deep neural network models with the Transformer
architecture, including BERT. The performance of neural network models in the
tasks of sentiment analysis is superior to the performance of rule-based
methods. The reasons for this situation remain unclear due to the poor
interpretability of deep neural network models. One of the main keys to
understanding the fundamental differences between the two approaches is the
analysis of how sentiment lexicon is taken into account in neural network
models. To this end, we study the attention weights matrices of the
Russian-language RuBERT model. We fine-tune RuBERT on sentiment text corpora
and compare the distributions of attention weights for sentiment and neutral
lexicons. It turns out that, on average, 3/4 of the heads of various model
variants statistically pay more attention to the sentiment lexicon compared to
the neutral one.
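The per-head comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the statistical test, so the one-sided permutation test, the significance level `alpha`, the head labels, and the toy attention weights are all assumptions made for this example. In practice the two samples per head would come from the attention matrices of a fine-tuned model.

```python
import random
from statistics import mean


def permutation_p_value(sentiment, neutral, n_perm=2000, seed=0):
    """One-sided permutation test (an assumed choice of test):
    is the mean attention to sentiment words larger than to neutral words?"""
    rng = random.Random(seed)
    observed = mean(sentiment) - mean(neutral)
    pooled = list(sentiment) + list(neutral)
    k = len(sentiment)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count relabelings whose difference is at least as large as observed.
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def fraction_of_sentiment_heads(head_weights, alpha=0.05):
    """head_weights maps a head label to (sentiment_weights, neutral_weights).
    Returns the share of heads that attend significantly more to sentiment
    words, together with the labels of those heads."""
    significant = [
        head
        for head, (sent, neut) in head_weights.items()
        if permutation_p_value(sent, neut) < alpha
    ]
    return len(significant) / len(head_weights), significant


# Toy attention weights (hypothetical values, not taken from the paper).
toy_heads = {
    "L0H0": ([0.30, 0.32, 0.35, 0.31, 0.33, 0.34, 0.36, 0.29],
             [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.12, 0.11]),
    "L0H1": ([0.10, 0.20, 0.30, 0.40],
             [0.10, 0.20, 0.30, 0.40]),
    "L0H2": ([0.05, 0.06, 0.04, 0.05],
             [0.20, 0.22, 0.21, 0.19]),
}

fraction, significant_heads = fraction_of_sentiment_heads(toy_heads)
print(fraction, significant_heads)
```

A rank-based test could be substituted for the permutation test without changing the overall structure: one test per head, followed by a count of heads below the chosen significance level.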
Related papers
- Bias-Free Sentiment Analysis through Semantic Blinding and Graph Neural Networks [0.0]
The SProp GNN relies exclusively on syntactic structures and word-level emotional cues to predict emotions in text.
By semantically blinding the model to information about specific words, it is robust to biases such as political or gender bias.
The SProp GNN shows performance superior to lexicon-based alternatives on two different prediction tasks, and across two languages.
arXiv Detail & Related papers (2024-11-19T13:23:53Z)
- Cognitive Networks and Performance Drive fMRI-Based State Classification Using DNN Models [0.0]
We employ two structurally different and complementary DNN-based models to classify individual cognitive states.
We show that despite the architectural differences, both models consistently produce a robust relationship between prediction accuracy and individual cognitive performance.
arXiv Detail & Related papers (2024-08-14T15:25:51Z)
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Example Forgetting: A Novel Approach to Explain and Interpret Deep Neural Networks in Seismic Interpretation [12.653673008542155]
Deep neural networks are an attractive component for the common interpretation pipeline.
Deep neural networks are frequently met with distrust due to their property of producing semantically incorrect outputs when exposed to sections the model was not trained on.
We introduce a method that effectively relates semantically malfunctioned predictions to their respective positions within the neural network representation manifold.
arXiv Detail & Related papers (2023-02-24T19:19:22Z)
- Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve [78.3500985535601]
We find a surprising connection between multitask learning and robustness to neuron failures.
Our experiments show that bilingual language models retain higher performance under various neuron perturbations.
We provide a theoretical justification for this robustness by mathematically analyzing linear representation learning.
arXiv Detail & Related papers (2022-10-20T22:23:27Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Lexicon-based Methods vs. BERT for Text Sentiment Analysis [0.15293427903448023]
The SO-CAL and SentiStrength lexicon-based methods are adapted for the Russian language.
RuBERT outperforms both lexicon-based methods on average, but SO-CAL surpasses RuBERT for four corpora out of 16.
arXiv Detail & Related papers (2021-11-19T08:47:32Z)
- Enhanced Aspect-Based Sentiment Analysis Models with Progressive Self-supervised Attention Learning [103.0064298630794]
In aspect-based sentiment analysis (ABSA), many neural models are equipped with an attention mechanism to quantify the contribution of each context word to sentiment prediction.
We propose a progressive self-supervised attention learning approach for attentional ABSA models.
We integrate the proposed approach into three state-of-the-art neural ABSA models.
arXiv Detail & Related papers (2021-03-05T02:50:05Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Effect of Word Embedding Models on Hate and Offensive Speech Detection [1.7403133838762446]
We investigate the impact of both word embedding models and neural network architectures on the predictive accuracy.
We first train several word embedding models on a large-scale unlabelled Arabic text corpus.
For each detection task, we train several neural network classifiers using the pre-trained word embedding models.
This task yields a large number of various learned models, which allows conducting an exhaustive comparison.
arXiv Detail & Related papers (2020-11-23T02:43:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.