Identifying Offensive Expressions of Opinion in Context
- URL: http://arxiv.org/abs/2104.12227v2
- Date: Tue, 27 Apr 2021 09:49:41 GMT
- Title: Identifying Offensive Expressions of Opinion in Context
- Authors: Francielle Alves Vargas, Isabelle Carvalho, Fabiana Rodrigues de Góes
- Abstract summary: Identifying opinions and feelings in context remains a challenge for subjective information extraction systems.
In sentiment-based NLP tasks, there are few resources for information extraction, above all for offensive or hateful opinions in context.
This paper provides a new cross-lingual and contextual offensive lexicon, which consists of explicit and implicit offensive and swearing expressions of opinion.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classic information extraction techniques consist of building questions and
answers about facts. Indeed, it is still a challenge for subjective
information extraction systems to identify opinions and feelings in context. In
sentiment-based NLP tasks, there are few resources for information extraction,
above all for offensive or hateful opinions in context. To fill this important gap,
this short paper provides a new cross-lingual and contextual offensive lexicon,
which consists of explicit and implicit offensive and swearing expressions of
opinion, annotated in two different classes: context-dependent and
context-independent offensive. In addition, we provide markers to identify hate
speech. The annotation approach was evaluated at the expression level and achieved
high human inter-annotator agreement. The provided offensive lexicon is
available in Portuguese and English.
Related papers
- Quantifying the redundancy between prosody and text [67.07817268372743]
We use large language models to estimate how much information is redundant between prosody and the words themselves.
We find a high degree of redundancy between the information carried by the words and prosodic information across several prosodic features.
Still, we observe that prosodic features cannot be fully predicted from text, suggesting that prosody carries information above and beyond the words.
arXiv Detail & Related papers (2023-11-28T21:15:24Z)
- COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements [30.1056760312051]
We introduce COBRA frames, the first context-aware formalism for explaining the intents, reactions, and harms of offensive or biased statements.
We create COBRACORPUS, a dataset of 33k potentially offensive statements paired with machine-generated contexts.
We find that explanations by context-agnostic models are significantly worse than by context-aware ones.
arXiv Detail & Related papers (2023-06-03T02:47:24Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech Definitions [1.3274508420845537]
We present hate speech criteria, developed with perspectives from law and social science.
We argue that the goal and exact task developers have in mind should determine how the scope of hate speech is defined.
arXiv Detail & Related papers (2022-06-30T17:50:16Z)
- Rethinking Offensive Text Detection as a Multi-Hop Reasoning Problem [15.476899850339395]
We introduce the task of implicit offensive text detection in dialogues.
We argue that reasoning is crucial for understanding this broader class of offensive utterances.
We release SLIGHT, a dataset to support research on this task.
arXiv Detail & Related papers (2022-04-22T06:20:15Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topic.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Fine-Grained Opinion Summarization with Minimal Supervision [48.43506393052212]
FineSum aims to profile a target by extracting opinions from multiple documents.
FineSum automatically identifies opinion phrases from the raw corpus, classifies them into different aspects and sentiments, and constructs multiple fine-grained opinion clusters under each aspect/sentiment.
Both automatic evaluation on the benchmark and quantitative human evaluation validate the effectiveness of our approach.
arXiv Detail & Related papers (2021-10-17T15:16:34Z)
- Do Context-Aware Translation Models Pay the Right Attention? [61.25804242929533]
Context-aware machine translation models are designed to leverage contextual information, but often fail to do so.
In this paper, we ask several questions: What contexts do human translators use to resolve ambiguous words?
We introduce SCAT (Supporting Context for Ambiguous Translations), a new English-French dataset comprising supporting context words for 14K translations.
Using SCAT, we perform an in-depth analysis of the context used to disambiguate, examining positional and lexical characteristics of the supporting words.
arXiv Detail & Related papers (2021-05-14T17:32:24Z)
- Contextual Lexicon-Based Approach for Hate Speech and Offensive Language Detection [1.1744028458220426]
This paper presents a new approach for offensive language and hate speech detection on social media.
Our approach incorporates an offensive lexicon composed of implicit and explicit offensive and swearing expressions annotated with binary classes.
Due to the severity of the hate speech and offensive comments in Brazil and the lack of research in Portuguese, Brazilian Portuguese is the language used to validate our method.
arXiv Detail & Related papers (2021-04-25T21:34:51Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Unified Feature Representation for Lexical Connotations [13.153001795077227]
We use distant labeling to create a new lexical resource representing connotation aspects for nouns and adjectives.
Our analysis shows that it aligns well with human judgments.
arXiv Detail & Related papers (2020-05-31T23:14:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.