Abuse is Contextual, What about NLP? The Role of Context in Abusive
Language Annotation and Detection
- URL: http://arxiv.org/abs/2103.14916v1
- Date: Sat, 27 Mar 2021 14:31:52 GMT
- Title: Abuse is Contextual, What about NLP? The Role of Context in Abusive
Language Annotation and Detection
- Authors: Stefano Menini, Alessio Palmero Aprosio, Sara Tonelli
- Abstract summary: We investigate what happens when the hateful content of a message is also judged based on its context.
We first re-annotate part of a widely used dataset for abusive language detection in English in two conditions, i.e., with and without context.
- Score: 2.793095554369281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The datasets most widely used for abusive language detection contain lists of
messages, usually tweets, that have been manually judged as abusive or not by
one or more annotators, with the annotation performed at message level. In this
paper, we investigate what happens when the hateful content of a message is
also judged based on its context, given that messages are often ambiguous and
need to be interpreted in their context of occurrence. We first re-annotate
part of a widely used dataset for abusive language detection in English in two
conditions, i.e., with and without context. Then, we compare the performance of
three classification algorithms on these two types of datasets, arguing that
context-aware classification is more challenging but also closer to a real
application scenario.
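The with/without-context setup can be reproduced in spirit with any text classifier: in the context-free condition the model sees only the target tweet, while in the context-aware condition the preceding message is concatenated to it. Below is a minimal sketch with scikit-learn; the toy data, the [SEP] convention, and the choice of TF-IDF plus logistic regression are illustrative assumptions, not the authors' pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy (context, message, label) triples -- invented for illustration.
# The same message is abusive after one context and benign after another.
data = [
    ("You played terribly tonight", "Go back to where you came from", 1),
    ("I love this dish, where is it from?", "Go back to where you came from", 0),
    ("You carried the whole team", "You are unbelievable", 0),
    ("You leaked our private chat", "You are unbelievable", 1),
]

def build_inputs(rows, with_context):
    """Context-aware condition: prepend the preceding message to the target."""
    texts = [f"{ctx} [SEP] {msg}" if with_context else msg
             for ctx, msg, _ in rows]
    labels = [label for _, _, label in rows]
    return texts, labels

for with_context in (False, True):
    X, y = build_inputs(data, with_context)
    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(X, y)
    print("with context" if with_context else "message only", clf.predict(X))
```

Note how the same message carries different labels depending on the preceding tweet: in the message-only condition the duplicated texts are indistinguishable, which is exactly the ambiguity that makes the context-aware condition harder for both annotators and classifiers.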
Related papers
- Evaluating Text Classification Robustness to Part-of-Speech Adversarial Examples [0.6445605125467574]
Adversarial examples are inputs that are designed to trick the decision making process, and are intended to be imperceptible to humans.
For text-based classification systems, changes to the input, a string of text, are always perceptible.
To improve the quality of text-based adversarial examples, we need to know what elements of the input text are worth focusing on.
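One way to act on that observation is to perturb only words of a specific part of speech and watch how the classifier's decision shifts. The tagger and synonym table below are toy stand-ins (a real attack would use an actual POS tagger and embedding-based substitutions); this is a hypothetical sketch, not the paper's procedure.

```python
# Toy POS tags and synonyms -- stand-ins for a real tagger and thesaurus.
TOY_POS = {"movie": "NOUN", "was": "VERB", "awful": "ADJ", "really": "ADV"}
TOY_SYNONYMS = {"awful": ["terrible", "dreadful"], "really": ["truly"]}

def pos_adversarial_variants(sentence, target_pos="ADJ"):
    """Yield variants where one word of `target_pos` is swapped for a synonym,
    to probe which POS classes a classifier's decision actually hinges on."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if TOY_POS.get(tok.lower()) == target_pos:
            for syn in TOY_SYNONYMS.get(tok.lower(), []):
                yield " ".join(tokens[:i] + [syn] + tokens[i + 1:])

for variant in pos_adversarial_variants("the movie was really awful"):
    print(variant)  # feed each variant to the classifier under attack
```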
arXiv Detail & Related papers (2024-08-15T18:33:54Z) - Improving Long Context Document-Level Machine Translation [51.359400776242786]
Document-level context for neural machine translation (NMT) is crucial to improve translation consistency and cohesion.
Many works have been published on the topic of document-level NMT, but most restrict the system to just local context.
We propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption.
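One plausible reading of such a constrained attention variant is a top-k mask: only the k highest-scoring positions survive the softmax, so the rest of a long document is ignored. A NumPy sketch under that assumption, not the paper's exact formulation:

```python
import numpy as np

def topk_attention(query, keys, values, k):
    """Attention restricted to the k highest-scoring positions; all other
    positions are masked to -inf and receive exactly zero weight."""
    scores = keys @ query / np.sqrt(query.shape[-1])  # (seq_len,)
    keep = np.argsort(scores)[-k:]                    # top-k positions
    masked = np.full_like(scores, -np.inf)
    masked[keep] = scores[keep]
    weights = np.exp(masked - scores[keep].max())     # softmax over survivors
    weights /= weights.sum()
    return weights @ values

rng = np.random.default_rng(0)
q = rng.normal(size=8)
K, V = rng.normal(size=(100, 8)), rng.normal(size=(100, 8))
print(topk_attention(q, K, V, k=5))                   # uses 5 of 100 positions
```

In this dense sketch memory is not actually saved; the savings the abstract mentions would come from a sparse implementation that only materializes the k kept key/value rows.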
arXiv Detail & Related papers (2023-06-08T13:28:48Z) - Enriching Abusive Language Detection with Community Context [0.3708656266586145]
The use of pejorative expressions can be benign or actively empowering.
Yet models for abuse detection often misclassify these expressions as derogatory, inadvertently censoring productive conversations held by marginalized groups.
Our paper highlights how community context can improve classification outcomes in abusive language detection.
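One simple way community context could enter a classifier is as extra features next to the text, e.g., an indicator of the community a message was posted in. The data, community labels, and feature scheme below are invented for illustration; the paper's notion of community context may be richer.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

# Invented examples: "SLUR" stands in for a reclaimed pejorative whose
# meaning flips with the community it is used in.
texts = ["we are SLUR and proud", "you SLUR get out", "SLUR meetup tonight"]
communities = [["in_group_forum"], ["outside_forum"], ["in_group_forum"]]
labels = [0, 1, 0]  # 1 = abusive

vec = TfidfVectorizer()
enc = OneHotEncoder(handle_unknown="ignore")
X = hstack([vec.fit_transform(texts), enc.fit_transform(communities)])
clf = LogisticRegression().fit(X, labels)

# Same pejorative, but the community feature lets the model separate uses.
query = hstack([vec.transform(["SLUR meetup tonight"]),
                enc.transform([["in_group_forum"]])])
print(clf.predict(query))
```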
arXiv Detail & Related papers (2022-06-16T20:54:02Z) - UCPhrase: Unsupervised Context-aware Quality Phrase Tagging [63.86606855524567]
UCPhrase is a novel unsupervised context-aware quality phrase tagger.
We induce high-quality phrase spans as silver labels from consistently co-occurring word sequences.
We show that our design is superior to state-of-the-art pre-trained, unsupervised, and distantly supervised methods.
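The "consistently co-occurring word sequences" idea can be approximated in a few lines: word n-grams that recur within a single document become silver phrase labels, with no human annotation or external dictionary. A rough approximation of the idea (UCPhrase's actual core-phrase mining is more careful):

```python
from collections import Counter

def silver_phrases(document_tokens, max_n=4, min_count=3):
    """Treat word sequences that consistently recur inside one document as
    silver phrase labels for training a tagger. A rough approximation of
    the idea, not UCPhrase's actual mining procedure."""
    counts = Counter()
    for n in range(2, max_n + 1):
        for i in range(len(document_tokens) - n + 1):
            counts[tuple(document_tokens[i:i + n])] += 1
    return [" ".join(gram) for gram, c in counts.items() if c >= min_count]

doc = ("machine learning models need data ; machine learning models "
       "overfit small data ; machine learning models generalize").split()
print(silver_phrases(doc))  # ['machine learning', ..., 'machine learning models']
```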
arXiv Detail & Related papers (2021-05-28T19:44:24Z) - Abusive Language Detection in Heterogeneous Contexts: Dataset Collection
and the Role of Supervised Attention [9.597481034467915]
Abusive language is a massive problem on online social platforms.
We provide an annotated dataset of abusive language in over 11,000 comments from YouTube.
We propose an algorithm that uses a supervised attention mechanism to detect and categorize abusive content.
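Supervised attention usually means an auxiliary loss that pulls the model's attention weights toward human-marked abusive spans, in addition to the usual classification loss. The PyTorch sketch below assumes such token-level rationales exist; the architecture and losses are illustrative, not the paper's exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SupervisedAttentionClassifier(nn.Module):
    """Attention pooling whose weights receive an auxiliary loss from
    token-level abuse annotations (assumed available in the dataset)."""
    def __init__(self, vocab_size=1000, dim=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.score = nn.Linear(dim, 1)
        self.out = nn.Linear(dim, n_classes)

    def forward(self, token_ids):
        h = self.embed(token_ids)                     # (batch, seq, dim)
        attn = self.score(h).squeeze(-1).softmax(-1)  # (batch, seq)
        pooled = (attn.unsqueeze(-1) * h).sum(1)      # attention-weighted sum
        return self.out(pooled), attn

model = SupervisedAttentionClassifier()
tokens = torch.randint(0, 1000, (2, 10))
labels = torch.tensor([1, 0])

# Invented token-level rationale: which positions annotators marked abusive.
rationale = torch.zeros(2, 10)
rationale[0, 3], rationale[1, 7] = 1.0, 1.0
target_attn = rationale / rationale.sum(-1, keepdim=True)

logits, attn = model(tokens)
attn_loss = F.kl_div(attn.clamp_min(1e-9).log(), target_attn,
                     reduction="batchmean")           # supervise the attention
loss = F.cross_entropy(logits, labels) + attn_loss
loss.backward()
```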
arXiv Detail & Related papers (2021-05-24T06:50:19Z) - Measuring and Increasing Context Usage in Context-Aware Machine
Translation [64.5726087590283]
We introduce a new metric, conditional cross-mutual information, to quantify the usage of context by machine translation models.
We then introduce a new, simple training method, context-aware word dropout, to increase the usage of context by context-aware models.
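As I read it, conditional cross-mutual information boils down to scoring the same targets twice, with and without the context, and averaging the log-likelihood gap; the sign convention and estimator below are my reading, so check the paper for the precise definition. The companion training trick then drops words from the current sentence so the model is pushed to recover the missing information from context.

```python
def conditional_xmi(logp_with_ctx, logp_without_ctx):
    """Average gap in target log-likelihood when the model sees the context
    versus when the context is blanked out; positive values indicate the
    model is actually using the context."""
    gaps = [w - wo for w, wo in zip(logp_with_ctx, logp_without_ctx)]
    return sum(gaps) / len(gaps)

# Invented per-sentence log-likelihoods from a hypothetical context-aware
# NMT model scored twice on the same references.
with_ctx = [-12.1, -8.4, -15.0]
without_ctx = [-12.6, -8.4, -16.2]
print(conditional_xmi(with_ctx, without_ctx))  # ~0.57 > 0: context helps
```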
arXiv Detail & Related papers (2021-05-07T19:55:35Z) - A Token-level Reference-free Hallucination Detection Benchmark for
Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes.
To create this dataset, we first perturb a large number of text segments extracted from English language Wikipedia, and then verify these with crowd-sourced annotations.
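The perturb-then-verify recipe can be sketched directly: swap a token in a factual segment for a plausible but unsupported alternative, mark that position as a hallucination candidate, and send it to annotators. The hand-made replacement table here is purely illustrative; the paper perturbs a large number of segments automatically.

```python
import random

def perturb_segment(tokens, replacements, rng=random.Random(0)):
    """Swap one token for a plausible but unsupported alternative and mark
    that position as a hallucination candidate for human verification."""
    candidates = [i for i, t in enumerate(tokens) if t in replacements]
    i = rng.choice(candidates)
    perturbed = list(tokens)
    perturbed[i] = rng.choice(replacements[tokens[i]])
    labels = ["O"] * len(tokens)
    labels[i] = "HALLUCINATION?"  # crowd workers confirm or reject
    return " ".join(perturbed), labels

tokens = "Marie Curie won the Nobel Prize in Physics in 1903".split()
replacements = {"1903": ["1911", "1921"], "Physics": ["Chemistry", "Peace"]}
print(perturb_segment(tokens, replacements))
```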
arXiv Detail & Related papers (2021-04-18T04:09:48Z) - Wisdom of the Contexts: Active Ensemble Learning for Contextual Anomaly
Detection [7.87320844079302]
In contextual anomaly detection (CAD), an object is only considered anomalous within a specific context.
We propose a novel approach, called WisCon, that automatically creates contexts from the feature set.
Our method constructs an ensemble of multiple contexts, with varying importance scores, based on the assumption that not all useful contexts are equally important.
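A heavily simplified version of the idea: pick several candidate context/behavior splits of the feature set, score each point by how much its behavioral attributes deviate from peers sharing its context, and combine the per-context scores with importance weights. Everything below is a sketch, not WisCon itself; in particular, the weights are fixed by hand here, while WisCon learns them with active learning.

```python
import numpy as np

def context_scores(X, ctx_cols, beh_cols):
    """Deviation of behavioral features from peers sharing a (discretized)
    context: a crude stand-in for a per-context anomaly detector."""
    keys = X[:, ctx_cols].round(0)
    scores = np.zeros(len(X))
    for key in np.unique(keys, axis=0):
        idx = np.all(keys == key, axis=1)
        group = X[idx][:, beh_cols]
        z = (group - group.mean(axis=0)) / (group.std(axis=0) + 1e-9)
        scores[idx] = np.abs(z).max(axis=1)
    return scores

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
X[0, 3] = 8.0  # inject an outlier in a behavioral column

# Candidate (context, behavior) splits with hand-fixed importance weights.
contexts = [([0], [2, 3]), ([1], [2, 3]), ([0, 1], [2, 3])]
importance = np.array([0.5, 0.2, 0.3])
ensemble = sum(w * context_scores(X, c, b)
               for w, (c, b) in zip(importance, contexts))
print(ensemble.argmax())  # point 0 gets the highest anomaly score
```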
arXiv Detail & Related papers (2021-01-27T17:34:13Z) - Improving Machine Reading Comprehension with Contextualized Commonsense
Knowledge [62.46091695615262]
We aim to extract commonsense knowledge to improve machine reading comprehension.
We propose to represent relations implicitly by situating structured knowledge in a context.
We employ a teacher-student paradigm to inject multiple types of contextualized knowledge into a student machine reader.
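The teacher-student transfer itself is typically the standard distillation recipe: the student fits gold labels while matching the knowledge-injected teacher's softened output distribution. A generic sketch, where the temperature, the loss weighting, and the answer-options-as-classes framing are my assumptions rather than the paper's specifics:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Fit gold labels while matching the teacher's softened distribution;
    the teacher is the model already injected with contextualized knowledge."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# E.g., scores over answer options for a batch of reading-comprehension items.
student_logits = torch.randn(4, 3, requires_grad=True)
teacher_logits = torch.randn(4, 3)
loss = distillation_loss(student_logits, teacher_logits, torch.tensor([0, 2, 1, 0]))
loss.backward()
```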
arXiv Detail & Related papers (2020-09-12T17:20:01Z) - Don't Judge an Object by Its Context: Learning to Overcome Contextual
Bias [113.44471186752018]
Existing models often leverage co-occurrences between objects and their context to improve recognition accuracy.
This work focuses on addressing such contextual biases to improve the robustness of the learnt feature representations.
arXiv Detail & Related papers (2020-01-09T18:31:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.