Muted: Multilingual Targeted Offensive Speech Identification and Visualization
- URL: http://arxiv.org/abs/2312.11344v1
- Date: Mon, 18 Dec 2023 16:50:27 GMT
- Title: Muted: Multilingual Targeted Offensive Speech Identification and Visualization
- Authors: Christoph Tillmann, Aashka Trivedi, Sara Rosenthal, Santosh Borse, Rong Zhang, Avirup Sil, Bishwaranjan Bhattacharjee
- Abstract summary: Muted is a system to identify multilingual HAP content by displaying offensive arguments and their targets using heat maps to indicate their intensity.
We present the model's performance on identifying offensive spans and their targets in existing datasets and present new annotations on German text.
- Score: 15.656203119337436
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Offensive language such as hate, abuse, and profanity (HAP) occurs in various
content on the web. While previous work has mostly dealt with sentence level
annotations, there have been a few recent attempts to identify offensive spans
as well. We build upon this work and introduce Muted, a system to identify
multilingual HAP content by displaying offensive arguments and their targets
using heat maps to indicate their intensity. Muted can leverage any
transformer-based HAP-classification model and its attention mechanism
out-of-the-box to identify toxic spans, without further fine-tuning. In
addition, we use the spaCy library to identify the specific targets and
arguments for the words predicted by the attention heatmaps. We present the
model's performance on identifying offensive spans and their targets in
existing datasets and present new annotations on German text. Finally, we
demonstrate our proposed visualization tool on multilingual inputs.
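A minimal sketch of the mechanism described above, assuming a Hugging Face classifier and the spaCy English pipeline: token heat is taken as last-layer [CLS] attention averaged over heads, and the hottest words are linked to their syntactic heads and dependents as rough target/argument candidates. The checkpoint name (unitary/toxic-bert stands in for any HAP classifier), the aggregation heuristic, the top-k cutoff, and the helper names are illustrative assumptions, not the authors' implementation.

```python
# Sketch of Muted-style span highlighting: score tokens by the attention a HAP
# classifier assigns them, then use spaCy's dependency parse to surface targets
# and arguments. Checkpoint and aggregation heuristic are assumptions.
import spacy
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CHECKPOINT = "unitary/toxic-bert"  # stand-in for any transformer HAP classifier
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT, output_attentions=True)
nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm


def token_heat(text: str) -> dict[str, float]:
    """Heat-map scores: last-layer attention from [CLS] to each token, averaged over heads."""
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        out = model(**enc)
    cls_attention = out.attentions[-1].mean(dim=1)[0, 0]  # attention row of position 0 ([CLS])
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    heat: dict[str, float] = {}
    for tok, score in zip(tokens, cls_attention.tolist()):
        if tok in tokenizer.all_special_tokens:
            continue
        word = tok.lstrip("#")  # crude WordPiece merge; a real system would map back via offsets
        heat[word] = max(heat.get(word, 0.0), score)
    return heat


def targets_and_arguments(text: str, top_k: int = 3):
    """For the hottest words, return (word, syntactic head, dependents) as a rough
    approximation of the offensive argument and its target."""
    heat = token_heat(text)
    hottest = {w.lower() for w in sorted(heat, key=heat.get, reverse=True)[:top_k]}
    doc = nlp(text)
    return [
        (tok.text, tok.head.text, [child.text for child in tok.children])
        for tok in doc
        if tok.text.lower() in hottest
    ]


if __name__ == "__main__":
    print(targets_and_arguments("those people are absolutely disgusting"))
```

Per the abstract, any transformer-based HAP classifier could be dropped in without further fine-tuning; a multilingual checkpoint and language-specific spaCy pipelines would be needed to reproduce the multilingual behaviour, and a proper subword-to-word mapping would replace the crude merge used here.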
Related papers
- ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations [6.360597788845826]
This study examines the limitations of state-of-the-art large language models (LLMs) in identifying offensive content within systematically perturbed data.
Our work highlights the urgent need for more advanced techniques in offensive language detection to combat the evolving tactics used to evade detection mechanisms.
arXiv Detail & Related papers (2024-06-18T02:44:56Z)
- Target Span Detection for Implicit Harmful Content [18.84674403712032]
We focus on identifying implied targets of hate speech, essential for recognizing subtler hate speech and enhancing the detection of harmful content on digital platforms.
We collect and annotate target spans in three prominent implicit hate speech datasets: SBIC, DynaHate, and IHC.
Our experiments indicate that Implicit-Target-Span provides a challenging test bed for target span detection methods.
arXiv Detail & Related papers (2024-03-28T21:15:15Z)
- Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation [79.96416609433724]
Zero-shot translation (ZST) aims to translate between language pairs unseen in the training data.
The common practice to guide the zero-shot language mapping during inference is to deliberately insert the source and target language IDs.
Recent studies have shown that language IDs sometimes fail to guide the ZST task, leaving models prone to the off-target problem.
arXiv Detail & Related papers (2023-09-28T17:02:36Z)
- Shapley Head Pruning: Identifying and Removing Interference in Multilingual Transformers [54.4919139401528]
We show that it is possible to reduce interference by identifying and pruning language-specific parameters.
We show that removing the identified attention heads from a fixed model improves performance for a target language on both sentence classification and structural prediction (a minimal head-pruning sketch appears after this list).
arXiv Detail & Related papers (2022-10-11T18:11:37Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- On Guiding Visual Attention with Language Specification [76.08326100891571]
We use high-level language specification as advice for constraining the classification evidence to task-relevant features, instead of distractors.
We show that supervising spatial attention in this way improves performance on classification tasks with biased and noisy data.
arXiv Detail & Related papers (2022-02-17T22:40:19Z)
- MUDES: Multilingual Detection of Offensive Spans [3.284443134471233]
MUDES is a system to detect offensive spans in texts.
It features pre-trained models, a Python API for developers, and a user-friendly web-based interface.
arXiv Detail & Related papers (2021-02-18T23:19:00Z)
- Leveraging Multilingual Transformers for Hate Speech Detection [11.306581296760864]
We leverage state-of-the-art Transformer language models to identify hate speech in a multilingual setting.
With a pre-trained multilingual Transformer-based text encoder at the base, we are able to identify and classify hate speech in multiple languages.
arXiv Detail & Related papers (2021-01-08T20:23:50Z)
- Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification [52.69730591919885]
We present a semi-supervised adversarial training process that minimizes the maximal loss for label-preserving input perturbations.
We observe significant gains in effectiveness on document and intent classification for a diverse set of languages.
arXiv Detail & Related papers (2020-07-29T19:38:35Z)
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464]
Cross-lingual models fitted to the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve the adaptation performance in target languages.
arXiv Detail & Related papers (2020-01-30T03:35:44Z)
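Regarding the Shapley Head Pruning entry above: once language-specific attention heads have been identified, removing them from a fixed model is mechanical. The sketch below is a minimal illustration using Hugging Face's generic prune_heads() utility with placeholder layer/head indices; it does not implement that paper's Shapley-value attribution, and the checkpoint choice is an assumption.

```python
# Minimal sketch for the Shapley Head Pruning entry: once language-specific
# attention heads have been identified (here: placeholder indices, not Shapley
# values), they can be removed from a fixed multilingual encoder before evaluation.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=2)

# {layer_index: [head_indices_to_remove]} -- assumed output of an attribution step
heads_to_prune = {0: [2, 7], 5: [1], 11: [4, 9]}
model.prune_heads(heads_to_prune)

print(model.config.pruned_heads)  # e.g. {0: [2, 7], 5: [1], 11: [4, 9]}
```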
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.