CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity
understanding and detection
- URL: http://arxiv.org/abs/2106.06213v1
- Date: Fri, 11 Jun 2021 07:42:12 GMT
- Title: CONDA: a CONtextual Dual-Annotated dataset for in-game toxicity
understanding and detection
- Authors: Henry Weld, Guanghao Huang, Jean Lee, Tongshu Zhang, Kunze Wang,
Xinghong Guo, Siqu Long, Josiah Poon, Soyeon Caren Han
- Abstract summary: CONDA is a new dataset for in-game toxic language detection enabling joint intent classification and slot filling analysis.
The dataset consists of 45K utterances from 12K conversations from the chat logs of 1.9K completed Dota 2 matches.
A thorough in-game toxicity analysis provides comprehensive understanding of context at utterance, token, and dual levels.
- Score: 1.6085428542036968
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional toxicity detection models have focused on the single utterance
level without deeper understanding of context. We introduce CONDA, a new
dataset for in-game toxic language detection enabling joint intent
classification and slot filling analysis, which is the core task of Natural
Language Understanding (NLU). The dataset consists of 45K utterances from 12K
conversations from the chat logs of 1.9K completed Dota 2 matches. We propose a
robust dual semantic-level toxicity framework, which handles utterance and
token-level patterns, and rich contextual chatting history. Accompanying the
dataset is a thorough in-game toxicity analysis, which provides comprehensive
understanding of context at utterance, token, and dual levels. Inspired by NLU,
we also apply its metrics to the toxicity detection tasks for assessing
toxicity and game-specific aspects. We evaluate strong NLU models on CONDA,
providing fine-grained results for different intent classes and slot classes.
Furthermore, we examine the coverage of toxicity nature in our dataset by
comparing it with other toxicity datasets.
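To make the dual-annotation setup concrete, the sketch below (not the authors' released tooling; the label names are hypothetical placeholders rather than the official CONDA intent/slot inventory) shows one way a chat utterance with an utterance-level intent and token-level slot tags could be represented, and how standard NLU-style metrics such as intent accuracy and token-level slot F1 could be computed over predictions.

```python
# Minimal sketch of a CONDA-style dual-annotated utterance and NLU-style
# scoring. Label names ("TOXIC", "GAME", "EXPLICIT_TOXIC", ...) are
# hypothetical placeholders, not the official CONDA label set.
from dataclasses import dataclass
from typing import List


@dataclass
class DualAnnotatedUtterance:
    tokens: List[str]       # chat message split into tokens
    slot_tags: List[str]    # one token-level tag per token
    intent: str             # one utterance-level intent label


def intent_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of utterances whose predicted intent matches the gold intent."""
    return sum(g == p for g, p in zip(gold, pred)) / max(len(gold), 1)


def slot_f1(gold_tags: List[List[str]], pred_tags: List[List[str]],
            background: str = "OTHER") -> float:
    """Micro-averaged token-level F1 over non-background slot tags."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_tags, pred_tags):
        for g, p in zip(gold, pred):
            if g != background and p == g:
                tp += 1
            else:
                if p != background:
                    fp += 1
                if g != background:
                    fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


# Toy usage: one gold utterance and one (imperfect) prediction.
gold = DualAnnotatedUtterance(
    tokens=["gg", "report", "mid", "noob"],
    slot_tags=["GAME", "TOXIC", "GAME", "TOXIC"],
    intent="EXPLICIT_TOXIC",
)
pred = DualAnnotatedUtterance(
    tokens=gold.tokens,
    slot_tags=["GAME", "TOXIC", "OTHER", "TOXIC"],
    intent="EXPLICIT_TOXIC",
)
print(intent_accuracy([gold.intent], [pred.intent]))   # 1.0
print(slot_f1([gold.slot_tags], [pred.slot_tags]))     # ~0.86
```

In practice these aggregates would be reported per intent class and per slot class to match the fine-grained evaluation described above.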
Related papers
- Text Detoxification as Style Transfer in English and Hindi [1.183205689022649]
This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text.
We present three approaches: knowledge transfer from a similar task, multi-task learning approach, and delete and reconstruct approach.
Our results demonstrate that our approach effectively balances text detoxification while preserving the actual content and maintaining fluency.
arXiv Detail & Related papers (2024-02-12T16:30:41Z)
- Facilitating Fine-grained Detection of Chinese Toxic Language:
Hierarchical Taxonomy, Resources, and Benchmarks [18.44630180661091]
Existing datasets lack fine-grained annotation of toxic types and expressions.
It is crucial to introduce lexical knowledge to detect the toxicity of posts.
In this paper, we facilitate the fine-grained detection of Chinese toxic language.
arXiv Detail & Related papers (2023-05-08T03:50:38Z)
- Contextual information integration for stance detection via
cross-attention [59.662413798388485]
Stance detection deals with identifying an author's stance towards a target.
Most existing stance detection models are limited because they do not consider relevant contextual information.
We propose an approach to integrate contextual information as text.
arXiv Detail & Related papers (2022-11-03T15:04:29Z)
- Toxicity Detection with Generative Prompt-based Inference [3.9741109244650823]
It is a long-known risk that language models (LMs), once trained on a corpus containing undesirable content, have the power to manifest biases and toxicity.
In this work, we explore the generative variant of zero-shot prompt-based toxicity detection with comprehensive trials on prompt engineering (an illustrative sketch of this prompt-based setup appears after the related-papers list below).
arXiv Detail & Related papers (2022-05-24T22:44:43Z)
- Revisiting Contextual Toxicity Detection in Conversations [28.465019968374413]
We show that toxicity labelling by humans is in general influenced by the conversational structure, polarity and topic of the context.
We propose to bring these findings into computational detection models by introducing neural architectures for contextual toxicity detection.
We have also demonstrated that such models can benefit from synthetic data, especially in the social media domain.
arXiv Detail & Related papers (2021-11-24T11:50:37Z)
- Toxicity Detection can be Sensitive to the Conversational Context [64.28043776806213]
We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels.
We introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context is also considered.
arXiv Detail & Related papers (2021-11-19T13:57:26Z)
- Mitigating Biases in Toxic Language Detection through Invariant
Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields lower false positive rate in both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Detecting Inappropriate Messages on Sensitive Topics that Could Harm a
Company's Reputation [64.22895450493729]
A calm discussion of turtles or fishing is less likely to fuel inappropriate, toxic dialogue than a discussion of politics or sexual minorities.
We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.
arXiv Detail & Related papers (2021-03-09T10:50:30Z)
- Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z)
- RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language
Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
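The Toxicity Detection with Generative Prompt-based Inference entry above mentions zero-shot, prompt-based toxicity detection. The following is a minimal, hypothetical sketch of that general idea (not that paper's prompts or method): it compares the language-model likelihood of a "yes" versus a "no" answer appended to a fixed question about the chat message.

```python
# Hypothetical zero-shot prompt-based toxicity check: score two verbalized
# answers under a causal LM and pick the more likely one. This is a naive
# illustration, not any specific paper's method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def answer_loss(prompt: str, answer: str) -> float:
    """Average per-token negative log-likelihood of prompt + answer under the LM."""
    ids = tokenizer(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    return out.loss.item()


def is_toxic(message: str) -> bool:
    prompt = f'Message: "{message}"\nQuestion: Is this message toxic?\nAnswer:'
    # Lower loss means the continuation is more likely under the model.
    return answer_loss(prompt, " yes") < answer_loss(prompt, " no")


print(is_toxic("you are all useless, uninstall the game"))
print(is_toxic("good game everyone, well played"))
```

Comparing single-token verbalizers keeps the two candidates directly comparable; a real system would additionally tune the prompt wording and calibrate a decision threshold rather than relying on this naive likelihood comparison.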