Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches
- URL: http://arxiv.org/abs/2509.14264v1
- Date: Sun, 14 Sep 2025 00:16:53 GMT
- Title: Defining, Understanding, and Detecting Online Toxicity: Challenges and Machine Learning Approaches
- Authors: Gautam Kishore Shahi, Tim A. Majchrzak
- Abstract summary: This study synthesizes 140 publications on different types of toxic content on digital platforms. The dataset encompasses content in 32 languages, covering topics such as elections, spontaneous events, and crises. We present recommendations and guidelines for new research on online toxic content and the use of content moderation for mitigation.
- Score: 4.1824815480811806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online toxic content has grown into a pervasive phenomenon, intensifying during times of crisis, elections, and social unrest. The proliferation of toxic content across digital platforms has spurred extensive research into automated detection mechanisms, primarily driven by advances in machine learning and natural language processing. The present study synthesizes 140 publications on different types of toxic content on digital platforms. We present a comprehensive overview of the datasets used in previous studies, focusing on definitions, data sources, challenges, and the machine learning approaches employed in detecting online toxicity, such as hate speech, offensive language, and harmful discourse. The dataset encompasses content in 32 languages, covering topics such as elections, spontaneous events, and crises. We examine the possibility of using existing cross-platform data to improve the performance of classification models. We present recommendations and guidelines for new research on online toxic content and the use of content moderation for mitigation. Finally, we present practical guidelines for mitigating toxic content on online platforms.
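As a concrete illustration of the supervised machine-learning approaches the survey covers, the sketch below trains a minimal bag-of-words Naive Bayes toxicity classifier. The tiny training set and labels are invented for illustration only; they are not drawn from the surveyed datasets, and real systems use large annotated corpora across many languages.

```python
import math
from collections import Counter

# Toy labeled sample (1 = toxic, 0 = non-toxic); invented for illustration.
TRAIN = [
    ("you are a worthless idiot", 1),
    ("i hate you so much", 1),
    ("what a stupid take", 1),
    ("thanks for sharing this article", 0),
    ("great point, i agree with you", 0),
    ("see you at the meeting tomorrow", 0),
]

def tokenize(text):
    return text.lower().split()

def train(examples):
    """Count per-class word frequencies and class priors."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return word_counts, class_counts, vocab

def predict(text, word_counts, class_counts, vocab):
    """Return the most likely class using log-probabilities with Laplace smoothing."""
    total = sum(class_counts.values())
    scores = {}
    for label in (0, 1):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, class_counts, vocab = train(TRAIN)
print(predict("you stupid idiot", word_counts, class_counts, vocab))          # 1
print(predict("thanks for the great article", word_counts, class_counts, vocab))  # 0
```

In practice the surveyed systems replace this baseline with neural models and multilingual embeddings, but the pipeline shape (tokenize, fit on labeled data, score new text) is the same.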
Related papers
- Not-in-Perspective: Towards Shielding Google's Perspective API Against Adversarial Negation Attacks [1.675857332621569]
Cyberbullying has escalated the need for effective ways to monitor and moderate online interactions. Existing automated toxicity detection systems are based on machine or deep learning algorithms. We present a set of formal reasoning-based methodologies that wrap around existing machine learning toxicity detection systems.
arXiv Detail & Related papers (2026-02-10T02:27:28Z)
- Unveiling Covert Toxicity in Multimodal Data via Toxicity Association Graphs: A Graph-Based Metric and Interpretable Detection Framework [58.01529356381494]
We propose a novel detection framework based on Toxicity Association Graphs (TAGs). We introduce the first quantifiable metric for hidden toxicity, the Multimodal Toxicity Covertness (MTC). Our approach enables precise identification of covert toxicity while preserving full interpretability of the decision-making process.
arXiv Detail & Related papers (2026-02-03T08:54:25Z) - Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions [12.73085307172367]
The evolution of digital communication systems and the designs of online platforms have inadvertently facilitated the subconscious propagation of toxic behavior. This survey attempts to generate a comprehensive taxonomy of toxicity from various perspectives. It presents a holistic approach to explaining toxicity by understanding the context and environment that society faces in the Artificial Intelligence era.
arXiv Detail & Related papers (2025-09-29T21:55:23Z)
- Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA [0.0]
This dataset removes 7,531 toxic image-text pairs from the LLaVA pre-training dataset. We offer guidelines for implementing robust toxicity detection pipelines.
arXiv Detail & Related papers (2025-05-09T18:01:50Z)
- Ontology Embedding: A Survey of Methods, Applications and Resources [54.3453925775069]
Ontologies are widely used for representing domain knowledge and metadata. However, the logical reasoning they directly support is quite limited in learning, approximation, and prediction. One straightforward solution is to integrate statistical analysis and machine learning.
arXiv Detail & Related papers (2024-06-16T14:49:19Z)
- A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity [15.23055494327071]
Toxicity is an increasingly common and severe issue in online spaces.
A rich line of machine learning research has focused on computationally detecting and mitigating online toxicity.
Recent research has pointed out the importance of accounting for the subjective nature of this task.
arXiv Detail & Related papers (2023-11-07T21:00:51Z)
- Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation [50.46219766161111]
We develop a self-learning model to explore contrastive learning in the domain of misinformation identification.
Our model shows superior performance in detecting non-matched image-text pairs when training data is insufficient.
arXiv Detail & Related papers (2023-04-19T02:53:59Z)
- Countering Malicious Content Moderation Evasion in Online Social Networks: Simulation and Detection of Word Camouflage [64.78260098263489]
Twisting and camouflaging keywords are among the most used techniques to evade platform content moderation systems.
This article contributes significantly to countering malicious information by developing multilingual tools to simulate and detect new methods of content moderation evasion.
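The keyword-camouflage idea can be illustrated with a minimal sketch: normalize common character substitutions ("leetspeak") before matching tokens against a blocklist. The substitution map and blocklist below are invented examples, not the multilingual tools from the cited article.

```python
# Map common camouflage characters back to letters; an invented,
# deliberately small example of the normalization idea.
CAMOUFLAGE_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})

# Hypothetical blocklist for illustration only.
BLOCKLIST = {"scam", "idiot"}

def normalize(token: str) -> str:
    """Undo simple character substitutions and strip common separators."""
    return token.lower().translate(CAMOUFLAGE_MAP).replace(".", "").replace("-", "")

def is_camouflaged_match(token: str) -> bool:
    """True if the token matches a blocklisted word after normalization."""
    return normalize(token) in BLOCKLIST

print(is_camouflaged_match("1d10t"))   # True: normalizes to "idiot"
print(is_camouflaged_match("sc4m"))    # True: normalizes to "scam"
print(is_camouflaged_match("hello"))   # False
```

Real evasion detection must also handle insertion of separators, homoglyphs from other scripts, and language-specific spellings, which is why the cited work simulates new camouflage methods rather than relying on a fixed map.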
arXiv Detail & Related papers (2022-12-27T16:08:49Z)
- Panning for gold: Lessons learned from the platform-agnostic automated detection of political content in textual data [48.7576911714538]
We discuss how these techniques can be used to detect political content across different platforms.
We compare the performance of three groups of detection techniques relying on dictionaries, supervised machine learning, or neural networks.
Our results show the limited impact of preprocessing on model performance, with the best results for less noisy data being achieved by neural network- and machine-learning-based models.
arXiv Detail & Related papers (2022-07-01T15:23:23Z)
- Handling Bias in Toxic Speech Detection: A Survey [26.176340438312376]
We look at proposed methods for evaluating and mitigating bias in toxic speech detection.
A case study introduces the concept of bias shift due to knowledge-based bias mitigation.
The survey concludes with an overview of the critical challenges, research gaps, and future directions.
arXiv Detail & Related papers (2022-01-26T10:38:36Z)
- RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language, which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.