Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning
- URL: http://arxiv.org/abs/2505.17068v1
- Date: Mon, 19 May 2025 11:53:37 GMT
- Title: Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning
- Authors: Jorge Paz-Ruza, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas, Carlos Eiras-Franco
- Abstract summary: We propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics.
- Score: 2.9748898344267785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotion of dangerous, unscientific behaviour; common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which is often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics, and allowing us to prevent the pairing of conflicting users and subcommunities.
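The abstract describes a Collaborative Filtering approach that predicts a toxicity score for any user–subcommunity pair. The paper's own model, features, and data are not reproduced here; as a purely illustrative sketch under that framing, the idea can be shown as low-rank matrix factorization over a user × subcommunity matrix of observed toxicity rates, with all numbers below invented for illustration:

```python
import numpy as np

# Hypothetical user x subcommunity matrix of observed toxicity rates
# (NaN where a user has never interacted with that subcommunity).
rng = np.random.default_rng(0)
R = np.array([
    [0.9, 0.1, np.nan],
    [0.8, np.nan, 0.2],
    [np.nan, 0.2, 0.1],
])
mask = ~np.isnan(R)

n_users, n_items, k = R.shape[0], R.shape[1], 2
U = rng.normal(scale=0.1, size=(n_users, k))   # user latent factors
V = rng.normal(scale=0.1, size=(n_items, k))   # subcommunity latent factors

lr, reg = 0.05, 0.01
for _ in range(3000):                          # SGD over observed entries only
    for i, j in zip(*np.where(mask)):
        err = R[i, j] - U[i] @ V[j]
        U[i] += lr * (err * V[j] - reg * U[i])
        V[j] += lr * (err * U[i] - reg * V[j])

# Predicted toxicity for every user/subcommunity pair, including unseen ones;
# high-scoring pairs could be kept apart, as the paper proposes.
pred = U @ V.T
```

The factorization fills in unobserved pairs from latent structure, which is what lets such a system anticipate conflict before a user ever posts in a given subcommunity.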
Related papers
- Not-in-Perspective: Towards Shielding Google's Perspective API Against Adversarial Negation Attacks [1.675857332621569]
The rise of cyberbullying has escalated the need for effective ways to monitor and moderate online interactions. Existing automated toxicity detection systems are based on machine or deep learning algorithms. We present a set of formal reasoning-based methodologies that wrap around existing machine learning toxicity detection systems.
arXiv Detail & Related papers (2026-02-10T02:27:28Z) - Taming Toxic Talk: Using chatbots to intervene with users posting toxic comments [3.1918086432069663]
We explore the impact of rehabilitative conversations with generative AI chatbots on users who share toxic content online. We conducted a large-scale field experiment with seven Reddit communities. We did not observe a significant change in toxic behavior in the following month compared to a control group.
arXiv Detail & Related papers (2026-01-27T22:39:23Z) - Toxicity in Online Platforms and AI Systems: A Survey of Needs, Challenges, Mitigations, and Future Directions [12.73085307172367]
The evolution of digital communication systems and the designs of online platforms have inadvertently facilitated the subconscious propagation of toxic behavior. This survey attempts to generate a comprehensive taxonomy of toxicity from various perspectives. It presents a holistic approach to explain the toxicity by understanding the context and environment that society is facing in the Artificial Intelligence era.
arXiv Detail & Related papers (2025-09-29T21:55:23Z) - Identifying Constructive Conflict in Online Discussions through Controversial yet Toxicity Resilient Posts [41.130462443875736]
We operationalize controversiality to identify challenging dialogues and toxicity resilience to capture respectful conversations. We also find that political posts are often controversial and tend to attract more toxic responses. These findings suggest the potential for framing the tone of posts to encourage constructive political discussions.
arXiv Detail & Related papers (2025-09-22T18:30:41Z) - Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets.
Prompts in creative writing tasks can be 2x more likely to elicit toxic responses.
Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z) - ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation [43.356758428820626]
We introduce ToxicChat, a novel benchmark based on real user queries from an open-source chatbot.
Our systematic evaluation of models trained on existing toxicity datasets has shown their shortcomings when applied to this unique domain of ToxicChat.
In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.
arXiv Detail & Related papers (2023-10-26T13:35:41Z) - Forecasting User Interests Through Topic Tag Predictions in Online Health Communities [16.088586964818703]
This paper proposes an innovative approach to suggesting reliable information to participants in online communities.
We pose the problem of predicting topic tags that describe the future information needs of users based on their profiles.
The result is a variant of the collaborative information filtering or recommendation system tailored to the needs of users of online health communities.
arXiv Detail & Related papers (2022-11-05T00:09:45Z) - Twitter Users' Behavioral Response to Toxic Replies [1.2387676601792899]
We studied the impact of toxicity on users' online behavior on Twitter.
We found that toxicity victims show a combination of the following behavioral reactions: avoidance, revenge, countermeasures, and negotiation.
Our results can assist further studies in developing more effective detection and intervention methods for reducing the negative consequences of toxicity on social media.
arXiv Detail & Related papers (2022-10-24T17:36:58Z) - Handling Bias in Toxic Speech Detection: A Survey [26.176340438312376]
We look at proposed methods for evaluating and mitigating bias in toxic speech detection.
Case study introduces the concept of bias shift due to knowledge-based bias mitigation.
Survey concludes with an overview of the critical challenges, research gaps, and future directions.
arXiv Detail & Related papers (2022-01-26T10:38:36Z) - Toxicity Detection can be Sensitive to the Conversational Context [64.28043776806213]
We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels.
We introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context is also considered.
arXiv Detail & Related papers (2021-11-19T13:57:26Z) - Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields lower false positive rate in both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z) - RECAST: Enabling User Recourse and Interpretability of Toxicity Detection Models with Interactive Visualization [16.35961310670002]
We present our work, RECAST, an interactive, open-sourced web tool for visualizing toxicity detection models' predictions.
We found that RECAST was highly effective at helping users reduce toxicity as detected through the model.
This opens a discussion for how toxicity detection models work and should work, and their effect on the future of online discourse.
arXiv Detail & Related papers (2021-02-08T18:37:50Z) - RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language
Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z) - Assessing the Severity of Health States based on Social Media Posts [62.52087340582502]
We propose a multiview learning framework that models both the textual content as well as contextual-information to assess the severity of the user's health state.
The diverse NLU views demonstrate their effectiveness on both tasks, as well as on individual diseases, in assessing a user's health state.
arXiv Detail & Related papers (2020-09-21T03:45:14Z) - Using Sentiment Information for Preemptive Detection of Toxic Comments in Online Conversations [0.0]
Some authors have tried to predict if a conversation will derail into toxicity using the features of the first few messages.
We show how the sentiments expressed in the first messages of a conversation can help predict upcoming toxicity.
arXiv Detail & Related papers (2020-06-17T20:41:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.