Related papers: Exploring ChatGPT for Toxicity Detection in GitHub

Exploring ChatGPT for Toxicity Detection in GitHub

URL: http://arxiv.org/abs/2312.13105v1
Date: Wed, 20 Dec 2023 15:23:00 GMT
Title: Exploring ChatGPT for Toxicity Detection in GitHub
Authors: Shyamal Mishra, Preetha Chatterjee
Abstract summary: The prevalence of negative discourse, often manifested as toxic comments, poses significant challenges to developer well-being and productivity. To identify such negativity in project communications, automated toxicity detection models are necessary. To train these models effectively, we need large software engineering-specific toxicity datasets.
Score: 5.003898791753481
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fostering a collaborative and inclusive environment is crucial for the sustained progress of open source development. However, the prevalence of negative discourse, often manifested as toxic comments, poses significant challenges to developer well-being and productivity. To identify such negativity in project communications, especially within large projects, automated toxicity detection models are necessary. To train these models effectively, we need large software engineering-specific toxicity datasets. However, such datasets are limited in availability and often exhibit imbalance (e.g., only 6 in 1000 GitHub issues are toxic), posing challenges for training effective toxicity detection models. To address this problem, we explore a zero-shot LLM (ChatGPT) that is pre-trained on massive datasets but without being fine-tuned specifically for the task of detecting toxicity in software-related text. Our preliminary evaluation indicates that ChatGPT shows promise in detecting toxicity in GitHub, and warrants further investigation. We experimented with various prompts, including those designed for justifying model outputs, thereby enhancing model interpretability and paving the way for potential integration of ChatGPT-enabled toxicity detection into developer communication channels.

Related papers

"Silent Is Not Actually Silent": An Investigation of Toxicity on Bug Report Discussion [0.0]
This study explores toxicity in GitHub bug reports through a qualitative analysis of 203 bug threads, including 81 toxic ones. Our findings reveal that toxicity frequently arises from misaligned perceptions of bug severity and priority, unresolved frustrations with tools, and lapses in professional communication. Our preliminary findings offer actionable recommendations to improve bug resolution by mitigating toxicity.
arXiv Detail & Related papers (2025-03-13T05:39:29Z)
The Landscape of Toxicity: An Empirical Investigation of Toxicity on GitHub [3.0586855806896054]
profanity is the most frequent toxicity on GitHub, followed by trolling and insults. Corporate-sponsored projects are less toxic, but gaming projects are seven times more toxic than non-gaming ones. OSS contributors who have authored toxic comments in the past are significantly more likely to repeat such behavior.
arXiv Detail & Related papers (2025-02-12T09:24:59Z)
Analyzing Toxicity in Open Source Software Communications Using Psycholinguistics and Moral Foundations Theory [5.03553492616371]
This paper investigates a machine learning-based approach for the automatic detection of toxic communications in Open Source Software (OSS) We leverage psycholinguistic lexicons, and Moral Foundations Theory to analyze toxicity in two types of OSS communication channels; issue comments and code reviews. Using moral values as features is more effective than linguistic cues, resulting in 67.50% F1-measure in identifying toxic instances in code review data and 64.83% in issue comments.
arXiv Detail & Related papers (2024-12-17T17:52:00Z)
Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets. prompts in creative writing tasks can be 2x more likely to elicit toxic responses. Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z)
ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation [43.356758428820626]
We introduce ToxicChat, a novel benchmark based on real user queries from an open-source chatbots. Our systematic evaluation of models trained on existing toxicity datasets has shown their shortcomings when applied to this unique domain of ToxicChat. In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.
arXiv Detail & Related papers (2023-10-26T13:35:41Z)
Exploring Moral Principles Exhibited in OSS: A Case Study on GitHub Heated Issues [5.659436621527968]
We analyze toxic communications in GitHub issue threads to identify and understand five types of moral principles exhibited in text. Preliminary findings suggest a possible link between moral principles and toxic comments in OSS communications.
arXiv Detail & Related papers (2023-07-28T15:42:10Z)
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing [139.77117915309023]
CRITIC allows large language models to validate and amend their own outputs in a manner similar to human interaction with tools. Comprehensive evaluations involving free-form question answering, mathematical program synthesis, and toxicity reduction demonstrate that CRITIC consistently enhances the performance of LLMs.
arXiv Detail & Related papers (2023-05-19T15:19:44Z)
Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining. We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data. Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z)
Automated Identification of Toxic Code Reviews: How Far Can We Go? [7.655225472610752]
ToxiCR is a supervised learning-based toxicity identification tool for code review interactions. ToxiCR significantly outperforms existing toxicity detectors on our dataset.
arXiv Detail & Related papers (2022-02-26T04:27:39Z)
Toxicity Detection can be Sensitive to the Conversational Context [64.28043776806213]
We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels. We introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context is also considered.
arXiv Detail & Related papers (2021-11-19T13:57:26Z)
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans [2.4737119633827174]
In recent years, the widespread use of social media has led to an increase in the generation of toxic and offensive content on online platforms. Social media platforms have worked on developing automatic detection methods and employing human moderators to cope with this deluge of offensive content.
arXiv Detail & Related papers (2021-04-09T22:52:26Z)
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment. We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
RECAST: Interactive Auditing of Automatic Toxicity Detection Models [39.621867230707814]
We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.
arXiv Detail & Related papers (2020-01-07T00:17:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.