Automated Identification of Toxic Code Reviews: How Far Can We Go?
- URL: http://arxiv.org/abs/2202.13056v1
- Date: Sat, 26 Feb 2022 04:27:39 GMT
- Title: Automated Identification of Toxic Code Reviews: How Far Can We Go?
- Authors: Jaydeb Sarker, Asif Kamal Turzo, Ming Dong, Amiangshu Bosu
- Abstract summary: ToxiCR is a supervised learning-based toxicity identification tool for code review interactions.
ToxiCR significantly outperforms existing toxicity detectors on our dataset.
- Score: 7.655225472610752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Toxic conversations during software development interactions may have serious
repercussions on a Free and Open Source Software (FOSS) development project.
For example, victims of toxic conversations may become afraid to express
themselves, therefore get demotivated, and may eventually leave the project.
Automated filtering of toxic conversations may help a FOSS community to
maintain healthy interactions among its members. However, off-the-shelf
toxicity detectors perform poorly on Software Engineering (SE) datasets, such as
one curated from code review comments. To address this challenge, we present
ToxiCR, a supervised learning-based toxicity identification tool for code
review interactions. ToxiCR offers a choice among ten supervised learning
algorithms, an option to select a text vectorization technique, five mandatory
and three optional SE domain-specific preprocessing steps, and a large-scale
labeled dataset of 19,571 code review comments. With
our rigorous evaluation of the models with various combinations of
preprocessing steps and vectorization techniques, we have identified the best
combination for our dataset, which achieves 95.8% accuracy and an 88.9% F1 score.
ToxiCR significantly outperforms existing toxicity detectors on our dataset. We
have publicly released our dataset, pretrained models, evaluation results, and
source code at: https://github.com/WSU-SEAL/ToxiCR.
Related papers
- Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings [77.20838441870151]
Commit message generation is a crucial task in software engineering that is challenging to evaluate correctly.
We use an online metric - the number of edits users introduce before committing the generated messages to the VCS - to select metrics for offline experiments.
Our results indicate that edit distance exhibits the highest correlation, whereas commonly used similarity metrics such as BLEU and METEOR demonstrate low correlation.
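The offline metric found to correlate best, edit distance, is standard Levenshtein distance; a minimal dynamic-programming sketch (the exact variant and granularity used by the paper may differ):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between strings a and b, computed
    row by row to keep memory at O(len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# One character inserted between a generated and a committed message:
print(edit_distance("fix typo in parser", "fix typos in parser"))  # 1
```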
arXiv Detail & Related papers (2024-10-15T20:32:07Z) - GPT-DETOX: An In-Context Learning-Based Paraphraser for Text Detoxification [1.8295720742100332]
We propose GPT-DETOX as a framework for prompt-based in-context learning for text detoxification using GPT-3.5 Turbo.
To generate few-shot prompts, we propose two methods: word-matching example selection (WMES) and context-matching example selection (CMES).
We take into account ensemble in-context learning (EICL) where the ensemble is shaped by base prompts from zero-shot and all few-shot settings.
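The summary does not spell out how WMES scores candidate examples; one plausible reading is word-overlap (Jaccard) similarity between the query and each stored example. The scorer, the pair format, and the data below are all hypothetical:

```python
def word_match_score(query: str, example: str) -> float:
    """Jaccard overlap between the word sets of two sentences."""
    q, e = set(query.lower().split()), set(example.lower().split())
    return len(q & e) / len(q | e)

def select_examples(query, pool, k=2):
    """Pick the k (toxic, detoxified) pairs whose toxic side best
    word-matches the query, for use as few-shot demonstrations."""
    return sorted(pool, key=lambda ex: word_match_score(query, ex[0]),
                  reverse=True)[:k]

# Hypothetical (toxic, detoxified) example pool.
pool = [
    ("you are a complete idiot", "that was not correct"),
    ("this is wonderful work", "this is wonderful work"),
    ("what an idiot move", "that was an unwise move"),
]
for toxic, clean in select_examples("stop being an idiot", pool, k=2):
    print(toxic, "->", clean)
```

CMES would replace the word-set overlap with a similarity over sentence embeddings, keeping the same top-k selection.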
arXiv Detail & Related papers (2024-04-03T20:35:36Z) - Exploring ChatGPT for Toxicity Detection in GitHub [5.003898791753481]
The prevalence of negative discourse, often manifested as toxic comments, poses significant challenges to developer well-being and productivity.
To identify such negativity in project communications, automated toxicity detection models are necessary.
To train these models effectively, we need large software engineering-specific toxicity datasets.
arXiv Detail & Related papers (2023-12-20T15:23:00Z) - ToxiSpanSE: An Explainable Toxicity Detection in Code Review Comments [4.949881799107062]
ToxiSpanSE is the first tool to detect toxic spans in the Software Engineering (SE) domain.
Our model achieved the best score for toxic-class tokens, with 0.88 $F1$, 0.87 precision, and 0.93 recall.
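Token-level precision, recall, and $F1$ of this kind compare the set of predicted toxic-token positions against the annotated ones; an illustrative computation (hypothetical spans, not the ToxiSpanSE evaluation code):

```python
def token_prf(pred_tokens, gold_tokens):
    """Precision, recall, and F1 over sets of toxic token indices."""
    pred, gold = set(pred_tokens), set(gold_tokens)
    tp = len(pred & gold)                          # correctly flagged tokens
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Predicted toxic span covers tokens 3-6; annotators marked 4-6.
p, r, f1 = token_prf({3, 4, 5, 6}, {4, 5, 6})
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.75 1.0 0.86
```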
arXiv Detail & Related papers (2023-07-07T04:55:11Z) - EnDex: Evaluation of Dialogue Engagingness at Scale [30.15445159524315]
We propose EnDex, the first human-reaction based model to evaluate dialogue engagingness.
We will release code, off-the-shelf EnDex model, and a large-scale dataset upon paper publication.
arXiv Detail & Related papers (2022-10-22T06:09:43Z) - BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detection [63.447493500066045]
This work proposes a data-driven learning model for the synthesis of keystroke biometric data.
The proposed method is compared with two statistical approaches based on Universal and User-dependent models.
Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z) - PoisonedEncoder: Poisoning the Unlabeled Pre-training Data in Contrastive Learning [69.70602220716718]
We propose PoisonedEncoder, a data poisoning attack to contrastive learning.
In particular, an attacker injects carefully crafted poisoning inputs into the unlabeled pre-training data.
We evaluate five defenses against PoisonedEncoder: one pre-processing defense, three in-processing defenses, and one post-processing defense.
arXiv Detail & Related papers (2022-05-13T00:15:44Z) - Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open source so that future work can compare against it.
arXiv Detail & Related papers (2021-07-01T08:58:16Z) - Lone Pine at SemEval-2021 Task 5: Fine-Grained Detection of Hate Speech Using BERToxic [2.4815579733050153]
This paper describes our approach to the Toxic Spans Detection problem.
We propose BERToxic, a system that fine-tunes a pre-trained BERT model to locate toxic text spans in a given text.
Our system significantly outperformed the provided baseline and achieved an F1-score of 0.683, placing Lone Pine 17th out of 91 teams in the competition.
arXiv Detail & Related papers (2021-04-08T04:46:14Z) - Exploiting Unsupervised Data for Emotion Recognition in Conversations [76.01690906995286]
Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations.
The available supervised data for the ERC task is limited.
We propose a novel approach to leverage unsupervised conversation data.
arXiv Detail & Related papers (2020-10-02T13:28:47Z) - RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language
Models [93.151822563361]
Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language which hinders their safe deployment.
We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration.
arXiv Detail & Related papers (2020-09-24T03:17:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.