Cyberbullying Detection with Fairness Constraints
- URL: http://arxiv.org/abs/2005.06625v2
- Date: Tue, 29 Sep 2020 21:54:00 GMT
- Title: Cyberbullying Detection with Fairness Constraints
- Authors: Oguzhan Gencoglu
- Abstract summary: We propose a model training scheme that can employ fairness constraints and validate our approach with different datasets.
We believe our work contributes to the pursuit of unbiased, transparent, and ethical machine learning solutions for cyber-social health.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cyberbullying is a widespread adverse phenomenon among online social
interactions in today's digital society. While numerous computational studies
focus on enhancing the cyberbullying detection performance of machine learning
algorithms, proposed models tend to carry and reinforce unintended social
biases. In this study, we try to answer the research question of "Can we
mitigate the unintended bias of cyberbullying detection models by guiding the
model training with fairness constraints?". For this purpose, we propose a
model training scheme that can employ fairness constraints and validate our
approach with different datasets. We demonstrate that various types of
unintended biases can be successfully mitigated without impairing the model
quality. We believe our work contributes to the pursuit of unbiased,
transparent, and ethical machine learning solutions for cyber-social health.
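The abstract does not spell out how the constraints enter training; as a rough illustration of one common approach, the sketch below folds a group-level false-positive-rate gap constraint into an ordinary classification loss and tightens a Lagrange multiplier whenever the constraint is violated. The PyTorch code, the linear model, the choice of constraint, and the inputs `X`, `y`, and `group` are illustrative assumptions, not the paper's exact training scheme.

```python
# Minimal sketch of fairness-constrained training (primal-dual / Lagrangian penalty).
# All names and the constraint choice are hypothetical; see the paper for its actual setup.
import torch
import torch.nn as nn

def soft_fpr(probas, labels, mask):
    """Approximate false-positive rate on the true negatives of one demographic group."""
    negatives = (labels == 0) & mask
    if negatives.any():
        return probas[negatives].mean()
    return probas.new_zeros(())

def train_with_fairness_constraint(X, y, group, epsilon=0.02, epochs=200):
    """Minimize task loss subject to |FPR(group 0) - FPR(group 1)| <= epsilon."""
    model = nn.Linear(X.shape[1], 1)        # simple linear classifier over precomputed text features
    lam = 0.0                               # Lagrange multiplier for the fairness constraint
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    bce = nn.BCEWithLogitsLoss()

    for _ in range(epochs):
        logits = model(X).squeeze(-1)
        probas = torch.sigmoid(logits)
        task_loss = bce(logits, y.float())
        gap = (soft_fpr(probas, y, group == 0) - soft_fpr(probas, y, group == 1)).abs()
        violation = torch.clamp(gap - epsilon, min=0.0)
        loss = task_loss + lam * violation  # penalize the constraint violation

        opt.zero_grad()
        loss.backward()
        opt.step()
        lam += 0.5 * float(violation)       # dual ascent: raise the penalty while the constraint is violated
    return model
```

Swapping the false-positive-rate gap for a false-negative-rate or demographic-parity gap would change only the helper computing the group statistic; the same primal-dual loop applies.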
Related papers
- Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
- The Fairness Stitch: Unveiling the Potential of Model Stitching in Neural Network De-Biasing [0.043512163406552]
This study introduces a novel method called "The Fairness Stitch" to enhance fairness in deep learning models.
We conduct a comprehensive evaluation on two well-known datasets, CelebA and UTKFace.
Our findings reveal a notable improvement in achieving a balanced trade-off between fairness and performance.
arXiv Detail & Related papers (2023-11-06T21:14:37Z)
- Designing an attack-defense game: how to increase robustness of financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as the input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z)
- Session-based Cyberbullying Detection in Social Media: A Survey [16.39344929765961]
We define the Session-based Cyberbullying Detection framework that encapsulates the different steps and challenges of the problem.
Our review leads us to propose evidence-based criteria for a set of best practices to create session-based cyberbullying datasets.
arXiv Detail & Related papers (2022-07-14T18:56:54Z)
- A Framework for Understanding Model Extraction Attack and Defense [48.421636548746704]
We study tradeoffs between model utility from a benign user's view and privacy from an adversary's view.
We develop new metrics to quantify such tradeoffs, analyze their theoretical properties, and develop an optimization problem to understand the optimal adversarial attack and defense strategies.
arXiv Detail & Related papers (2022-06-23T05:24:52Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Fairness-Aware Online Personalization [16.320648868892526]
We present a study of fairness in online personalization settings involving the ranking of individuals.
We first demonstrate that online personalization can cause the model to learn to act in an unfair manner if the user is biased in his/her responses.
We then formulate the problem of learning personalized models under fairness constraints and present a regularization based approach for mitigating biases in machine learning.
arXiv Detail & Related papers (2020-07-30T07:16:17Z)
- Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification [4.945634077636197]
We study the nuanced problem of cyberbullying using five explicit factors to represent its social and linguistic aspects.
These results demonstrate the importance of representing and modeling cyberbullying as a social phenomenon.
arXiv Detail & Related papers (2020-04-04T00:35:16Z)
- FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret [42.66567001275493]
It is now accepted that, because of biases in the datasets presented to models, fairness-oblivious training will lead to unfair models.
Here, we study mechanisms that impose fairness concurrently while training the model.
arXiv Detail & Related papers (2020-04-03T03:18:53Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
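As a companion to the last entry above, the sketch below illustrates one way GAN-based counterfactual auditing can work: the latent code of a pretrained generator is optimized so the audited classifier flips to a target class while the sample stays on the generator's data manifold. The generator `G`, classifier `f`, latent dimension, and the weighted-sum objective (standing in for that paper's multi-objective search) are hypothetical assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch: search a pretrained GAN's latent space for a plausible counterfactual
# that the audited classifier assigns to a chosen target class.
import torch

def plausible_counterfactual(G, f, target_class, latent_dim=128, steps=300, lam=0.1):
    z = torch.randn(1, latent_dim, requires_grad=True)    # latent code to optimize
    z_init = z.detach().clone()
    opt = torch.optim.Adam([z], lr=5e-2)
    target = torch.tensor([target_class])
    for _ in range(steps):
        x = G(z)                                           # sample stays on the GAN's data manifold
        logits = f(x)
        # Objective 1: push the audited model toward the target class.
        attack_loss = torch.nn.functional.cross_entropy(logits, target)
        # Objective 2: stay near the initial latent code as a crude plausibility proxy.
        proximity = (z - z_init).pow(2).mean()
        loss = attack_loss + lam * proximity               # weighted sum of the two objectives
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(z).detach()                                   # counterfactual example for auditing
```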