Cyberbullying Detection with Fairness Constraints
- URL: http://arxiv.org/abs/2005.06625v2
- Date: Tue, 29 Sep 2020 21:54:00 GMT
- Title: Cyberbullying Detection with Fairness Constraints
- Authors: Oguzhan Gencoglu
- Abstract summary: We propose a model training scheme that can employ fairness constraints and validate our approach with different datasets.
We believe our work contributes to the pursuit of unbiased, transparent, and ethical machine learning solutions for cyber-social health.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cyberbullying is a widespread adverse phenomenon among online social
interactions in today's digital society. While numerous computational studies
focus on enhancing the cyberbullying detection performance of machine learning
algorithms, proposed models tend to carry and reinforce unintended social
biases. In this study, we try to answer the research question of "Can we
mitigate the unintended bias of cyberbullying detection models by guiding the
model training with fairness constraints?". For this purpose, we propose a
model training scheme that can employ fairness constraints and validate our
approach with different datasets. We demonstrate that various types of
unintended biases can be successfully mitigated without impairing the model
quality. We believe our work contributes to the pursuit of unbiased,
transparent, and ethical machine learning solutions for cyber-social health.
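The abstract does not spell out how the constraints enter training; as a rough illustration of one common approach, the sketch below folds a group-level false-positive-rate gap constraint into an ordinary classification loss and tightens a Lagrange multiplier whenever the constraint is violated. The PyTorch code, the linear model, the choice of constraint, and the inputs `X`, `y`, and `group` are illustrative assumptions, not the paper's exact training scheme.

```python
# Minimal sketch of fairness-constrained training (primal-dual / Lagrangian penalty).
# All names and the constraint choice are hypothetical; see the paper for its actual setup.
import torch
import torch.nn as nn

def soft_fpr(probas, labels, mask):
    """Approximate false-positive rate on the true negatives of one demographic group."""
    negatives = (labels == 0) & mask
    if negatives.any():
        return probas[negatives].mean()
    return probas.new_zeros(())

def train_with_fairness_constraint(X, y, group, epsilon=0.02, epochs=200):
    """Minimize task loss subject to |FPR(group 0) - FPR(group 1)| <= epsilon."""
    model = nn.Linear(X.shape[1], 1)        # simple linear classifier over precomputed text features
    lam = 0.0                               # Lagrange multiplier for the fairness constraint
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    bce = nn.BCEWithLogitsLoss()

    for _ in range(epochs):
        logits = model(X).squeeze(-1)
        probas = torch.sigmoid(logits)
        task_loss = bce(logits, y.float())
        gap = (soft_fpr(probas, y, group == 0) - soft_fpr(probas, y, group == 1)).abs()
        violation = torch.clamp(gap - epsilon, min=0.0)
        loss = task_loss + lam * violation  # penalize the constraint violation

        opt.zero_grad()
        loss.backward()
        opt.step()
        lam += 0.5 * float(violation)       # dual ascent: raise the penalty while the constraint is violated
    return model
```

Swapping the false-positive-rate gap for a false-negative-rate or demographic-parity gap would change only the helper computing the group statistic; the same primal-dual loop applies.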
Related papers
- Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
- The Fairness Stitch: Unveiling the Potential of Model Stitching in Neural Network De-Biasing [0.043512163406552]
This study introduces a novel method called "The Fairness Stitch" to enhance fairness in deep learning models.
We conduct a comprehensive evaluation on two well-known datasets, CelebA and UTKFace.
Our findings reveal a notable improvement in achieving a balanced trade-off between fairness and performance.
arXiv Detail & Related papers (2023-11-06T21:14:37Z)
- Designing an attack-defense game: how to increase robustness of financial transaction models via a competition [69.08339915577206]
Given the escalating risks of malicious attacks in the finance sector, understanding adversarial strategies and robust defense mechanisms for machine learning models is critical.
We aim to investigate the current state and dynamics of adversarial attacks and defenses for neural network models that use sequential financial data as the input.
We have designed a competition that allows realistic and detailed investigation of problems in modern financial transaction data.
The participants compete directly against each other, so possible attacks and defenses are examined in close-to-real-life conditions.
arXiv Detail & Related papers (2023-08-22T12:53:09Z)
- Session-based Cyberbullying Detection in Social Media: A Survey [16.39344929765961]
We define the Session-based Cyberbullying Detection framework that encapsulates the different steps and challenges of the problem.
Our review leads us to propose evidence-based criteria for a set of best practices to create session-based cyberbullying datasets.
arXiv Detail & Related papers (2022-07-14T18:56:54Z)
- A Framework for Understanding Model Extraction Attack and Defense [48.421636548746704]
We study tradeoffs between model utility from a benign user's view and privacy from an adversary's view.
We develop new metrics to quantify such tradeoffs, analyze their theoretical properties, and develop an optimization problem to understand the optimal adversarial attack and defense strategies.
arXiv Detail & Related papers (2022-06-23T05:24:52Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
- Fairness-Aware Online Personalization [16.320648868892526]
We present a study of fairness in online personalization settings involving the ranking of individuals.
We first demonstrate that online personalization can cause the model to learn to act in an unfair manner if the user is biased in his/her responses.
We then formulate the problem of learning personalized models under fairness constraints and present a regularization based approach for mitigating biases in machine learning.
arXiv Detail & Related papers (2020-07-30T07:16:17Z)
- Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification [4.945634077636197]
We study the nuanced problem of cyberbullying using five explicit factors to represent its social and linguistic aspects.
These results demonstrate the importance of representing and modeling cyberbullying as a social phenomenon.
arXiv Detail & Related papers (2020-04-04T00:35:16Z)
- FairALM: Augmented Lagrangian Method for Training Fair Models with Little Regret [42.66567001275493]
It is now accepted that, because of biases in the datasets presented to models, fairness-oblivious training will lead to unfair models.
Here, we study mechanisms that impose fairness concurrently while training the model.
arXiv Detail & Related papers (2020-04-03T03:18:53Z)
- Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of Deep Learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
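As a companion to the last entry above, the sketch below illustrates one way GAN-based counterfactual auditing can work: the latent code of a pretrained generator is optimized so the audited classifier flips to a target class while the sample stays on the generator's data manifold. The generator `G`, classifier `f`, latent dimension, and the weighted-sum objective (standing in for that paper's multi-objective search) are hypothetical assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch: search a pretrained GAN's latent space for a plausible counterfactual
# that the audited classifier assigns to a chosen target class.
import torch

def plausible_counterfactual(G, f, target_class, latent_dim=128, steps=300, lam=0.1):
    z = torch.randn(1, latent_dim, requires_grad=True)    # latent code to optimize
    z_init = z.detach().clone()
    opt = torch.optim.Adam([z], lr=5e-2)
    target = torch.tensor([target_class])
    for _ in range(steps):
        x = G(z)                                           # sample stays on the GAN's data manifold
        logits = f(x)
        # Objective 1: push the audited model toward the target class.
        attack_loss = torch.nn.functional.cross_entropy(logits, target)
        # Objective 2: stay near the initial latent code as a crude plausibility proxy.
        proximity = (z - z_init).pow(2).mean()
        loss = attack_loss + lam * proximity               # weighted sum of the two objectives
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(z).detach()                                   # counterfactual example for auditing
```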