Provable Fairness for Neural Network Models using Formal Verification
- URL: http://arxiv.org/abs/2212.08578v1
- Date: Fri, 16 Dec 2022 16:54:37 GMT
- Title: Provable Fairness for Neural Network Models using Formal Verification
- Authors: Giorgian Borca-Tasciuc, Xingzhi Guo, Stanley Bak, Steven Skiena
- Abstract summary: We propose techniques to prove fairness using recently developed formal methods that verify properties of neural network models.
We show that through proper training, we can reduce unfairness by an average of 65.4% at a cost of less than 1% in AUC score.
- Score: 10.90121002896312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are increasingly deployed for critical
decision-making tasks, making it important to verify that they do not contain
gender or racial biases picked up from training data. Typical approaches to
achieve fairness revolve around efforts to clean or curate training data, with
post-hoc statistical evaluation of the fairness of the model on evaluation
data. In contrast, we propose techniques to prove fairness using
recently developed formal methods that verify properties of neural network
models. Beyond the strength of guarantee implied by a formal proof, our methods
have the advantage that we do not need explicit training or evaluation data
(which is often proprietary) in order to analyze a given trained model. In
experiments on two familiar datasets in the fairness literature (COMPAS and
ADULTS), we show that through proper training, we can reduce unfairness by an
average of 65.4% at a cost of less than 1% in AUC score.
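
To make the verification setup concrete, here is a minimal sketch, not the authors' tool or trained models: it encodes a toy two-neuron ReLU scorer in the z3 SMT solver and asks whether flipping a binary protected attribute can ever change the decision for any admissible input. The network weights, the normalized feature range, and the 0.5 decision threshold are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): individual-fairness query over a
# toy ReLU scorer, posed as an SMT problem with z3. All weights are invented.
from z3 import Real, Solver, If, Xor, unsat

# Toy scorer: score(x) = w2 . relu(W1 @ x + b1) + b2, with x = [income, protected].
W1 = [[0.8, 0.1], [-0.5, 0.3]]
b1 = [0.2, -0.1]
w2 = [1.0, -0.7]
b2 = 0.05

def score(x):
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [If(z > 0, z, 0) for z in pre]   # ReLU encoded symbolically
    return sum(w * h for w, h in zip(w2, hidden)) + b2

income = Real("income")
solver = Solver()
solver.add(income >= 0, income <= 1)          # assumed normalized feature range

score_a = score([income, 0])                  # protected attribute = 0
score_b = score([income, 1])                  # protected attribute = 1

# Counterexample query: is there any admissible income for which the decisions differ?
solver.add(Xor(score_a >= 0.5, score_b >= 0.5))

if solver.check() == unsat:
    print("Provably fair: the decision never depends on the protected attribute.")
else:
    print("Counterexample:", solver.model())
```

An unsat answer means no such input exists, so the (toy) property holds for every input in the range; a sat answer comes with a concrete counterexample. This certificate-or-counterexample behavior is what separates formal verification from post-hoc statistical fairness testing on evaluation data.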
Related papers
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Data vs. Model Machine Learning Fairness Testing: An Empirical Study [23.535630175567146]
We take the first steps towards evaluating a more holistic approach by testing for fairness both before and after model training.
We evaluate the effectiveness of the proposed approach using an empirical analysis of the relationship between model-dependent and model-independent fairness metrics.
Our results indicate that testing for fairness prior to training can be a "cheap" and effective means of catching a biased data collection process early.
arXiv Detail & Related papers (2024-01-15T14:14:16Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Tools for Verifying Neural Models' Training Data [29.322899317216407]
"Proof-of-Training-Data" allows a model trainer to convince a Verifier of the training data that produced a set of model weights.
We show experimentally that our verification procedures can catch a wide variety of attacks.
arXiv Detail & Related papers (2023-07-02T23:27:00Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new metric to evaluate and compare the effective robustness of models trained on different data.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- fAux: Testing Individual Fairness via Gradient Alignment [2.5329739965085785]
We describe a new approach for testing individual fairness that avoids the requirements imposed by prior approaches.
We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-10-10T21:27:20Z)
- FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize a k-NN non-parametric density estimation technique to estimate the unknown probability distributions of the data samples in the output feature space (a generic sketch of this estimator appears after this list).
arXiv Detail & Related papers (2021-08-26T14:01:04Z)
- Fairness in the Eyes of the Data: Certifying Machine-Learning Models [38.09830406613629]
We present a framework that allows one to certify the fairness degree of a model based on an interactive and privacy-preserving test.
We tackle two scenarios, where either the test data is privately available only to the tester or is publicly known in advance, even to the model creator.
We provide a cryptographic technique to automate fairness testing and certified inference with only black-box access to the model at hand while hiding the participants' sensitive data.
arXiv Detail & Related papers (2020-09-03T09:22:39Z)
- FR-Train: A Mutual Information-Based Approach to Fair and Robust Training [33.385118640843416]
We propose FR-Train, which holistically performs fair and robust model training.
In our experiments, FR-Train shows almost no decrease in fairness and accuracy in the presence of data poisoning.
arXiv Detail & Related papers (2020-02-24T13:37:29Z)
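
As a reference for the OSAKD entry above, which relies on k-NN non-parametric density estimation, here is a small generic sketch of that estimator (not the paper's code): p_hat(x) = k / (n * V_k(x)), where V_k(x) is the volume of the d-dimensional ball reaching the k-th nearest training sample. The toy data, dimensionality, and choice of k are illustrative assumptions.

```python
# Generic k-NN density estimator (illustrative, not OSAKD's implementation).
import numpy as np
from math import gamma, pi

def knn_density(queries, samples, k=5):
    n, d = samples.shape
    # Euclidean distances from each query point to every training sample
    dists = np.linalg.norm(queries[:, None, :] - samples[None, :, :], axis=-1)
    r_k = np.sort(dists, axis=1)[:, k - 1]           # distance to the k-th neighbor
    unit_ball = pi ** (d / 2) / gamma(d / 2 + 1)      # volume of the unit d-ball
    return k / (n * unit_ball * r_k ** d)             # p_hat(x) = k / (n * V_k(x))

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))                   # toy 2-D feature vectors
queries = np.array([[0.0, 0.0], [3.0, 3.0]])
print(knn_density(queries, samples, k=10))            # higher density near the mode
```

Larger k smooths the estimate at the cost of resolution; in the setting described above, the toy Gaussian samples would be replaced by the model's output-space features.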
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated information and is not responsible for any consequences of its use.