Provable Fairness for Neural Network Models using Formal Verification
- URL: http://arxiv.org/abs/2212.08578v1
- Date: Fri, 16 Dec 2022 16:54:37 GMT
- Title: Provable Fairness for Neural Network Models using Formal Verification
- Authors: Giorgian Borca-Tasciuc, Xingzhi Guo, Stanley Bak, Steven Skiena
- Abstract summary: We propose techniques to prove fairness using recently developed formal methods that verify properties of neural network models.
We show that through proper training, we can reduce unfairness by an average of 65.4% at a cost of less than 1% in AUC score.
- Score: 10.90121002896312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are increasingly deployed for critical
decision-making tasks, making it important to verify that they do not contain
gender or racial biases picked up from training data. Typical approaches to
achieve fairness revolve around efforts to clean or curate training data, with
post-hoc statistical evaluation of the fairness of the model on evaluation
data. In contrast, we propose techniques to prove fairness using
recently developed formal methods that verify properties of neural network
models. Beyond the strength of guarantee implied by a formal proof, our methods
have the advantage that we do not need explicit training or evaluation data
(which is often proprietary) in order to analyze a given trained model. In
experiments on two familiar datasets in the fairness literature (COMPAS and
ADULTS), we show that through proper training, we can reduce unfairness by an
average of 65.4% at a cost of less than 1% in AUC score.
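
To make the verification setup concrete, here is a minimal sketch, not the authors' tool or trained models: it encodes a toy two-neuron ReLU scorer in the z3 SMT solver and asks whether flipping a binary protected attribute can ever change the decision for any admissible input. The network weights, the normalized feature range, and the 0.5 decision threshold are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): individual-fairness query over a
# toy ReLU scorer, posed as an SMT problem with z3. All weights are invented.
from z3 import Real, Solver, If, Xor, unsat

# Toy scorer: score(x) = w2 . relu(W1 @ x + b1) + b2, with x = [income, protected].
W1 = [[0.8, 0.1], [-0.5, 0.3]]
b1 = [0.2, -0.1]
w2 = [1.0, -0.7]
b2 = 0.05

def score(x):
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    hidden = [If(z > 0, z, 0) for z in pre]   # ReLU encoded symbolically
    return sum(w * h for w, h in zip(w2, hidden)) + b2

income = Real("income")
solver = Solver()
solver.add(income >= 0, income <= 1)          # assumed normalized feature range

score_a = score([income, 0])                  # protected attribute = 0
score_b = score([income, 1])                  # protected attribute = 1

# Counterexample query: is there any admissible income for which the decisions differ?
solver.add(Xor(score_a >= 0.5, score_b >= 0.5))

if solver.check() == unsat:
    print("Provably fair: the decision never depends on the protected attribute.")
else:
    print("Counterexample:", solver.model())
```

An unsat answer means no such input exists, so the (toy) property holds for every input in the range; a sat answer comes with a concrete counterexample. This certificate-or-counterexample behavior is what separates formal verification from post-hoc statistical fairness testing on evaluation data.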
Related papers
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Data vs. Model Machine Learning Fairness Testing: An Empirical Study [23.535630175567146]
We take the first steps towards evaluating a more holistic approach by testing for fairness both before and after model training.
We evaluate the effectiveness of the proposed approach using an empirical analysis of the relationship between model-dependent and model-independent fairness metrics.
Our results indicate that testing for fairness prior to training can be a "cheap" and effective means of catching a biased data collection process early.
arXiv Detail & Related papers (2024-01-15T14:14:16Z)
- Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z)
- Tools for Verifying Neural Models' Training Data [29.322899317216407]
"Proof-of-Training-Data" allows a model trainer to convince a Verifier of the training data that produced a set of model weights.
We show experimentally that our verification procedures can catch a wide variety of attacks.
arXiv Detail & Related papers (2023-07-02T23:27:00Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new metric to evaluate and compare the effective robustness of models trained on different data.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- fAux: Testing Individual Fairness via Gradient Alignment [2.5329739965085785]
We describe a new approach for testing individual fairness that avoids the requirements imposed by prior approaches.
We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets.
arXiv Detail & Related papers (2022-10-10T21:27:20Z)
- FairIF: Boosting Fairness in Deep Learning via Influence Functions with Validation Set Sensitive Attributes [51.02407217197623]
We propose a two-stage training algorithm named FAIRIF.
It minimizes the loss over a reweighted data set, where the sample weights are computed via influence functions using a validation set with sensitive attributes.
We show that FAIRIF yields models with better fairness-utility trade-offs against various types of bias.
arXiv Detail & Related papers (2022-01-15T05:14:48Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation [51.66271681532262]
Online Self-Acquired Knowledge Distillation (OSAKD) is proposed, aiming to improve the performance of any deep neural model in an online manner.
We utilize a k-NN non-parametric density estimation technique to estimate the unknown probability distributions of the data samples in the output feature space (a generic sketch of this estimator appears after this list).
arXiv Detail & Related papers (2021-08-26T14:01:04Z)
- Fairness in the Eyes of the Data: Certifying Machine-Learning Models [38.09830406613629]
We present a framework that allows one to certify the fairness degree of a model based on an interactive and privacy-preserving test.
We tackle two scenarios, where either the test data is privately available only to the tester or is publicly known in advance, even to the model creator.
We provide a cryptographic technique to automate fairness testing and certified inference with only black-box access to the model at hand while hiding the participants' sensitive data.
arXiv Detail & Related papers (2020-09-03T09:22:39Z)
- FR-Train: A Mutual Information-Based Approach to Fair and Robust Training [33.385118640843416]
We propose FR-Train, which holistically performs fair and robust model training.
In our experiments, FR-Train shows almost no decrease in fairness and accuracy in the presence of data poisoning.
arXiv Detail & Related papers (2020-02-24T13:37:29Z)
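
As a reference for the OSAKD entry above, which relies on k-NN non-parametric density estimation, here is a small generic sketch of that estimator (not the paper's code): p_hat(x) = k / (n * V_k(x)), where V_k(x) is the volume of the d-dimensional ball reaching the k-th nearest training sample. The toy data, dimensionality, and choice of k are illustrative assumptions.

```python
# Generic k-NN density estimator (illustrative, not OSAKD's implementation).
import numpy as np
from math import gamma, pi

def knn_density(queries, samples, k=5):
    n, d = samples.shape
    # Euclidean distances from each query point to every training sample
    dists = np.linalg.norm(queries[:, None, :] - samples[None, :, :], axis=-1)
    r_k = np.sort(dists, axis=1)[:, k - 1]           # distance to the k-th neighbor
    unit_ball = pi ** (d / 2) / gamma(d / 2 + 1)      # volume of the unit d-ball
    return k / (n * unit_ball * r_k ** d)             # p_hat(x) = k / (n * V_k(x))

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 2))                   # toy 2-D feature vectors
queries = np.array([[0.0, 0.0], [3.0, 3.0]])
print(knn_density(queries, samples, k=10))            # higher density near the mode
```

Larger k smooths the estimate at the cost of resolution; in the setting described above, the toy Gaussian samples would be replaced by the model's output-space features.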
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated information and is not responsible for any consequences of its use.