Information-Theoretic Testing and Debugging of Fairness Defects in Deep
Neural Networks
- URL: http://arxiv.org/abs/2304.04199v1
- Date: Sun, 9 Apr 2023 09:16:27 GMT
- Title: Information-Theoretic Testing and Debugging of Fairness Defects in Deep
Neural Networks
- Authors: Verya Monjezi and Ashutosh Trivedi and Gang Tan and Saeid Tizpaz-Niari
- Abstract summary: Deep feedforward neural networks (DNNs) are increasingly deployed in socioeconomically critical decision-support software systems.
We present DICE: an information-theoretic testing and debug framework to discover and localize fairness defects in DNNs.
We show that DICE efficiently characterizes the amounts of discrimination, effectively generates discriminatory instances, and localizes layers/neurons with significant biases.
- Score: 13.425444923812586
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The deep feedforward neural networks (DNNs) are increasingly deployed in
socioeconomic critical decision support software systems. DNNs are
exceptionally good at finding minimal, sufficient statistical patterns within
their training data. Consequently, DNNs may learn to encode decisions --
amplifying existing biases or introducing new ones -- that may disadvantage
protected individuals/groups and may stand to violate legal protections. While
existing search-based software testing approaches have been effective at
discovering fairness defects, they do not supplement these defects with
debugging aids -- such as severity and causal explanations -- that are crucial
to help developers triage them and decide on the next course of action. Can we
measure the severity of fairness defects in DNNs? Are these defects symptomatic
of improper training, or do they merely reflect biases present in the training
data? To answer
such questions, we present DICE: an information-theoretic testing and debugging
framework to discover and localize fairness defects in DNNs.
The key goal of DICE is to assist software developers in triaging fairness
defects by ordering them by their severity. Towards this goal, we quantify
fairness in terms of protected information (in bits) used in decision making. A
quantitative view of fairness defects not only helps in ordering these defects;
our empirical evaluation also shows that it improves search efficiency due to
the resulting smoothness of the search space. Guided by this quantitative view
of fairness,
we present a causal debugging framework to localize inadequately trained layers
and neurons responsible for fairness defects. Our experiments over ten DNNs,
developed for socially critical tasks, show that DICE efficiently characterizes
the amounts of discrimination, effectively generates discriminatory instances,
and localizes layers/neurons with significant biases.
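DICE quantifies fairness as the amount of protected information, in bits, that a model's decisions carry. Below is a minimal sketch of that idea, assuming a plain empirical mutual-information estimate rather than DICE's actual characterization; `model_predict`, `X`, and the protected-attribute column `PROT` are hypothetical placeholders.

```python
import numpy as np
from collections import Counter

def mutual_information_bits(protected, decisions):
    """Empirical mutual information I(A; Y_hat), in bits, between a
    protected attribute A and the model's decisions Y_hat."""
    n = len(protected)
    p_a = Counter(protected)
    p_y = Counter(decisions)
    p_ay = Counter(zip(protected, decisions))
    mi = 0.0
    for (a, y), count in p_ay.items():
        joint = count / n
        mi += joint * np.log2(joint / ((p_a[a] / n) * (p_y[y] / n)))
    return mi  # 0.0 bits means the decisions reveal nothing about A

# Hypothetical usage: X holds test inputs, column PROT is the protected
# attribute (e.g., sex), and model_predict returns discrete decisions.
# bits = mutual_information_bits(X[:, PROT].tolist(), model_predict(X).tolist())
# print(f"protected information used: {bits:.3f} bits")
```

Under this reading, a higher number of bits indicates a more severe fairness defect, which is what allows defects to be ordered for triage.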
Related papers
- NeuFair: Neural Network Fairness Repair with Dropout [19.49034966552718]
This paper investigates neuron dropout as a post-processing bias mitigation technique for deep neural networks (DNNs).
We show that our design of randomized algorithms is effective and efficient in improving fairness (up to 69%) with minimal or no model performance degradation.
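A rough illustration of the dropout-style post-processing idea above (not NeuFair's actual randomized algorithm): randomly mask hidden neurons of an already trained network and keep the mask that most reduces a fairness gap while roughly preserving accuracy. The `forward`, `fairness_gap`, and `accuracy` callables are hypothetical stand-ins.

```python
import numpy as np

def search_dropout_mask(forward, hidden_dim, X_val, y_val, a_val,
                        fairness_gap, accuracy, trials=200, drop_p=0.1, seed=0):
    """Randomized post-hoc search over neuron masks: keep the mask with the
    smallest fairness gap whose validation accuracy stays within 2% of the
    unmasked baseline. forward(X, mask) applies the trained network with the
    given hidden-neuron mask (all inputs here are hypothetical)."""
    rng = np.random.default_rng(seed)
    base_preds = forward(X_val, mask=np.ones(hidden_dim))
    base_acc = accuracy(base_preds, y_val)
    best_mask, best_gap = None, fairness_gap(base_preds, a_val)
    for _ in range(trials):
        mask = (rng.random(hidden_dim) > drop_p).astype(float)  # drop ~10% of neurons
        preds = forward(X_val, mask=mask)
        if accuracy(preds, y_val) >= base_acc - 0.02:
            gap = fairness_gap(preds, a_val)
            if gap < best_gap:
                best_mask, best_gap = mask, gap
    return best_mask, best_gap
```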
arXiv Detail & Related papers (2024-07-05T05:45:34Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
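As a hedged approximation of the "Adversarial Rate" idea (the paper relies on formal verification, not random sampling), one could measure the fraction of states for which a small perturbation changes the policy's chosen action; `policy` and `states` below are hypothetical.

```python
import numpy as np

def adversarial_rate(policy, states, epsilon=0.01, samples=50, seed=0):
    """Fraction of states for which a small L-infinity perturbation (found
    here by random sampling, not verification) changes the chosen action;
    policy maps a state vector to a discrete action."""
    rng = np.random.default_rng(seed)
    flipped = 0
    for s in states:
        base_action = policy(s)
        for _ in range(samples):
            noise = rng.uniform(-epsilon, epsilon, size=s.shape)
            if policy(s + noise) != base_action:
                flipped += 1
                break
    return flipped / len(states)
```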
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- MAPPING: Debiasing Graph Neural Networks for Fair Node Classification with Limited Sensitive Information Leakage [1.8238848494579714]
We propose a novel model-agnostic debiasing framework named MAPPING for fair node classification.
Our results show that MAPPING can achieve better trade-offs between utility and fairness, and privacy risks of sensitive information leakage.
arXiv Detail & Related papers (2024-01-23T14:59:46Z)
- FAIRER: Fairness as Decision Rationale Alignment [23.098752318439782]
Deep neural networks (DNNs) have made significant progress, but often suffer from fairness issues.
It is unclear how the trained network makes a fair prediction, which limits future fairness improvements.
We propose gradient-guided parity alignment, which encourages gradient-weighted consistency of neurons across subgroups.
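A loose sketch of what penalizing "gradient-weighted consistency of neurons across subgroups" could look like, assuming the gradient-weighted activations have already been computed by some training framework; this is not FAIRER's exact loss.

```python
import numpy as np

def parity_alignment_penalty(grad_weighted_acts, groups):
    """Squared L2 distance between subgroup means of gradient-weighted neuron
    activations. Rows are samples, columns are neurons; groups is a binary
    array marking subgroup membership (both hypothetical inputs)."""
    mean_g0 = grad_weighted_acts[groups == 0].mean(axis=0)
    mean_g1 = grad_weighted_acts[groups == 1].mean(axis=0)
    return float(np.sum((mean_g0 - mean_g1) ** 2))
```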
arXiv Detail & Related papers (2023-06-27T08:37:57Z)
- DeepVigor: Vulnerability Value Ranges and Factors for DNNs' Reliability Assessment [1.189955933770711]
Deep Neural Networks (DNNs) and their accelerators are being deployed more frequently in safety-critical applications.
We propose a novel accurate, fine-grain, metric-oriented, and accelerator-agnostic method called DeepVigor.
arXiv Detail & Related papers (2023-03-13T08:55:10Z)
- The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
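For intuition only, a brute-force version of counting unsafe inputs over a discretized unit-box input grid (the paper computes the exact count with verification techniques, not enumeration); `predict` and `property_holds` are hypothetical callables.

```python
import itertools
import numpy as np

def count_violations(predict, property_holds, dims, grid_per_dim=10):
    """Brute-force count of discretized input configurations whose outputs
    violate a safety property over the unit box [0, 1]^dims."""
    grid = np.linspace(0.0, 1.0, grid_per_dim)
    violations = 0
    for point in itertools.product(grid, repeat=dims):
        x = np.asarray(point)
        if not property_holds(x, predict(x)):
            violations += 1
    return violations
```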
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
- Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against such malicious attacks.
This work proposes a data selection strategy to be applied in the mini-batch training.
The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
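A minimal sketch of the entropy-raising step, assuming the overconfident inputs have already been identified by some other means: mix their predicted distributions with the label prior. The arrays and the mixing weight `alpha` are illustrative, not the paper's procedure.

```python
import numpy as np

def raise_entropy_toward_prior(probs, prior, overconfident_mask, alpha=0.5):
    """For inputs flagged as unjustifiably overconfident, mix the predicted
    class distribution with the label prior, typically raising its entropy
    toward that of the prior."""
    probs = probs.copy()
    probs[overconfident_mask] = (
        (1.0 - alpha) * probs[overconfident_mask] + alpha * prior
    )
    return probs / probs.sum(axis=1, keepdims=True)
```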
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Accelerating Robustness Verification of Deep Neural Networks Guided by Target Labels [8.9960048245668]
Deep Neural Networks (DNNs) have become key components of many safety-critical applications such as autonomous driving and medical diagnosis.
DNNs suffer from poor robustness because of their susceptibility to adversarial examples such that small perturbations to an input result in misprediction.
We propose a novel approach that can accelerate the robustness verification techniques by guiding the verification with target labels.
arXiv Detail & Related papers (2020-07-16T00:51:52Z)
- Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning [61.93730166203915]
We argue that traditional notions of fairness are not sufficient when the model is vulnerable to adversarial attacks.
We show that measuring robustness bias is a challenging task for DNNs and propose two methods to measure this form of bias.
arXiv Detail & Related papers (2020-06-17T22:22:24Z)
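One crude way to operationalize the "robustness bias" named in the last entry, sketched under the assumption that per-sample robust-accuracy indicators and group labels are already available; the paper proposes its own two measurement methods, which this does not reproduce.

```python
import numpy as np

def robustness_disparity(robust_correct, groups):
    """Gap in adversarial (robust) accuracy between two demographic groups;
    robust_correct[i] is 1 if sample i is still classified correctly under
    the chosen perturbation budget (both arrays are hypothetical inputs)."""
    acc_g0 = robust_correct[groups == 0].mean()
    acc_g1 = robust_correct[groups == 1].mean()
    return float(abs(acc_g0 - acc_g1))
```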