MAFT: Efficient Model-Agnostic Fairness Testing for Deep Neural Networks via Zero-Order Gradient Search
- URL: http://arxiv.org/abs/2412.20086v1
- Date: Sat, 28 Dec 2024 09:07:06 GMT
- Authors: Zhaohui Wang, Min Zhang, Jingran Yang, Bojie Shao, Min Zhang
- Abstract summary: We propose a novel black-box individual fairness testing method called Model-Agnostic Fairness Testing (MAFT).
We demonstrate that MAFT achieves the same effectiveness as state-of-the-art white-box methods whilst improving the applicability to large-scale networks.
- Score: 20.48306648223519
- Abstract: Deep neural networks (DNNs) have shown powerful performance in various applications and are increasingly being used in decision-making systems. However, concerns about fairness in DNNs persist. Several efficient white-box fairness testing methods for individual fairness have been proposed. Nevertheless, the development of black-box methods has stagnated, and the performance of existing methods falls far behind that of white-box methods. In this paper, we propose a novel black-box individual fairness testing method called Model-Agnostic Fairness Testing (MAFT). By leveraging MAFT, practitioners can effectively identify and address discrimination in DL models, regardless of the specific algorithm or architecture employed. Our approach adopts lightweight procedures such as gradient estimation and attribute perturbation rather than non-trivial procedures like symbolic execution, rendering it significantly more scalable and applicable than existing methods. We demonstrate that MAFT achieves the same effectiveness as state-of-the-art white-box methods whilst improving the applicability to large-scale networks. Compared to existing black-box approaches, our approach demonstrates distinguished performance in discovering fairness violations with respect to effectiveness (approximately 14.69 times) and efficiency (approximately 32.58 times).
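The abstract's core recipe, estimating gradients of a black-box model by finite differences and perturbing non-protected attributes to surface individual discrimination, can be sketched as follows. This is a minimal illustration of zero-order gradient search, not the MAFT algorithm itself; the toy `predict` model, the attribute indices, and all step sizes are assumptions for demonstration.

```python
import numpy as np

def predict(x):
    # Stand-in black-box scorer: any callable returning a scalar in (0, 1)
    # would do; the weights here are purely illustrative.
    w = np.array([0.8, -0.5, 0.3, 1.2])
    return float(1 / (1 + np.exp(-x @ w)))

def estimate_gradient(f, x, eps=1e-3):
    """Central-difference (zero-order) gradient estimate: queries only."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

def is_discriminatory(x, protected_idx, threshold=0.5):
    """Individual fairness check: flip only the (binary) protected
    attribute and see whether the model's decision changes."""
    x_flipped = x.copy()
    x_flipped[protected_idx] = 1.0 - x_flipped[protected_idx]
    return (predict(x) > threshold) != (predict(x_flipped) > threshold)

def search(x0, protected_idx, steps=50, lr=0.1, threshold=0.5):
    """Perturb non-protected attributes along the estimated gradient,
    pushing the score toward the decision boundary, where flipping the
    protected attribute is most likely to change the decision."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        if is_discriminatory(x, protected_idx, threshold):
            return x
        g = estimate_gradient(predict, x)
        g[protected_idx] = 0.0  # never perturb the protected attribute
        x -= lr * np.sign(predict(x) - threshold) * np.sign(g)
    return None

found = search(np.array([0.2, 0.9, 0.1, 0.4]), protected_idx=3)
```

Because the gradient is estimated purely from input-output queries, the same loop works against any model, which is the sense in which such a method is model-agnostic.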
Related papers
- Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement [69.51496713076253]
In this paper, we focus on the aforementioned efficiency aspects of existing MTL methods.
We first carry out large-scale experiments of the methods with smaller backbones and on the MetaGraspNet dataset as a new test ground.
We also propose Feature Disentanglement measure as a novel and efficient identifier of the challenges in MTL.
arXiv Detail & Related papers (2024-02-05T22:15:55Z) - MFABA: A More Faithful and Accelerated Boundary-based Attribution Method
for Deep Neural Networks [69.28125286491502]
We introduce MFABA, an attribution algorithm that adheres to axioms.
Results demonstrate its superiority by achieving over 101.5142 times faster speed than the state-of-the-art attribution algorithms.
arXiv Detail & Related papers (2023-12-21T07:48:15Z) - Saliency strikes back: How filtering out high frequencies improves white-box explanations [15.328499301244708]
"White-box" methods rely on a gradient signal that is often contaminated by high-frequency artifacts.
We introduce a new approach called "FORGrad" to overcome this limitation.
Our findings show that FORGrad consistently enhances the performance of already existing white-box methods.
arXiv Detail & Related papers (2023-07-18T19:56:20Z) - Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z) - Adaptive Fairness Improvement Based on Causality Analysis [5.827653543633839]
Given a discriminating neural network, the problem of fairness improvement is to systematically reduce discrimination without significantly sacrificing its performance.
We propose an approach which adaptively chooses the fairness improving method based on causality analysis.
Our approach is effective (i.e., it always identifies the best fairness improving method) and efficient (i.e., with an average time overhead of 5 minutes).
arXiv Detail & Related papers (2022-09-15T10:05:31Z) - Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z) - Automatic Fairness Testing of Neural Classifiers through Adversarial
Sampling [8.2868128804393]
We propose a scalable and effective approach for systematically searching for discriminative samples.
Compared with state-of-the-art methods, our approach only employs lightweight procedures like gradient computation and clustering.
The retrained models reduce discrimination by 57.2% and 60.2% respectively on average.
arXiv Detail & Related papers (2021-07-17T03:47:08Z) - Black-box Adversarial Sample Generation Based on Differential Evolution [18.82850158275813]
We propose a black-box technique to test the robustness of Deep Neural Networks (DNNs).
The technique does not require any knowledge of the structure or weights of the target DNN.
Experimental results show that our technique can achieve 100% success in generating adversarial samples to trigger misclassification.
arXiv Detail & Related papers (2020-07-30T08:43:45Z)
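The last entry's approach, evolving input perturbations with differential evolution until the model's label flips, needing no structure or weights, can be sketched as below. This is a hedged illustration of the general technique, not the cited paper's implementation; the toy linear `model`, the DE/rand/1/bin variant (simplified: the base vectors may include the current individual), and every hyperparameter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))          # stand-in "black box": 4 features, 3 classes

def model(x):
    return int(np.argmax(x @ W))     # label-only queries, no gradients

def margin(x, delta, true_label):
    """Fitness (lower is better): true-label score minus best rival score."""
    scores = (x + delta) @ W
    return scores[true_label] - np.max(np.delete(scores, true_label))

def de_attack(x, pop_size=20, gens=100, F=0.5, CR=0.9, bound=0.5):
    """Evolve bounded perturbations until the predicted label changes."""
    true_label = model(x)
    dim = len(x)
    pop = rng.uniform(-bound, bound, size=(pop_size, dim))
    for _ in range(gens):
        for i in range(pop_size):
            # Mutation: combine three random population members.
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), -bound, bound)
            # Binomial crossover between mutant and current individual.
            cross = rng.random(dim) < CR
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep whichever shrinks the margin more.
            if margin(x, trial, true_label) < margin(x, pop[i], true_label):
                pop[i] = trial
            if model(x + pop[i]) != true_label:
                return x + pop[i]    # misclassification triggered
    return None

x0 = np.zeros(4)                     # toy input sitting near the boundary
adv = de_attack(x0)
```

The fitness function only compares model scores, so the search stays fully black-box, which is what lets such techniques claim independence from the target DNN's structure and weights.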
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.