Fairness Testing of Deep Image Classification with Adequacy Metrics
- URL: http://arxiv.org/abs/2111.08856v1
- Date: Wed, 17 Nov 2021 01:30:13 GMT
- Title: Fairness Testing of Deep Image Classification with Adequacy Metrics
- Authors: Peixin Zhang, Jingyi Wang, Jun Sun, Xinyu Wang
- Abstract summary: DeepFAIT is a systematic fairness testing framework specifically designed for deep image classification applications.
We have conducted experiments on widely adopted large-scale face recognition applications, i.e., VGGFace and FairFace.
- Score: 6.559515085944965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep image classification applications, e.g., face recognition, become
increasingly prevalent in our daily lives, their fairness issues raise more and
more concern. It is thus crucial to comprehensively test the fairness of these
applications before deployment. Existing fairness testing methods suffer from
the following limitations: 1) applicability, i.e., they are only applicable to
structured data or text and cannot handle the high-dimensional, abstract
domain sampling at the semantic level that image classification applications require; 2)
functionality, i.e., they generate unfair samples without providing a testing
criterion to characterize the model's fairness adequacy. To fill the gap, we
propose DeepFAIT, a systematic fairness testing framework specifically designed
for deep image classification applications. DeepFAIT consists of several
important components enabling effective fairness testing of deep image
classification applications: 1) a neuron selection strategy to identify the
fairness-related neurons; 2) a set of multi-granularity adequacy metrics to
evaluate the model's fairness; 3) a test selection algorithm for fixing the
fairness issues efficiently. We have conducted experiments on widely adopted
large-scale face recognition applications, i.e., VGGFace and FairFace. The
experimental results confirm that our approach can effectively identify the
fairness-related neurons, characterize the model's fairness, and select the
most valuable test cases to mitigate the model's fairness issues.
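Component (1) above can be illustrated with a minimal sketch: rank neurons by how differently they activate, on average, across two sensitive groups, and keep the top-k. This is an illustrative stand-in, not DeepFAIT's actual selection strategy; `select_fairness_neurons` and its arguments are hypothetical names.

```python
def select_fairness_neurons(acts_a, acts_b, k=3):
    """Rank neurons by the gap between their mean activations on two
    sensitive groups; return the indices of the top-k gaps.

    acts_a, acts_b: lists of per-sample activation vectors, one list
    per sensitive group (same number of neurons per vector).
    """
    def col_means(rows):
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    gaps = [abs(x - y) for x, y in zip(col_means(acts_a), col_means(acts_b))]
    # Indices of neurons, largest activation gap first.
    return sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)[:k]

# Toy example: only neuron 1 reacts differently across the two groups.
print(select_fairness_neurons(
    [[0.1, 0.9, 0.5], [0.2, 0.8, 0.4]],
    [[0.1, 0.1, 0.5], [0.2, 0.2, 0.4]],
    k=1))  # [1]
```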
Related papers
- Few-Shot Anomaly Detection via Category-Agnostic Registration Learning [65.64252994254268]
Most existing anomaly detection methods require a dedicated model for each category.
This paper proposes a novel few-shot anomaly detection framework.
It is the first FSAD method that requires no model fine-tuning for novel categories.
arXiv Detail & Related papers (2024-06-13T05:01:13Z)
- DP-IQA: Utilizing Diffusion Prior for Blind Image Quality Assessment in the Wild [54.139923409101044]
We propose a novel IQA method called diffusion priors-based IQA (DP-IQA)
We use pre-trained stable diffusion as the backbone, extract multi-level features from the denoising U-Net, and decode them to estimate the image quality score.
We distill the knowledge in the above model into a CNN-based student model, significantly reducing the parameter count to enhance applicability.
arXiv Detail & Related papers (2024-05-30T12:32:35Z)
- Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z)
- Benchmarking the Fairness of Image Upsampling Methods [29.01986714656294]
We develop a set of metrics for performance and fairness of conditional generative models.
We benchmark their imbalances and diversity.
As part of the study, a subset of the datasets replicates the racial distribution of common large-scale face datasets.
arXiv Detail & Related papers (2024-01-24T16:13:26Z)
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774]
We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP).
We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy.
Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval and image captioning.
arXiv Detail & Related papers (2023-10-18T10:32:39Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria - group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification [14.589159162086926]
We propose FairAdaBN, which makes batch normalization adaptive to sensitive attributes.
We propose a new metric, named Fairness-Accuracy Trade-off Efficiency (FATE), to compute normalized fairness improvement over accuracy drop.
Experiments on two dermatological datasets show that our proposed method outperforms other methods on fairness criteria and FATE.
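Reading the FATE description literally - normalized fairness improvement divided by normalized accuracy drop - gives the following sketch. The exact normalization in the paper may differ, and `fate` with these arguments is a hypothetical helper, not the authors' code.

```python
def fate(fair_base, fair_new, acc_base, acc_new):
    """Fairness-Accuracy Trade-off Efficiency (illustrative sketch).

    Relative improvement of a fairness criterion (lower = fairer)
    divided by the relative accuracy drop it costs. Higher is better.
    """
    fairness_gain = (fair_base - fair_new) / fair_base
    accuracy_drop = (acc_base - acc_new) / acc_base
    return fairness_gain / accuracy_drop

# A method that halves the unfairness at a 2% relative accuracy cost
# scores roughly 0.5 / 0.02 = 25.
print(fate(0.10, 0.05, 0.90, 0.882))
```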
arXiv Detail & Related papers (2023-03-15T02:22:07Z)
- Enhancing Fairness of Visual Attribute Predictors [6.6424782986402615]
We introduce fairness-aware regularization losses based on batch estimates of Demographic Parity, Equalized Odds, and a novel Intersection-over-Union measure.
Our work is the first attempt to incorporate these types of losses in an end-to-end training scheme for mitigating biases of visual attribute predictors.
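The kind of batch estimate such losses build on can be sketched as follows: the Demographic Parity gap is the absolute difference in mean predicted positive rate between two sensitive groups, which is differentiable and can be added to the task loss. This is an illustrative regularizer term, not the authors' exact loss; `demographic_parity_gap` is a hypothetical name.

```python
def demographic_parity_gap(probs, groups):
    """Batch estimate of the Demographic Parity gap: the absolute
    difference in mean predicted positive probability between the
    samples of group 0 and group 1.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    p0 = mean([p for p, g in zip(probs, groups) if g == 0])
    p1 = mean([p for p, g in zip(probs, groups) if g == 1])
    return abs(p0 - p1)

# Group 0 receives much higher scores than group 1 in this batch,
# so the gap is large (0.85 vs 0.15).
print(demographic_parity_gap([0.9, 0.8, 0.2, 0.1], [0, 0, 1, 1]))
```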
arXiv Detail & Related papers (2022-07-07T15:02:04Z)
- Technical Challenges for Training Fair Neural Networks [62.466658247995404]
We conduct experiments on both facial recognition and automated medical diagnosis datasets using state-of-the-art architectures.
We observe that large models overfit to fairness objectives, and produce a range of unintended and undesirable consequences.
arXiv Detail & Related papers (2021-02-12T20:36:45Z)
- Contraction Mapping of Feature Norms for Classifier Learning on the Data with Different Quality [5.47982638565422]
We propose a contraction mapping function to compress the range of feature norms of training images according to their quality.
Experiments on various classification applications, including handwritten digit recognition, lung nodule classification, face verification and face recognition, demonstrate that the proposed approach is promising.
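A toy mapping in this spirit - rescaling each feature vector so its L2 norm falls in a quality-dependent range - might look like the following. The function, its parameters, and the linear quality-to-norm mapping are illustrative assumptions, not the paper's contraction function.

```python
import math

def contract_feature_norm(feat, quality, low=1.0, high=10.0):
    """Rescale a feature vector so its L2 norm lies in [low, high],
    with higher-quality inputs mapped to larger norms.

    quality is assumed to be a score in [0, 1].
    """
    target = low + quality * (high - low)
    norm = math.sqrt(sum(x * x for x in feat))
    return [x * target / norm for x in feat]

# A mid-quality sample ([3, 4] has norm 5) is rescaled to norm 5.5.
out = contract_feature_norm([3.0, 4.0], quality=0.5)
print(out)
```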
arXiv Detail & Related papers (2020-07-27T09:53:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.