Related papers: Contexts Matter: An Empirical Study on Contextual Influence in Fairness Testing for Deep Learning Systems

URL: http://arxiv.org/abs/2408.06102v1
Date: Mon, 12 Aug 2024 12:36:06 GMT
Title: Contexts Matter: An Empirical Study on Contextual Influence in Fairness Testing for Deep Learning Systems
Authors: Chengwen Du, Tao Chen
Abstract summary: We aim to understand how varying contexts affect fairness testing outcomes. Our results show that different context types and settings generally lead to a significant impact on the testing.
Score: 3.077531983369872
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Background: Fairness testing for deep learning systems has become increasingly important. However, much work assumes perfect context and conditions from the other parts: well-tuned hyperparameters for accuracy, rectified bias in data, and mitigated bias in the labeling. Yet, these are often difficult to achieve in practice due to their resource-/labour-intensive nature. Aims: In this paper, we aim to understand how varying contexts affect fairness testing outcomes. Method: We conduct an extensive empirical study, which covers $10,800$ cases, to investigate how contexts can change the fairness testing result at the model level against the existing assumptions. We also study why the outcomes were observed through the lens of correlation/fitness landscape analysis. Results: Our results show that different context types and settings generally lead to a significant impact on the testing, which is mainly caused by the shifts of the fitness landscape under varying contexts. Conclusions: Our findings provide key insights for practitioners to evaluate the test generators and hint at future research directions.

Related papers

Fairness Evaluation with Item Response Theory [10.871079276188649]
This paper proposes a novel Fair-IRT framework to evaluate fairness in Machine Learning (ML) models. Detailed explanations for item characteristic curves (ICCs) are provided for particular individuals. Experiments demonstrate the effectiveness of this framework as a fairness evaluation tool.
arXiv Detail & Related papers (2024-10-20T22:25:20Z)
Rethinking Fair Representation Learning for Performance-Sensitive Tasks [19.40265690963578]
We use causal reasoning to define and formalise different sources of dataset bias. We run experiments across a range of medical modalities to examine the performance of fair representation learning under distribution shifts.
arXiv Detail & Related papers (2024-10-05T11:01:16Z)
Most Influential Subset Selection: Challenges, Promises, and Beyond [9.479235005673683]
We study the Most Influential Subset Selection (MISS) problem, which aims to identify a subset of training samples with the greatest collective influence. We conduct a comprehensive analysis of the prevailing approaches in MISS, elucidating their strengths and weaknesses. We demonstrate that an adaptive version of these approaches, which applies them iteratively, can effectively capture the interactions among samples.
arXiv Detail & Related papers (2024-09-25T20:00:23Z)
Practical Guide for Causal Pathways and Sub-group Disparity Analysis [1.8974791957167259]
We use causal disparity analysis to quantify and examine the causal interplay between sensitive attributes and outcomes. Our two-step investigation focuses on datasets where race serves as the sensitive attribute. We demonstrate that the sub-groups identified by our approach to be affected the most by disparities are the ones with the largest ML classification errors.
arXiv Detail & Related papers (2024-07-02T22:51:01Z)
Fairness-guided Few-shot Prompting for Large Language Models [93.05624064699965]
In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes. We propose a novel search strategy based on greedy search to identify the near-optimal prompt for improving the performance of in-context learning.
arXiv Detail & Related papers (2023-03-23T12:28:25Z)
Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem. We examine the performance of various debiasing methods across multiple tasks. We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
Conditional Supervised Contrastive Learning for Fair Text Classification [59.813422435604025]
We study learning fair representations that satisfy a notion of fairness known as equalized odds for text classification via contrastive learning. Specifically, we first theoretically analyze the connections between learning representations with a fairness constraint and conditional supervised contrastive objectives.
arXiv Detail & Related papers (2022-05-23T17:38:30Z)
SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data. We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious. We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness. We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability. Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.