MMD-B-Fair: Learning Fair Representations with Statistical Testing
- URL: http://arxiv.org/abs/2211.07907v3
- Date: Tue, 25 Apr 2023 11:56:09 GMT
- Title: MMD-B-Fair: Learning Fair Representations with Statistical Testing
- Authors: Namrata Deka and Danica J. Sutherland
- Abstract summary: We introduce a method, MMD-B-Fair, to learn fair representations of data via kernel two-sample testing.
We find neural features of our data where a maximum mean discrepancy (MMD) test cannot distinguish between representations of different sensitive groups, while preserving information about the target attributes.
- Score: 4.669892068997491
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a method, MMD-B-Fair, to learn fair representations of data via
kernel two-sample testing. We find neural features of our data where a maximum
mean discrepancy (MMD) test cannot distinguish between representations of
different sensitive groups, while preserving information about the target
attributes. Minimizing the power of an MMD test is more difficult than
maximizing it (as done in previous work), because the test threshold's complex
behavior cannot be simply ignored. Our method exploits the simple asymptotics
of block testing schemes to efficiently find fair representations without
requiring complex adversarial optimization or generative modelling schemes
widely used by existing work on fair representation learning. We evaluate our
approach on various datasets, showing its ability to "hide" information about
sensitive attributes, and its effectiveness in downstream transfer tasks.
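The abstract's core idea, kernel two-sample testing with block-based asymptotics, can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the RBF kernel, bandwidth, and block size below are placeholder assumptions, and the block statistic is shown only to convey why block (B-test) schemes yield simple asymptotics that a fairness objective can exploit.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # pairwise RBF kernel values exp(-gamma * ||x - y||^2)
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def mmd2_unbiased(X, Y, gamma=1.0):
    # unbiased estimate of squared MMD between samples X and Y
    Kxx = rbf_kernel(X, X, gamma)
    Kyy = rbf_kernel(Y, Y, gamma)
    Kxy = rbf_kernel(X, Y, gamma)
    n, m = len(X), len(Y)
    np.fill_diagonal(Kxx, 0)  # drop i == j terms for unbiasedness
    np.fill_diagonal(Kyy, 0)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2 * Kxy.mean()

def block_mmd2(X, Y, block_size=32, gamma=1.0):
    # B-test statistic: average the unbiased MMD^2 over disjoint blocks.
    # The average of i.i.d. block estimates is asymptotically normal,
    # which gives a test threshold with simple, differentiable behavior.
    n = min(len(X), len(Y))
    vals = [mmd2_unbiased(X[i:i + block_size], Y[i:i + block_size], gamma)
            for i in range(0, n - block_size + 1, block_size)]
    return float(np.mean(vals))
```

A fair-representation objective in the paper's spirit would push the block statistic between sensitive groups toward zero (low test power) while a task loss preserves target information; the statistic above is only the testing ingredient.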
Related papers
- A Unified Data Representation Learning for Non-parametric Two-sample Testing [50.27067977793069]
We propose a representation-learning two-sample testing (RL-TST) framework. RL-TST first performs purely self-supervised representation learning on the entire dataset. A discriminative model is then trained on these IRs to learn discriminative representations (DRs).
arXiv Detail & Related papers (2024-11-30T23:23:52Z)
- Dissecting Misalignment of Multimodal Large Language Models via Influence Function [12.832792175138241]
We introduce the Extended Influence Function for Contrastive Loss (ECIF), an influence function crafted for contrastive loss.
ECIF considers both positive and negative samples and provides a closed-form approximation of contrastive learning models.
Building upon ECIF, we develop a series of algorithms for data evaluation in MLLM, misalignment detection, and misprediction trace-back tasks.
arXiv Detail & Related papers (2024-11-18T15:45:41Z) - Unsupervised Transfer Learning via Adversarial Contrastive Training [3.227277661633986]
We propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT).
Our experimental results demonstrate outstanding classification accuracy with both fine-tuned linear probe and K-NN protocol across various datasets.
arXiv Detail & Related papers (2024-08-16T05:11:52Z) - Provable Optimization for Adversarial Fair Self-supervised Contrastive Learning [49.417414031031264]
This paper studies learning fair encoders in a self-supervised learning setting.
All data are unlabeled and only a small portion of them are annotated with sensitive attributes.
arXiv Detail & Related papers (2024-06-09T08:11:12Z) - Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement with the predicted label.
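The least disagree metric can be sketched with a hypothetical Monte Carlo estimate: perturb a classifier's parameters and count how often the predicted label flips. The linear classifier, Gaussian perturbation, and noise scale below are illustrative assumptions, not the paper's estimator; the point is only that samples near the decision boundary flip more easily.

```python
import numpy as np

def disagreement_probability(w, x, noise_scale=0.1, n_draws=200, seed=0):
    # hypothetical LDM-style estimate: fraction of randomly perturbed
    # linear classifiers whose prediction on x disagrees with the
    # unperturbed classifier's prediction
    rng = np.random.default_rng(seed)
    base = np.sign(w @ x)
    flips = 0
    for _ in range(n_draws):
        w_p = w + noise_scale * rng.normal(size=w.shape)
        flips += np.sign(w_p @ x) != base
    return flips / n_draws
```

A point close to the boundary yields a high disagreement probability (informative for active learning); a point far from it yields a probability near zero.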
arXiv Detail & Related papers (2024-01-18T08:12:23Z) - Mutual Information Regularization for Weakly-supervised RGB-D Salient
Object Detection [33.210575826086654]
We present a weakly-supervised RGB-D salient object detection model.
We focus on effective multimodal representation learning via inter-modal mutual information regularization.
arXiv Detail & Related papers (2023-06-06T12:36:57Z) - Exploring the Boundaries of Semi-Supervised Facial Expression Recognition using In-Distribution, Out-of-Distribution, and Unconstrained Data [23.4909421082857]
We present a study on 11 of the most recent semi-supervised methods, in the context of facial expression recognition (FER).
Our investigation covers semi-supervised learning from in-distribution, out-of-distribution, unconstrained, and very small unlabelled data.
With an equal number of labelled samples, semi-supervised learning delivers a considerable improvement over supervised learning.
arXiv Detail & Related papers (2023-06-02T01:40:08Z) - Agree to Disagree: Diversity through Disagreement for Better
Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data while pushing them towards disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
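The disagreement half of such an objective can be sketched for a pair of binary classifiers: on out-of-distribution inputs, minimise the negative log-probability that the two models predict different classes. This is a minimal sketch in the spirit of D-BAT, not its exact loss; the binary setting and the `eps` stabiliser are simplifying assumptions.

```python
import numpy as np

def disagreement_loss(p1, p2, eps=1e-8):
    # p1, p2: each model's predicted probability of the positive class.
    # The probability the two models disagree is p1*(1-p2) + (1-p1)*p2;
    # minimising its negative log pushes the pair toward disagreement.
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    return float(np.mean(-np.log(p1 * (1 - p2) + (1 - p1) * p2 + eps)))
```

In a full training loop this term would apply only to unlabeled out-of-distribution data, alongside standard supervised losses that keep both models accurate on the training distribution.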
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z) - Improving LIME Robustness with Smarter Locality Sampling [0.0]
We propose to make LIME more robust by training a generative adversarial network to sample more realistic synthetic data.
Our experiments demonstrate an increase in accuracy across three real-world datasets in detecting biased, adversarial behavior.
This is achieved while maintaining comparable explanation quality, reaching up to 99.94% top-1 accuracy in some cases.
arXiv Detail & Related papers (2020-06-22T14:36:08Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.