De-biased Representation Learning for Fairness with Unreliable Labels
- URL: http://arxiv.org/abs/2208.00651v1
- Date: Mon, 1 Aug 2022 07:16:40 GMT
- Title: De-biased Representation Learning for Fairness with Unreliable Labels
- Authors: Yixuan Zhang, Feng Zhou, Zhidong Li, Yang Wang, Fang Chen
- Abstract summary: We propose a De-Biased Representation Learning for Fairness (DBRF) framework.
We formulate the de-biased learning framework through information-theoretic concepts such as mutual information and information bottleneck.
Experiment results over both synthetic and real-world data demonstrate that DBRF effectively learns de-biased representations towards ideal labels.
- Score: 22.794504690957414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Removing bias while keeping all task-relevant information is challenging for fair representation learning methods, since they would yield random or degenerate representations w.r.t. labels when the sensitive attributes correlate with labels. Existing works propose injecting label information into the learning procedure to overcome this issue. However, the assumption that the observed labels are clean is not always met; in fact, label bias is acknowledged as a primary source of discrimination. In other words, fair pre-processing methods ignore the discrimination encoded in the labels, either during the learning procedure or at the evaluation stage. This contradiction calls the fairness of the learned representations into question. To circumvent this issue, we explore the following question: can we learn fair representations predictive of latent ideal fair labels given access only to unreliable labels? In this work, we propose a De-Biased Representation Learning for Fairness (DBRF) framework that disentangles the sensitive information from the non-sensitive attributes while keeping the learned representations predictive of the ideal fair labels rather than the observed biased ones. We formulate the de-biased learning framework through information-theoretic concepts such as mutual information and the information bottleneck. The core idea is that DBRF does not use the unreliable labels for supervision when sensitive information benefits their prediction. Experiment results on both synthetic and real-world data demonstrate that DBRF effectively learns de-biased representations towards ideal labels.
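The abstract names the ingredients of the objective (mutual information, the information bottleneck, and supervision gated on whether sensitive information helps predict the observed labels) but not its exact form. Below is a minimal, hypothetical sketch of how such a gated objective could be wired together; `binary_mi`, `dbrf_style_loss`, and the weight `lam` are illustrative names, not the paper's.
```python
# Hypothetical sketch of a DBRF-style objective (NOT the paper's exact loss):
# supervise with the observed label only when the sensitive attribute s is
# uninformative about it, and always penalise I(representation; s).
import numpy as np

def binary_mi(a, b):
    """Plug-in estimate of mutual information I(a; b) for binary arrays."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def dbrf_style_loss(z_logits, y_obs, s, lam=1.0):
    """Gated cross-entropy on observed labels plus an I(z; s) penalty."""
    p = 1.0 / (1.0 + np.exp(-z_logits))                 # predicted P(y=1 | z)
    ce = -np.mean(y_obs * np.log(p + 1e-8) + (1 - y_obs) * np.log(1 - p + 1e-8))
    gate = np.exp(-binary_mi(y_obs, s))                 # ~1 when s does not predict y_obs
    z_hard = (p > 0.5).astype(int)                      # crude discretised representation
    return gate * ce + lam * binary_mi(z_hard, s)
```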
Related papers
- You can't handle the (dirty) truth: Data-centric insights improve pseudo-labeling [60.27812493442062]
We show the importance of investigating labeled data quality to improve any pseudo-labeling method.
Specifically, we introduce a novel data characterization and selection framework called DIPS to extend pseudo-labeling.
We demonstrate the applicability and impact of DIPS for various pseudo-labeling methods across an extensive range of real-world datasets.
arXiv Detail & Related papers (2024-06-19T17:58:40Z)
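DIPS characterizes labeled and pseudo-labeled samples before selecting which to trust. A minimal sketch of one plausible selection rule, assuming per-checkpoint confidence scores are available; the thresholds and the mean/variability criterion are assumptions, not the paper's exact metrics.
```python
# Illustrative selection of pseudo-labeled samples from training dynamics:
# keep samples the model learns consistently (high mean confidence, low
# variability across checkpoints). Thresholds are assumptions.
import numpy as np

def select_confident_samples(probs_over_epochs, tau_conf=0.8, tau_var=0.1):
    """probs_over_epochs: (n_checkpoints, n_samples) probability assigned to
    each sample's (pseudo-)label at each checkpoint."""
    conf = probs_over_epochs.mean(axis=0)
    var = probs_over_epochs.std(axis=0)
    return np.where((conf >= tau_conf) & (var <= tau_var))[0]
```
- Mitigating Label Bias via Decoupled Confident Learning [14.001915875687862]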
Growing concerns regarding algorithmic fairness have led to a surge in methodologies to mitigate algorithmic bias.
However, bias in labels is pervasive across important domains, including healthcare, hiring, and content moderation.
We propose a pruning method -- Decoupled Confident Learning (DeCoLe) -- specifically designed to mitigate label bias.
arXiv Detail & Related papers (2023-07-18T03:28:03Z)
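A simplified sketch of what group-decoupled, confident-learning-style pruning could look like; the per-group, per-class thresholding below is a common confident-learning heuristic and not necessarily DeCoLe's exact rule.
```python
# Simplified group-decoupled pruning: estimate a per-group, per-class
# confidence threshold from out-of-sample predictions and flag samples whose
# observed label scores below it.
import numpy as np

def decoupled_prune(pred_probs, y_obs, group):
    """pred_probs: (n, n_classes) out-of-sample predicted probabilities;
    y_obs: observed labels; group: group id per sample.
    Returns indices of samples suspected of label bias."""
    suspect = []
    for g in np.unique(group):
        idx = np.where(group == g)[0]
        for c in np.unique(y_obs[idx]):
            cls = idx[y_obs[idx] == c]
            threshold = pred_probs[cls, c].mean()   # group- and class-specific
            suspect.extend(cls[pred_probs[cls, c] < threshold])
    return np.array(sorted(suspect))
```
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]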
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels to unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
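As a rough illustration of instance-wise weighting, the sketch below derives pseudo-labels and weights from an auxiliary classifier's confidence; the paper's actual weighting scheme for adversarial training may differ.
```python
# Illustrative instance-wise weighting: pseudo-label each unlabeled sample
# with an auxiliary classifier and weight it by the classifier's confidence,
# so dubious samples contribute less to the adversarial loss.
import numpy as np

def soft_curriculum_weights(class_probs):
    """class_probs: (n, n_classes) auxiliary-classifier predictions."""
    pseudo_labels = class_probs.argmax(axis=1)
    weights = class_probs.max(axis=1)              # confidence in [1/n_classes, 1]
    return pseudo_labels, weights

def weighted_mean_loss(per_sample_loss, weights):
    return np.sum(weights * per_sample_loss) / (np.sum(weights) + 1e-8)
```
- Fairness and Bias in Truth Discovery Algorithms: An Experimental Analysis [7.575734557466221]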
Crowd workers sometimes provide unreliable labels.
Truth discovery (TD) algorithms are applied to determine the consensus labels from conflicting worker responses.
We conduct a systematic study of the bias and fairness of TD algorithms.
arXiv Detail & Related papers (2023-04-25T04:56:35Z)
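For context, a standard iterative truth-discovery loop (generic, not specific to this paper) alternates between reliability-weighted voting and worker-reliability updates:
```python
# Alternate between (1) consensus labels from reliability-weighted votes and
# (2) worker reliabilities from agreement with the consensus.
import numpy as np

def truth_discovery(votes, n_iters=20):
    """votes: (n_workers, n_items) matrix of binary labels."""
    reliability = np.full(votes.shape[0], 0.5)
    consensus = np.zeros(votes.shape[1], dtype=int)
    for _ in range(n_iters):
        scores = reliability @ votes / (reliability.sum() + 1e-8)
        consensus = (scores > 0.5).astype(int)          # weighted majority vote
        reliability = (votes == consensus).mean(axis=1) # agreement with consensus
    return consensus, reliability
```
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]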
We propose a label distribution perspective for PU learning in this paper.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
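One minimal way to encode label-distribution consistency, assuming a known positive-class prior `pi`; this is an illustrative loss, not the paper's exact objective.
```python
# Fit the labeled positives and push the mean predicted positive probability
# on unlabeled data toward the assumed-known class prior pi.
import numpy as np

def dist_pu_style_loss(p_labeled_pos, p_unlabeled, pi=0.3, lam=1.0):
    pos_term = -np.mean(np.log(p_labeled_pos + 1e-8))   # fit to labeled positives
    dist_term = (np.mean(p_unlabeled) - pi) ** 2        # distribution consistency
    return pos_term + lam * dist_term
```
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]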
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
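A conceptual sketch of the decoupling: a generation head updated only on labeled data emits pseudo-labels, while a separate utilisation head trains on labeled plus pseudo-labeled data, so pseudo-label errors do not reinforce their own generator. The linear heads and update rule below are illustrative assumptions.
```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def grad_ce(W, feats, labels):
    """Gradient of mean cross-entropy for a linear head W: (d, n_classes)."""
    p = softmax(feats @ W)
    p[np.arange(len(labels)), labels] -= 1.0
    return feats.T @ p / len(labels)

def train_step(W_gen, W_use, f_lab, y_lab, f_unlab, lr=0.1):
    pseudo = (f_unlab @ W_gen).argmax(axis=1)           # pseudo-labels: generation head
    W_gen = W_gen - lr * grad_ce(W_gen, f_lab, y_lab)   # updated on labeled data only
    W_use = W_use - lr * grad_ce(W_use, np.vstack([f_lab, f_unlab]),
                                 np.concatenate([y_lab, pseudo]))
    return W_gen, W_use
```
- Bias-Tolerant Fair Classification [20.973916494320246]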
Label bias and selection bias are two sources of bias in data that hinder the fairness of machine-learning outcomes.
We propose a Bias-Tolerant FAir Regularized Loss (B-FARL), which tries to regain the benefits lost to label bias and selection bias from the affected data.
B-FARL takes the biased data as input and learns a model that approximates one trained with fair but latent data, thus preventing discrimination without requiring explicit constraints.
arXiv Detail & Related papers (2021-07-07T13:31:38Z)
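As a hedged illustration of a fairness-regularised loss in the same spirit (B-FARL's actual regulariser is derived differently), one can penalise the demographic-parity gap alongside the fit to observed labels:
```python
import numpy as np

def fair_regularized_loss(p, y_obs, s, lam=1.0):
    """p: predicted P(y=1); y_obs: observed labels; s: binary sensitive attribute."""
    ce = -np.mean(y_obs * np.log(p + 1e-8) + (1 - y_obs) * np.log(1 - p + 1e-8))
    dp_gap = abs(p[s == 1].mean() - p[s == 0].mean())   # demographic-parity gap
    return ce + lam * dp_gap
```
- Instance Correction for Learning with Open-set Noisy Labels [145.06552420999986]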
We use the sample selection approach to handle open-set noisy labels.
The discarded data are seen to be mislabeled and do not participate in training.
We modify the instances of the discarded data so that their predictions become consistent with the given labels.
arXiv Detail & Related papers (2021-06-01T13:05:55Z)
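A minimal sketch of correcting an instance rather than its label: when a discarded sample's prediction disagrees with its given label, nudge the input along the gradient that raises the given label's likelihood. The single-step update, step size, and logistic model are assumptions.
```python
import numpy as np

def correct_instance(x, y_given, w, lr=0.5):
    """Logistic model p = sigmoid(w @ x); returns a corrected copy of x."""
    p = 1.0 / (1.0 + np.exp(-(w @ x)))
    grad_x = (p - y_given) * w          # d(cross-entropy)/dx for the logistic model
    return x - lr * grad_x
```
- Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]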
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
arXiv Detail & Related papers (2020-10-18T18:59:44Z)
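A generic label-propagation step conveys the graph idea: smooth noisy label beliefs over a similarity graph so a label inconsistent with its neighbourhood gets down-weighted. The paper's graphical model is richer than this sketch.
```python
import numpy as np

def propagate_beliefs(W, beliefs, alpha=0.5, n_iters=10):
    """W: (n, n) symmetric similarity matrix; beliefs: (n, n_classes)."""
    P = W / (W.sum(axis=1, keepdims=True) + 1e-8)       # row-normalised transitions
    B = beliefs.copy()
    for _ in range(n_iters):
        B = alpha * (P @ B) + (1 - alpha) * beliefs     # mix neighbours with priors
    return B / (B.sum(axis=1, keepdims=True) + 1e-8)
```
- Debiased Contrastive Learning [64.98602526764599]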
We develop a debiased contrastive objective that corrects for the sampling of same-label datapoints.
Empirically, the proposed objective consistently outperforms the state-of-the-art for representation learning in vision, language, and reinforcement learning benchmarks.
arXiv Detail & Related papers (2020-07-01T04:25:24Z)
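The debiased negative term corrects for same-label points among uniformly sampled "negatives"; a sketch following the published estimator's form, with the positive-class prior `tau_pos` and temperature `t` taken as inputs:
```python
# Subtract the expected contribution of same-label points from the negative
# term and clamp the estimate to its theoretical minimum.
import numpy as np

def debiased_negative_term(sim_neg, sim_pos, tau_pos=0.1, t=0.5):
    """sim_neg/sim_pos: similarities to sampled negatives/positives;
    t: temperature. Returns a debiased estimate of the negative expectation."""
    neg = np.mean(np.exp(sim_neg / t))
    pos = np.mean(np.exp(sim_pos / t))
    g = (neg - tau_pos * pos) / (1.0 - tau_pos)
    return max(g, np.exp(-1.0 / t))                     # clamp to the theoretical minimum
```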