Learning Unbiased Representations via Mutual Information Backpropagation
- URL: http://arxiv.org/abs/2003.06430v1
- Date: Fri, 13 Mar 2020 18:06:31 GMT
- Title: Learning Unbiased Representations via Mutual Information Backpropagation
- Authors: Ruggero Ragonesi, Riccardo Volpi, Jacopo Cavazza and Vittorio Murino
- Abstract summary: In particular, we face the case where some attributes (bias) of the data, if learned by the model, can severely compromise its generalization properties.
We propose a novel end-to-end optimization strategy, which simultaneously estimates and minimizes the mutual information between the learned representation and the data attributes.
- Score: 36.383338079229695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We are interested in learning data-driven representations that can generalize
well, even when trained on inherently biased data. In particular, we face the
case where some attributes (bias) of the data, if learned by the model, can
severely compromise its generalization properties. We tackle this problem
through the lens of information theory, leveraging recent findings for a
differentiable estimation of mutual information. We propose a novel end-to-end
optimization strategy, which simultaneously estimates and minimizes the mutual
information between the learned representation and the data attributes. When
applied to standard benchmarks, our model shows comparable or superior
classification performance with respect to state-of-the-art approaches.
Moreover, our method is general enough to be applicable to the problem of
"algorithmic fairness", with competitive results.
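
The abstract does not name the mutual information estimator it relies on. As a rough illustration only, the sketch below assumes a Donsker-Varadhan lower bound (the one used by MINE-style neural estimators) and a hypothetical one-parameter critic T(z, b) = c * z * b on toy data. The names `dv_lower_bound`, `fit_scale`, `zs_leaky`, and `zs_clean` are invented for this example and are not taken from the paper. Maximizing the bound over the critic corresponds to the "estimate" step; an encoder would then be trained to push the resulting estimate down.

```python
import math

def dv_lower_bound(critic, zs, bs):
    """Donsker-Varadhan lower bound on I(Z; B) from paired samples."""
    n = len(zs)
    # Expectation of the critic under the joint: use the observed pairs.
    joint_term = sum(critic(z, b) for z, b in zip(zs, bs)) / n
    # Expectation of exp(critic) under the product of marginals:
    # pair every z with every b (exact for the empirical marginals).
    marg_term = sum(math.exp(critic(z, b)) for z in zs for b in bs) / (n * n)
    return joint_term - math.log(marg_term)

def fit_scale(zs, bs, steps=100, lr=0.5, eps=1e-4):
    """Tighten the bound by gradient ascent on the critic scale c.

    Uses a central finite difference instead of autodiff to stay stdlib-only.
    """
    c = 0.0
    for _ in range(steps):
        up = dv_lower_bound(lambda z, b: (c + eps) * z * b, zs, bs)
        down = dv_lower_bound(lambda z, b: (c - eps) * z * b, zs, bs)
        c += lr * (up - down) / (2 * eps)
    return c

# Toy bias attribute b in {-1, +1}, balanced.
bs = [1, -1] * 10
# A "leaky" representation that copies the bias attribute exactly.
zs_leaky = list(bs)
# A "clean" representation that is empirically independent of b.
zs_clean = [1, 1, -1, -1] * 5

c_leaky = fit_scale(zs_leaky, bs)
c_clean = fit_scale(zs_clean, bs)
mi_leaky = dv_lower_bound(lambda z, b: c_leaky * z * b, zs_leaky, bs)
mi_clean = dv_lower_bound(lambda z, b: c_clean * z * b, zs_clean, bs)

print("estimated MI (leaky representation):", mi_leaky)
print("estimated MI (clean representation):", mi_clean)
```

On this toy data the leaky representation yields an estimate close to, and never above, log 2 nats, while the clean one stays near zero. In an end-to-end setup the critic would typically be a small neural network and the estimate would be added, with some weight, to the task loss; those details are assumptions here rather than a description of the authors' implementation.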
Related papers
- Restoring balance: principled under/oversampling of data for optimal classification [0.0]
Class imbalance in real-world data poses a common bottleneck for machine learning tasks.
Mitigation strategies, such as under or oversampling the data depending on their abundances, are routinely proposed and tested empirically.
We provide a sharp prediction of the effects of under/oversampling strategies depending on class imbalance, first and second moments of the data, and the metrics of performance considered.
arXiv Detail & Related papers (2024-05-15T17:45:34Z)
- Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL).
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes in generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z)
- Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
- Bias-inducing geometries: an exactly solvable data model with fairness implications [13.690313475721094]
We introduce an exactly solvable high-dimensional model of data imbalance.
We analytically unpack the typical properties of learning models trained in this synthetic framework.
We obtain exact predictions for the observables that are commonly employed for fairness assessment.
arXiv Detail & Related papers (2022-05-31T16:27:57Z)
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we derive a theoretical guarantee that the causality-inspired learning has reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Regularizing Models via Pointwise Mutual Information for Named Entity Recognition [17.767466724342064]
We propose using Pointwise Mutual Information (PMI) to enhance generalization ability while also improving in-domain performance.
Our approach enables debiasing of highly correlated words and labels in the benchmark datasets.
For long-named and complex-structure entities, our method can predict these entities through debiasing on conjunction or special characters.
arXiv Detail & Related papers (2021-04-15T05:47:27Z)
- On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)